I was just wondering when Kylin puts the other jars that the map/reduce tasks depend on (such as json/apache-commons) onto the classpath. A moment ago I noticed the difference between kylin-job-xxx.jar in $KYLIN_HOME/lib and kylin-job-xxx.jar in tomcat/webapps/kylin/WEB-INF/lib/; the former jar file is much bigger than the latter one:
nrpt@classa-nrpt1:~/kylin-1.0-incubating$ ls -lah lib/kylin-job-1.0-incubating.jar
-rw-r--r-- 1 nrpt netease 9.6M Sep 7 20:54 lib/kylin-job-1.0-incubating.jar
nrpt@classa-nrpt1:~/kylin-1.0-incubating$ ls -lah tomcat/webapps/kylin/WEB-INF/lib/kylin-job-1.0-incubating.jar
-rw-r--r-- 1 nrpt netease 325K Sep 8 2015 tomcat/webapps/kylin/WEB-INF/lib/kylin-job-1.0-incubating.jar

After opening the first one, I found it is a combination of all the dependent jar files, which resolves my doubt...

2015-09-08 10:16 GMT+08:00 yu feng <[email protected]>:
> I found log lines like this:
> [pool-5-thread-1]:[2015-09-07
> 20:58:41,746][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:166)]
> - Hadoop job classpath is:
> /home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/etc/hadoop:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/common/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/common/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/hdfs/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/contrib/capacity-scheduler/*.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-lzo-0.4.20.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/modules/*.jar
> .....................
>
> There are two lines in this log. The first line is the Hadoop default
> classpath output by the command 'mapred classpath', and it ends with a
> newline '\n'; I think that character needs to be deleted.
> What's more, I checked the source code that sets the job classpath:
>
>         if (kylinHBaseDependency != null) {
>             // yarn classpath is comma separated
>             kylinHBaseDependency = kylinHBaseDependency.replace(":", ",");
>             classpath = classpath + "," + kylinHBaseDependency;
>         }
>
>         if (kylinHiveDependency != null) {
>             // yarn classpath is comma separated
>             kylinHiveDependency = kylinHiveDependency.replace(":", ",");
>             classpath = classpath + "," + kylinHiveDependency;
>         }
>
>         jobConf.set(MAP_REDUCE_CLASSPATH, classpath + "," + kylinHiveDependency);
>         logger.info("Hadoop job classpath is: " + job.getConfiguration().get(MAP_REDUCE_CLASSPATH));
>
> It looks like we append kylinHiveDependency to the classpath twice; I do
> not know whether that is intentional.
>
> Lastly, the MAP_REDUCE_CLASSPATH property (mapreduce.application.classpath)
> is set to jar files located on the local filesystem. Actually, I do not
> know whether those files will be uploaded to HDFS the way files added via
> the 'tmpjars' property are.
>
> 2015-09-08 9:35 GMT+08:00 Shi, Shaofeng <[email protected]>:
>
>> Hi feng, your map reduce classpath might not be correctly configured;
>> Kylin will read "mapreduce.application.classpath" from the default job
>> configuration; if that is not found, it will run the "mapred classpath"
>> command to get the classpath and then append the hive/hbase dependencies.
>> Please check kylin.log to see whether the final classpath includes the
>> jar for this missing class;
>>
>> The message in kylin.log is as below; you can search for it:
>>
>> Hadoop job classpath is:
>>
>>
>> On 9/7/15, 10:18 PM, "yu feng" <[email protected]> wrote:
>>
>> >After submitting a mapreduce job, we get the job status, but it tells us
>> >the job failed!
>> >We checked this application on the RM website (xxx:8088/cluster/apps)
>> >and found this log:
>> >
>> > Application application_1418904565842_3597024 failed 2 times due to AM
>> >Container for appattempt_1418904565842_3597024_000002 exited with
>> >exitCode: 1 due to: Exception from container-launch:
>> >org.apache.hadoop.util.Shell$ExitCodeException:
>> >at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>> >at org.apache.hadoop.util.Shell.run(Shell.java:379)
>> >at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>> >at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:252)
>> >at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>> >at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>> >at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>> >at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>> >at java.lang.Thread.run(Thread.java:662)
>> >main : command provided 1
>> >
>> >We checked the yarn log with this command: yarn logs -applicationId
>> >application_1418904565842_3597024, and got this log:
>> >Container: container_1418904565842_3597024_01_000001 on
>> >hadoop88.photo.163.org_56708
>> >======================================================================================
>> >LogType: stderr
>> >LogLength: 664
>> >Log Contents:
>> >Exception in thread "main" java.lang.NoClassDefFoundError:
>> >org/apache/hadoop/mapreduce/v2/app/MRAppMaster
>> >Caused by: java.lang.ClassNotFoundException:
>> >org.apache.hadoop.mapreduce.v2.app.MRAppMaster
>> >at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>> >at java.security.AccessController.doPrivileged(Native Method)
>> >at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>> >at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>> >at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>> >at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>> >Could not find the main class:
>> >org.apache.hadoop.mapreduce.v2.app.MRAppMaster. Program will exit.
>> >
>> >
>> >I think this means the task cannot find the jar files, so I uploaded all
>> >the jars that Kylin depends on to HDFS, and before submitting this job
>> >*(in the AbstractHadoopJob.attachKylinPropsAndMetadata function)* I set
>> >"tmpjars" to those files located on HDFS (this way we can avoid uploading
>> >all the files when submitting every mapreduce job).
>> >
>> >This measure works in kylin-0.7.2. I get the same error in kylin-1.0, and
>> >I guess this measure will work in kylin-1.0 too, but I do not think it is
>> >a good idea.
>> >
>> >It will be highly appreciated if you have some good ideas or suggestions.
>> >Thanks...
>>
>
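[Editor's note] For reference, here is a minimal sketch of the classpath assembly discussed in this thread. This is not the actual Kylin source; the method name, argument order, and example paths are invented for illustration. It shows how the quoted snippet could avoid both suspected problems: the trailing '\n' left over from the 'mapred classpath' output, and kylinHiveDependency being appended twice.

```java
public class JobClasspathSketch {

    // Builds the value for mapreduce.application.classpath.
    // Mirrors the quoted Kylin snippet, but trims the raw 'mapred classpath'
    // output and appends each dependency string exactly once.
    static String buildJobClasspath(String mapredClasspath,
                                    String hbaseDependency,
                                    String hiveDependency) {
        // 'mapred classpath' output may end with '\n'; strip surrounding
        // whitespace before using it.
        String classpath = mapredClasspath.trim();
        if (hbaseDependency != null) {
            // yarn classpath entries are comma separated
            classpath = classpath + "," + hbaseDependency.replace(":", ",");
        }
        if (hiveDependency != null) {
            // append the hive dependency exactly once (no duplicate append)
            classpath = classpath + "," + hiveDependency.replace(":", ",");
        }
        return classpath;
    }

    public static void main(String[] args) {
        // Illustrative paths only; a real deployment would pass the actual
        // hadoop/hbase/hive locations.
        String cp = buildJobClasspath(
                "/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/mapreduce/*\n",
                "/opt/hbase/lib/hbase-client.jar:/opt/hbase/lib/hbase-common.jar",
                "/opt/hive/lib/hive-exec.jar");
        System.out.println(cp);
    }
}
```

Whether the base classpath itself should also use comma separators depends on how the Hadoop version in use parses mapreduce.application.classpath; the sketch keeps the quoted code's behavior and only removes the two suspected bugs.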
