That's good, and thanks for sharing.

On 6/17/15, 12:50 AM, "alex schufo" <[email protected]> wrote:
>Thanks, that explains why it doesn't pick up the jar.
>
>Rebuilding directly through Maven did not work for me: whether I chose
>hive-hcatalog 0.13.0, 0.13.1 or 0.14.0, I got exceptions because the
>versions are not the same:
>
>Error: Found interface org.apache.hadoop.mapreduce.JobContext, but class
>was expected
>
>Error: java.io.IOException: Deserialization error:
>org.apache.hadoop.hive.metastore.api.Table; local class incompatible:
>stream classdesc serialVersionUID = -946662244473213550, local class
>serialVersionUID = 398473631015277182
>
>I can confirm, though, that the MR job can find the HCatalog classes that
>way.
>
>To solve it I took the jar version that I was expecting and built the
>Kylin job jar myself:
>
>mkdir tmp
>cd tmp
>jar -xf ../hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
>jar -xf ../kylin-job-0.7.1-incubating-job.jar
>cd ..
>jar -cvf combined.jar -C tmp .
>
>Then I renamed combined.jar, placed it in the Kylin lib folder and
>restarted Kylin. It works, but I then also needed to add the hive-exec
>and hive-metastore jars to the Kylin job jar:
>
>jar -xf ../hive-exec-0.13.0.2.1.7.0-784.jar
>jar -xf ../hive-metastore-0.13.0.2.1.7.0-784.jar
>
>Thank you for your help.
>
>On Tue, Jun 16, 2015 at 3:36 AM, ShaoFeng Shi <[email protected]>
>wrote:
>
>> Kylin doesn't treat HCatalog as a third-party jar. It assumes the Hive
>> libraries are part of the Hadoop cluster, just like the common Hadoop
>> libs, and that the nodes in the cluster are identical.
>>
>> If you can't install it in your Hadoop cluster, a possible way is to
>> embed the HCatalog classes in Kylin's job jar. The job jar is submitted
>> to all worker nodes as a third-party lib. We haven't verified this, but
>> you can give it a try:
>>
>> 1. Check out the Kylin code repository from
>> https://github.com/apache/incubator-kylin.git and use the master branch;
>> 2. Find the dependency declaration of hcatalog in the kylin-job module
>> and remove "<scope>provided</scope>" to use the default scope:
>> https://github.com/apache/incubator-kylin/blob/master/job/pom.xml#L210
>> 3. Run "mvn package -DskipTests" under the Kylin project folder to
>> re-package the jars;
>> 4. Check the new job/target/kylin-job-0.7.1-incubating-job.jar; it
>> should include the HCatalog classes;
>> 5. Copy this jar into $KYLIN_HOME/lib/ of your Kylin installation,
>> renaming it so that it overwrites the old one (back up the old jar to
>> another folder first);
>> 6. Restart Kylin and then resume the failed job to see whether the
>> ClassNotFound error is still there.
>>
>> If it works, please let us know.
>>
>> 2015-06-15 22:33 GMT+08:00 alex schufo <[email protected]>:
>>
>> > I suspect that the HCatalog jar is not on the Hadoop nodes, or is in a
>> > different location, but I am not the Hadoop administrator so I am not
>> > allowed to modify that.
>> >
>> > I was reading this article:
>> >
>> > http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>> >
>> > and my understanding was that by specifying the third-party jar when
>> > launching the MR job it would be made available to the Hadoop nodes. I
>> > thought that the RunJar command in bin/kylin.sh was doing something
>> > similar.
>> >
>> > Also, this article mentions that installing the jars on the cluster
>> > nodes is deprecated.
>> >
>> > On Mon, Jun 15, 2015 at 2:58 PM, ShaoFeng Shi <[email protected]>
>> > wrote:
>> >
>> > > Is Hive/HCatalog installed on all Hadoop nodes, at the same location?
>> > >
>> > > 2015-06-15 19:10 GMT+08:00 alex schufo <[email protected]>:
>> > >
>> > > > Hello, I installed Kylin on a new Hadoop cluster.
>> > > >
>> > > > On the Kylin instance HCatalog is found at
>> > > > /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
>> > > > and I don't get any error while running bin/find-hive-dependency.sh
>> > > > (see full output below).
>> > > >
>> > > > However, when I build a cube the Extract Fact Table Distinct Columns
>> > > > step fails because the MR job cannot find the HCat dependency. There
>> > > > is no exception in tomcat/logs/kylin.log, just this:
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:05,501][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)]
>> > > > - *State of Hadoop job: job_1430752988188_1332267:RUNNING-UNDEFINED*
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:05,505][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
>> > > > - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
>> > > > (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,515][WARN][org.apache.commons.httpclient.HttpMethodBase.getResponseBody(HttpMethodBase.java:682)]
>> > > > - Going to buffer response body of large or unknown size. Using
>> > > > getResponseBodyAsStream instead is recommended.
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:90)]
>> > > > - Job job_1430752988188_1332267 get status check result.
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)]
>> > > > - *State of Hadoop job: job_1430752988188_1332267:FINISHED-FAILED*
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,520][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
>> > > > - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
>> > > > (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,704][WARN][org.apache.kylin.job.common.HadoopCmdOutput.updateJobCounter(HadoopCmdOutput.java:89)]
>> > > > - no counters for job job_1430752988188_1332267
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,708][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
>> > > > - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
>> > > > (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,715][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
>> > > > - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
>> > > > (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,733][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
>> > > > - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
>> > > > (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,736][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:222)]
>> > > > - *job id:0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 from RUNNING to ERROR*
>> > > >
>> > > > On the Hadoop node we can see that the MR job fails because HCatalog
>> > > > was not found:
>> > > >
>> > > > Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>> > > >   at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1961)
>> > > >   at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
>> > > >   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
>> > > >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>> > > >   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>> > > >   at java.security.AccessController.doPrivileged(Native Method)
>> > > >   at javax.security.auth.Subject.doAs(Subject.java:415)
>> > > >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>> > > >   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>> > > > Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>> > > >   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1867)
>> > > >   at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1959)
>> > > >   ... 8 more
>> > > >
>> > > > $ bin/find-hive-dependency.sh
>> > > >
>> > > > Logging initialized using configuration in
>> > > > file:/etc/hive/conf.dist/hive-log4j.properties
>> > > >
>> > > > SLF4J: Class path contains multiple SLF4J bindings.
>> > > >
>> > > > SLF4J: Found binding in
>> > > > [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> > > >
>> > > > SLF4J: Found binding in
>> > > > [jar:file:/opt/edw/hive/auxlib/hive-udfs.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> > > >
>> > > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> > > > explanation.
>> > > >
>> > > > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> > > >
>> > > > hive dependency:
>> > > >
>> > > > /etc/hive/conf:/usr/lib/hive/lib/hive-serde-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/asm-commons-3.1.jar:/usr/lib/hive/lib/jdo-api-3.0.1.jar:/usr/lib/hive/lib/derbyclient-10.10.1.1.jar:/usr/lib/hive/lib/antlr-runtime-3.4.jar:/usr/lib/hive/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/lib/hive/lib/hive-service-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/lib/hive/lib/hive-shims-common-secure-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/derbynet-10.10.1.1.jar:/usr/lib/hive/lib/httpcore-4.2.5.jar:/usr/lib/hive/lib/jpam-1.1.jar:/usr/lib/hive/lib/hive-exec-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-metastore-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-httpclient-3.0.1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/hive/lib/guava-11.0.2.jar:/usr/lib/hive/lib/eigenbase-xom-1.3.4.jar:/usr/lib/hive/lib/commons-compiler-2.7.3.jar:/usr/lib/hive/lib/libfb303-0.9.0.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/libthrift-0.9.0.jar:/usr/lib/hive/lib/avro-1.7.5.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/hive-shims-common.jar:/usr/lib/hive/lib/stax-api-1.0.1.jar:/usr/lib/hive/lib/hive-shims-0.20-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-cli-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/oro-2.0.8.jar:/usr/lib/hive/lib/eigenbase-properties-1.1.4.jar:/usr/lib/hive/lib/hive-ant.jar:/usr/lib/hive/lib/zookeeper-3.4.5.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-hwi-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-codec-1.4.jar:/usr/lib/hive/lib/mail-1.4.1.jar:/usr/lib/hive/lib/hive-shims-common-secure.jar:/usr/lib/hive/lib/servlet-api-2.5.jar:/usr/lib/hive/lib/optiq-core-0.5.jar:/usr/lib/hive/lib/ST4-4.0.4.jar:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/hive/lib/hive-common.jar:/usr/lib/hive/lib/httpclient-4.2.5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-jdbc-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-serde.jar:/usr/lib/hive/lib/derby-10.10.1.1.jar:/usr/lib/hive/lib/hive-hwi.jar:/usr/lib/hive/lib/optiq-avatica-0.5.jar:/usr/lib/hive/lib/hive-exec.jar:/usr/lib/hive/lib/hive-contrib.jar:/usr/lib/hive/lib/hive-contrib-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims.jar:/usr/lib/hive/lib/junit-4.10.jar:/usr/lib/hive/lib/jta-1.1.jar:/usr/lib/hive/lib/hive-jdbc.jar:/usr/lib/hive/lib/hive-ant-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/antlr-2.7.7.jar:/usr/lib/hive/lib/hive-shims-0.23-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils.jar:/usr/lib/hive/lib/xz-1.0.jar:/usr/lib/hive/lib/commons-collections-3.1.jar:/usr/lib/hive/lib/hive-metastore.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/paranamer-2.3.jar:/usr/lib/hive/lib/jetty-all-7.6.0.v20120127.jar:/usr/lib/hive/lib/commons-compress-1.4.1.jar:/usr/lib/hive/lib/asm-tree-3.1.jar:/usr/lib/hive/lib/hive-cli.jar:/usr/lib/hive/lib/hive-beeline-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/janino-2.7.3.jar:/usr/lib/hive/lib/hive-shims-0.20S-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/groovy-all-2.1.6.jar:/usr/lib/hive/lib/hive-service.jar:/usr/lib/hive/lib/hive-shims-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar:/usr/lib/hive/lib/ant-launcher-1.9.1.jar:/usr/lib/hive/lib/ant-1.9.1.jar:/usr/lib/hive/lib/hamcrest-core-1.1.jar:/usr/lib/hive/lib/snappy-java-1.0.5.jar:/usr/lib/hive/lib/stringtemplate-3.2.1.jar:/usr/lib/hive/lib/commons-io-2.4.jar:/usr/lib/hive/lib/hive-hbase-handler.jar:/usr/lib/hive/lib/servlet-api-2.5-20081211.jar:/usr/lib/hive/lib/tempus-fugit-1.1.jar:/usr/lib/hive/lib/linq4j-0.1.13.jar:/usr/lib/hive/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/lib/hive/lib/jetty-6.1.26.jar:/usr/lib/hive/lib/jetty-util-6.1.26.jar:/usr/lib/hive/lib/bonecp-0.8.0.RELEASE.jar:/usr/lib/hive/lib/hive-beeline.jar:/usr/lib/hive/lib/jsr305-1.3.9.jar:/usr/lib/hive/lib/activation-1.1.jar:/usr/lib/hive/lib/log4j-1.2.16.jar:/usr/lib/hive/lib/commons-logging-1.1.3.jar:/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
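
For anyone who finds this thread later: the repackaging workaround described at the top of the thread (unpack the jars into one folder, then re-create a combined jar) can also be scripted. The following is only a minimal sketch in Python, not the jar tool used in the thread and not part of Kylin. The function names merge_jars and has_class are made up for illustration. It copies the entries of several jars into one output jar and then checks whether a given class, such as the one named in the ClassNotFoundException, is present. It assumes unsigned jars and does nothing about version conflicts between the merged libraries.

```python
import zipfile


def merge_jars(output_jar, input_jars):
    """Copy every entry of input_jars into output_jar.

    When two jars contain the same entry name, the jar listed first wins.
    (In the thread the job jar was extracted last, so its files overwrote
    the others; list the job jar first here to get the same precedence.)
    """
    seen = set()
    with zipfile.ZipFile(output_jar, "w", zipfile.ZIP_DEFLATED) as out:
        for jar in input_jars:
            with zipfile.ZipFile(jar) as src:
                for info in src.infolist():
                    if info.filename in seen:
                        continue  # an earlier jar already supplied this entry
                    seen.add(info.filename)
                    out.writestr(info, src.read(info.filename))
    return seen


def has_class(jar, class_name):
    """Return True if the jar has a .class entry for the fully qualified name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar) as zf:
        return entry in zf.namelist()


# Example call, using the file names quoted in the thread (paths are
# assumptions; adjust them to wherever the jars actually live):
#
# merge_jars("combined.jar", [
#     "kylin-job-0.7.1-incubating-job.jar",
#     "hive-hcatalog-core-0.13.0.2.1.7.0-784.jar",
#     "hive-exec-0.13.0.2.1.7.0-784.jar",
#     "hive-metastore-0.13.0.2.1.7.0-784.jar",
# ])
# has_class("combined.jar", "org.apache.hive.hcatalog.mapreduce.HCatInputFormat")
```

Running has_class against the combined jar before copying it into the Kylin lib folder is a cheap way to confirm that HCatInputFormat really ended up inside it, instead of finding out from a failed cube build.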
