That's good, and thanks for sharing.
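
For anyone who finds this thread later, the manual repackaging quoted below could be scripted roughly as follows. This is only a sketch: merge_jars and jar_has_class are hypothetical helper names, not part of Kylin, and it assumes the JDK jar tool is on the PATH. The jar file names in the usage comments are the ones from the quoted message.

```shell
#!/bin/sh
# Sketch only. merge_jars and jar_has_class are hypothetical helpers,
# not Kylin commands. Requires the JDK "jar" tool on the PATH.

# merge_jars OUT.jar IN1.jar [IN2.jar ...]
# Unpacks every input jar into one scratch directory, then packs that
# directory into OUT.jar. If two inputs contain the same path, the one
# unpacked later overwrites the earlier one.
merge_jars() (
    out=$1
    shift
    start=$PWD
    workdir=$(mktemp -d) || exit 1
    for j in "$@"; do
        # Make relative paths absolute before changing directory.
        case $j in
            /*) src=$j ;;
            *)  src=$start/$j ;;
        esac
        (cd "$workdir" && jar -xf "$src") || exit 1
    done
    jar -cf "$out" -C "$workdir" . || exit 1
    rm -rf "$workdir"
)

# jar_has_class FILE.jar fully.qualified.ClassName
# Succeeds if the jar's listing contains the matching .class entry.
jar_has_class() {
    entry=$(printf '%s' "$2" | tr . /).class
    jar -tf "$1" | grep -qxF "$entry"
}

# Usage (file names taken from the quoted message):
#   merge_jars combined.jar \
#       hive-hcatalog-core-0.13.0.2.1.7.0-784.jar \
#       kylin-job-0.7.1-incubating-job.jar
#   jar_has_class combined.jar \
#       org.apache.hive.hcatalog.mapreduce.HCatInputFormat && echo found
```

The input order matters only when two jars ship the same path, so it is probably safest to keep the order used in the quoted commands.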

On 6/17/15, 12:50 AM, "alex schufo" <[email protected]> wrote:

>Thanks, that explains why it doesn't pick up the jar.
>
>Rebuilding directly through Maven did not work for me because when I
>chose hive-hcatalog 0.13.0, 0.13.1 or 0.14.0 I got exceptions because the
>versions are not the same (Error: Found interface
>org.apache.hadoop.mapreduce.JobContext, but class was expected and Error:
>java.io.IOException: Deserialization error:
>org.apache.hadoop.hive.metastore.api.Table; local class incompatible:
>stream classdesc serialVersionUID = -946662244473213550, local class
>serialVersionUID = 398473631015277182) but I can confirm that the MR can
>find the HCatalog classes that way.
>
>To solve it I took the jar version that I was expecting and built the Kylin
>job jar myself:
>
>mkdir tmp
>cd tmp
>jar -xf ../hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
>jar -xf ../kylin-job-0.7.1-incubating-job.jar
>cd ..
>jar -cvf combined.jar -C tmp .
>
>Then renamed combined.jar, placed it in Kylin lib and restarted Kylin.
>It works but I then also needed to add the hive exec and hive metastore
>jars to the Kylin job jar:
>
>jar -xf ../hive-exec-0.13.0.2.1.7.0-784.jar
>jar -xf ../hive-metastore-0.13.0.2.1.7.0-784.jar
>
>Thank you for your help.
>
>On Tue, Jun 16, 2015 at 3:36 AM, ShaoFeng Shi <[email protected]>
>wrote:
>
>> Kylin doesn't treat HCatalog as a third-party jar; it assumes the Hive
>> libraries are part of the Hadoop cluster, just like the common Hadoop libs,
>> and that the nodes in the cluster are identical.
>> If you can't install it in your Hadoop cluster, a possible way is to
>> embed the HCatalog classes in Kylin's job jar; the job jar will be submitted
>> to all working nodes as a third-party lib. We didn't verify this, but you
>> can give it a try:
>>
>> 1. Check out the Kylin code repository from
>> https://github.com/apache/incubator-kylin.git and use the master branch;
>> 2. Find the dependency declaration of hcatalog in the kylin-job module and
>> remove "<scope>provided</scope>" so that it uses the default scope:
>> https://github.com/apache/incubator-kylin/blob/master/job/pom.xml#L210
>> 3. Run "mvn package -DskipTests" under the Kylin project folder to
>> re-package the jars;
>> 4. Check the new job/target/kylin-job-0.7.1-incubating-job.jar; it should
>> include the HCatalog classes;
>> 5. Copy this jar to $KYLIN_HOME/lib/ in your Kylin installation, renaming it
>> to overwrite the old one (back up the old jar to another folder first);
>> 6. Restart Kylin and resume the failed job to see whether the
>> ClassNotFound error is still there;
>>
>> If it works, please let us know;
>>
>> 2015-06-15 22:33 GMT+08:00 alex schufo <[email protected]>:
>>
>> > I suspect that the HCatalog jar is not on the Hadoop nodes, or in a
>> > different location, but I am not the Hadoop administrator so I am not
>> > allowed to modify that.
>> >
>> > I was reading this article:
>> >
>> > http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>> > and my understanding was that by specifying the third party jar when
>> > launching the MR job it would be made available to the Hadoop nodes. I
>> > thought that the RunJar command in bin/kylin.sh was doing something
>> > similar.
>> >
>> > Also this article mentions that installing the jars on the cluster nodes
>> > is deprecated.
>> >
>> > On Mon, Jun 15, 2015 at 2:58 PM, ShaoFeng Shi <[email protected]>
>> > wrote:
>> >
>> > > Is Hive/HCatalog installed on all Hadoop nodes, in the same location?
>> > >
>> > > 2015-06-15 19:10 GMT+08:00 alex schufo <[email protected]>:
>> > >
>> > > > Hello, I installed Kylin on a new Hadoop cluster.
>> > > >
>> > > > On the Kylin instance HCatalog is found at
>> > > > /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
>> > > > and I don't get any error while running bin/find-hive-dependency.sh (see
>> > > > full output below).
>> > > >
>> > > > However when I build a cube the Extract Fact Table Distinct Columns step
>> > > > fails because the MR cannot find the HCat dependency. There is no
>> > > > exception in tomcat/logs/kylin.log
>> > > >
>> > > > Just this :
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:05,501][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)] - *State of Hadoop job: job_1430752988188_1332267:RUNNING-UNDEFINED*
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:05,505][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)] - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,515][WARN][org.apache.commons.httpclient.HttpMethodBase.getResponseBody(HttpMethodBase.java:682)] - Going to buffer response body of large or unknown size. Using getResponseBodyAsStream instead is recommended.
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:90)] - Job job_1430752988188_1332267 get status check result.
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)] - *State of Hadoop job: job_1430752988188_1332267:FINISHED-FAILED*
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,520][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)] - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,704][WARN][org.apache.kylin.job.common.HadoopCmdOutput.updateJobCounter(HadoopCmdOutput.java:89)] - no counters for job job_1430752988188_1332267
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,708][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)] - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,715][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)] - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,733][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)] - Saving resource /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 (Store kylin_metadata@hbase)
>> > > >
>> > > > [pool-7-thread-3]:[2015-06-15 03:10:15,736][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:222)] - *job id:0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 from RUNNING to ERROR*
>> > > >
>> > > > On the Hadoop node we can see that the MR job fails because HCatalog was
>> > > > not found:
>> > > >
>> > > > Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>> > > >     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1961)
>> > > >     at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
>> > > >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
>> > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>> > > >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>> > > >     at java.security.AccessController.doPrivileged(Native Method)
>> > > >     at javax.security.auth.Subject.doAs(Subject.java:415)
>> > > >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>> > > >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>> > > > Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found
>> > > >     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1867)
>> > > >     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1959)
>> > > >     ... 8 more
>> > > >
>> > > > $ bin/find-hive-dependency.sh
>> > > >
>> > > > Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
>> > > > SLF4J: Class path contains multiple SLF4J bindings.
>> > > > SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> > > > SLF4J: Found binding in [jar:file:/opt/edw/hive/auxlib/hive-udfs.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> > > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> > > > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> > > >
>> > > > hive dependency:
>> > > > /etc/hive/conf:/usr/lib/hive/lib/hive-serde-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/asm-commons-3.1.jar:/usr/lib/hive/lib/jdo-api-3.0.1.jar:/usr/lib/hive/lib/derbyclient-10.10.1.1.jar:/usr/lib/hive/lib/antlr-runtime-3.4.jar:/usr/lib/hive/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/lib/hive/lib/hive-service-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/lib/hive/lib/hive-shims-common-secure-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/derbynet-10.10.1.1.jar:/usr/lib/hive/lib/httpcore-4.2.5.jar:/usr/lib/hive/lib/jpam-1.1.jar:/usr/lib/hive/lib/hive-exec-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-metastore-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-httpclient-3.0.1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/hive/lib/guava-11.0.2.jar:/usr/lib/hive/lib/eigenbase-xom-1.3.4.jar:/usr/lib/hive/lib/commons-compiler-2.7.3.jar:/usr/lib/hive/lib/libfb303-0.9.0.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/libthrift-0.9.0.jar:/usr/lib/hive/lib/avro-1.7.5.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/hive-shims-common.jar:/usr/lib/hive/lib/stax-api-1.0.1.jar:/usr/lib/hive/lib/hive-shims-0.20-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-cli-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/oro-2.0.8.jar:/usr/lib/hive/lib/eigenbase-properties-1.1.4.jar:/usr/lib/hive/lib/hive-ant.jar:/usr/lib/hive/lib/zookeeper-3.4.5.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-hwi-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-codec-1.4.jar:/usr/lib/hive/lib/mail-1.4.1.jar:/usr/lib/hive/lib/hive-shims-common-secure.jar:/usr/lib/hive/lib/servlet-api-2.5.jar:/usr/lib/hive/lib/optiq-core-0.5.jar:/usr/lib/hive/lib/ST4-4.0.4.jar:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/hive/lib/hive-common.jar:/usr/lib/hive/lib/httpclient-4.2.5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-jdbc-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-serde.jar:/usr/lib/hive/lib/derby-10.10.1.1.jar:/usr/lib/hive/lib/hive-hwi.jar:/usr/lib/hive/lib/optiq-avatica-0.5.jar:/usr/lib/hive/lib/hive-exec.jar:/usr/lib/hive/lib/hive-contrib.jar:/usr/lib/hive/lib/hive-contrib-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims.jar:/usr/lib/hive/lib/junit-4.10.jar:/usr/lib/hive/lib/jta-1.1.jar:/usr/lib/hive/lib/hive-jdbc.jar:/usr/lib/hive/lib/hive-ant-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/antlr-2.7.7.jar:/usr/lib/hive/lib/hive-shims-0.23-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils.jar:/usr/lib/hive/lib/xz-1.0.jar:/usr/lib/hive/lib/commons-collections-3.1.jar:/usr/lib/hive/lib/hive-metastore.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/paranamer-2.3.jar:/usr/lib/hive/lib/jetty-all-7.6.0.v20120127.jar:/usr/lib/hive/lib/commons-compress-1.4.1.jar:/usr/lib/hive/lib/asm-tree-3.1.jar:/usr/lib/hive/lib/hive-cli.jar:/usr/lib/hive/lib/hive-beeline-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/janino-2.7.3.jar:/usr/lib/hive/lib/hive-shims-0.20S-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/groovy-all-2.1.6.jar:/usr/lib/hive/lib/hive-service.jar:/usr/lib/hive/lib/hive-shims-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar:/usr/lib/hive/lib/ant-launcher-1.9.1.jar:/usr/lib/hive/lib/ant-1.9.1.jar:/usr/lib/hive/lib/hamcrest-core-1.1.jar:/usr/lib/hive/lib/snappy-java-1.0.5.jar:/usr/lib/hive/lib/stringtemplate-3.2.1.jar:/usr/lib/hive/lib/commons-io-2.4.jar:/usr/lib/hive/lib/hive-hbase-handler.jar:/usr/lib/hive/lib/servlet-api-2.5-20081211.jar:/usr/lib/hive/lib/tempus-fugit-1.1.jar:/usr/lib/hive/lib/linq4j-0.1.13.jar:/usr/lib/hive/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/lib/hive/lib/jetty-6.1.26.jar:/usr/lib/hive/lib/jetty-util-6.1.26.jar:/usr/lib/hive/lib/bonecp-0.8.0.RELEASE.jar:/usr/lib/hive/lib/hive-beeline.jar:/usr/lib/hive/lib/jsr305-1.3.9.jar:/usr/lib/hive/lib/activation-1.1.jar:/usr/lib/hive/lib/log4j-1.2.16.jar:/usr/lib/hive/lib/commons-logging-1.1.3.jar:/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
>> > > >
>> > >
>> >
>>
