Kylin doesn't treat HCatalog as a third-party jar. It assumes the Hive
libraries are part of the Hadoop cluster, just like the common Hadoop libs,
and that all nodes in the cluster are identical.
If you can't install HCatalog on your Hadoop cluster, a possible workaround
is to embed the HCatalog classes in Kylin's job jar. The job jar is submitted
to all worker nodes as a third-party lib. We haven't verified this, but you
can give it a try:

1. Check out the Kylin code repository from
https://github.com/apache/incubator-kylin.git and use the master branch;
2. Find the dependency declaration for hcatalog in the kylin-job module and
remove "<scope>provided</scope>" so that it uses the default scope:
https://github.com/apache/incubator-kylin/blob/master/job/pom.xml#L210
3. Run "mvn package -DskipTests" under the Kylin project folder to
re-package the jars;
4. Check the new job/target/kylin-job-0.7.1-incubating-job.jar; it should
now include the HCatalog classes;
5. Copy this jar into $KYLIN_HOME/lib/ in your Kylin installation, renaming
it to overwrite the old job jar (back up the old jar to another folder
first);
6. Restart Kylin and resume the failed job to see whether the
ClassNotFoundException still occurs.
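
For the "check the new jar" step, one way to confirm that the HCatalog
classes really ended up in the rebuilt job jar is a small script like the one
below. This is only a sketch and not part of Kylin: the helper name
jar_contains_class is made up for illustration, and it relies only on the
fact that a jar is a zip archive. The class name is the one from the
ClassNotFoundException in your stack trace.

```python
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar at jar_path has an entry for class_name.

    class_name is a fully qualified Java class name, e.g.
    "org.apache.hive.hcatalog.mapreduce.HCatInputFormat".
    (Hypothetical helper for illustration; not a Kylin API.)
    """
    # A class a.b.C is stored in the jar as the zip entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, calling
jar_contains_class("job/target/kylin-job-0.7.1-incubating-job.jar",
"org.apache.hive.hcatalog.mapreduce.HCatInputFormat") from the Kylin project
folder should return True after the re-package if the scope change took
effect.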

If it works, please let us know.

2015-06-15 22:33 GMT+08:00 alex schufo <[email protected]>:

> I suspect that the HCatalog jar is not on the Hadoop nodes, or in a
> different location, but I am not the Hadoop administrator so I am not
> allowed to modify that.
>
> I was reading this article:
>
> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
> and my understanding was that by specifying the third party jar when
> launching the MR job it would be made available to the Hadoop nodes. I
> thought that the RunJar command in bin/kylin.sh was doing something
> similar.
>
> Also this article mentions that installing the jars on the cluster nodes
> is deprecated.
>
> On Mon, Jun 15, 2015 at 2:58 PM, ShaoFeng Shi <[email protected]>
> wrote:
>
> > is Hive/hcatalog installed on all hadoop nodes, with the same location?
> >
> > 2015-06-15 19:10 GMT+08:00 alex schufo <[email protected]>:
> >
> > > Hello, I installed Kylin on a new Hadoop cluster.
> > >
> > > On the Kylin instance HCatalog is found at
> > >
> > >
> >
> /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
> > > and I don't get any error while running bin/find-hive-dependency.sh
> (see
> > > full output below).
> > >
> > > However when I build a cube the Extract Fact Table Distinct Columns
> step
> > > fails because the MR cannot find the HCat dependency. There is no
> > exception
> > > in tomcat/logs/kylin.log
> > >
> > > Just this :
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:05,501][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)]
> > > - *State of Hadoop job: job_1430752988188_1332267:RUNNING-UNDEFINED*
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:05,505][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
> > > - Saving resource
> /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
> > > (Store kylin_metadata@hbase)
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,515][WARN][org.apache.commons.httpclient.HttpMethodBase.getResponseBody(HttpMethodBase.java:682)]
> > > - Going to buffer response body of large or unknown size. Using
> > > getResponseBodyAsStream instead is recommended.
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusGetter.getHttpResponse(HadoopStatusGetter.java:90)]
> > > - Job job_1430752988188_1332267 get status check result.
> > >
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,516][DEBUG][org.apache.kylin.job.tools.HadoopStatusChecker.checkStatus(HadoopStatusChecker.java:57)]
> > > - *State of Hadoop job: job_1430752988188_1332267:FINISHED-FAILED*
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,520][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
> > > - Saving resource
> /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
> > > (Store kylin_metadata@hbase)
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,704][WARN][org.apache.kylin.job.common.HadoopCmdOutput.updateJobCounter(HadoopCmdOutput.java:89)]
> > > - no counters for job job_1430752988188_1332267
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,708][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
> > > - Saving resource
> /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
> > > (Store kylin_metadata@hbase)
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,715][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
> > > - Saving resource
> /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
> > > (Store kylin_metadata@hbase)
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,733][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:171)]
> > > - Saving resource
> /execute_output/0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01
> > > (Store kylin_metadata@hbase)
> > >
> > > [pool-7-thread-3]:[2015-06-15
> > >
> > >
> >
> 03:10:15,736][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:222)]
> > > - *job id:0c56071b-4460-4e87-9f8b-8ea1d525d3ec-01 from RUNNING to
> ERROR*
> > > On the Hadoop node we can see that the MR job fails because HCatalog
> was
> > > not found:
> > >
> > > Error: java.lang.RuntimeException: java.lang.ClassNotFoundException:
> > Class
> > > org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found at
> > > org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1961)
> at
> > >
> > >
> >
> org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
> > > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726) at
> > > org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at
> > > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at
> > > java.security.AccessController.doPrivileged(Native Method) at
> > > javax.security.auth.Subject.doAs(Subject.java:415) at
> > >
> > >
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
> > > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused
> by:
> > > java.lang.ClassNotFoundException: Class
> > > org.apache.hive.hcatalog.mapreduce.HCatInputFormat not found at
> > >
> > >
> >
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1867)
> > > at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1959)
> > > ... 8 more
> > >
> > > $ bin/find-hive-dependency.sh
> > >
> > >
> > > Logging initialized using configuration in
> > > file:/etc/hive/conf.dist/hive-log4j.properties
> > >
> > > SLF4J: Class path contains multiple SLF4J bindings.
> > >
> > > SLF4J: Found binding in
> > >
> > >
> >
> [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > >
> > > SLF4J: Found binding in
> > >
> > >
> >
> [jar:file:/opt/edw/hive/auxlib/hive-udfs.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > >
> > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> > > explanation.
> > >
> > > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> > >
> > > hive dependency:
> > >
> > >
> >
> /etc/hive/conf:/usr/lib/hive/lib/hive-serde-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/asm-commons-3.1.jar:/usr/lib/hive/lib/jdo-api-3.0.1.jar:/usr/lib/hive/lib/derbyclient-10.10.1.1.jar:/usr/lib/hive/lib/antlr-runtime-3.4.jar:/usr/lib/hive/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/lib/hive/lib/hive-service-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/lib/hive/lib/hive-shims-common-secure-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/derbynet-10.10.1.1.jar:/usr/lib/hive/lib/httpcore-4.2.5.jar:/usr/lib/hive/lib/jpam-1.1.jar:/usr/lib/hive/lib/hive-exec-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-metastore-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-httpclient-3.0.1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/hive/lib/guava-11.0.2.jar:/usr/lib/hive/lib/eigenbase-xom-1.3.4.jar:/usr/lib/hive/lib/commons-compiler-2.7.3.jar:/usr/lib/hive/lib/libfb303-0.9.0.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/libthrift-0.9.0.jar:/usr/lib/hive/lib/avro-1.7.5.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/hive-shims-common.jar:/usr/lib/hive/lib/stax-api-1.0.1.jar:/usr/lib/hive/lib/hive-shims-0.20-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-cli-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/oro-2.0.8.jar:/usr/lib/hive/lib/eigenbase-properties-1.1.4.jar:/usr/lib/hive/lib/hive-ant.jar:/usr/lib/hive/lib/zookeeper-3.4.5.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-hwi-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/commons-codec-1.4.jar:/usr/lib/hive/lib/mail-1.4.1.jar:/usr/lib/hive/lib/hive-shims-common-secure.jar:/usr/lib/hive/lib/servlet-api-2.5.jar:/usr/lib/hive/lib/optiq-core-0.5.jar:/usr/lib/hive/lib/ST4-4.0.4.jar:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/hive/lib/hive-common.jar:/usr/lib/hive/lib/httpclient-4.2.5.jar:/usr/lib/hive/lib/hive-hbase-handler-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-jdbc-0.13.0.2.1.7.0-784.ja
r:/usr/lib/hive/lib/hive-serde.jar:/usr/lib/hive/lib/derby-10.10.1.1.jar:/usr/lib/hive/lib/hive-hwi.jar:/usr/lib/hive/lib/optiq-avatica-0.5.jar:/usr/lib/hive/lib/hive-exec.jar:/usr/lib/hive/lib/hive-contrib.jar:/usr/lib/hive/lib/hive-contrib-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims.jar:/usr/lib/hive/lib/junit-4.10.jar:/usr/lib/hive/lib/jta-1.1.jar:/usr/lib/hive/lib/hive-jdbc.jar:/usr/lib/hive/lib/hive-ant-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-shims-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/antlr-2.7.7.jar:/usr/lib/hive/lib/hive-shims-0.23-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/hive-testutils.jar:/usr/lib/hive/lib/xz-1.0.jar:/usr/lib/hive/lib/commons-collections-3.1.jar:/usr/lib/hive/lib/hive-metastore.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/paranamer-2.3.jar:/usr/lib/hive/lib/jetty-all-7.6.0.v20120127.jar:/usr/lib/hive/lib/commons-compress-1.4.1.jar:/usr/lib/hive/lib/asm-tree-3.1.jar:/usr/lib/hive/lib/hive-cli.jar:/usr/lib/hive/lib/hive-beeline-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/janino-2.7.3.jar:/usr/lib/hive/lib/hive-shims-0.20S-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/groovy-all-2.1.6.jar:/usr/lib/hive/lib/hive-service.jar:/usr/lib/hive/lib/hive-shims-common-0.13.0.2.1.7.0-784.jar:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/datanucleus-core-3.2.10.jar:/usr/lib/hive/lib/ant-launcher-1.9.1.jar:/usr/lib/hive/lib/ant-1.9.1.jar:/usr/lib/hive/lib/hamcrest-core-1.1.jar:/usr/lib/hive/lib/snappy-java-1.0.5.jar:/usr/lib/hive/lib/stringtemplate-3.2.1.jar:/usr/lib/hive/lib/commons-io-2.4.jar:/usr/lib/hive/lib/hive-hbase-handler.jar:/usr/lib/hive/lib/servlet-api-2.5-20081211.jar:/usr/lib/hive/lib/tempus-fugit-1.1.jar:/usr/lib/hive/lib/linq4j-0.1.13.jar:/usr/lib/hive/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/lib/hive/lib/jetty-6.1.26.jar:/usr/lib/hive/lib/jetty-util-6.1.26.jar:/usr/lib/hive/lib/bonecp-0.8.0.RELEASE.j
ar:/usr/lib/hive/lib/hive-beeline.jar:/usr/lib/hive/lib/jsr305-1.3.9.jar:/usr/lib/hive/lib/activation-1.1.jar:/usr/lib/hive/lib/log4j-1.2.16.jar:/usr/lib/hive/lib/commons-logging-1.1.3.jar:/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.2.1.7.0-784.jar
> > >
> >
>
