Hello Everybody,
Thanks for all the help.
I successfully built Kylin on HDP 2.3 with the new updates to the branch, after
changing some dependencies in the pom file (I got an HCatalog/mapred version
conflict before). Once I changed the entries in the pom file to the exact
versions shipped with HDP 2.3, it worked.
<!-- Hadoop versions -->
<hadoop2.version>2.7.1</hadoop2.version>
<yarn.version>2.7.1</yarn.version>
<zookeeper.version>3.4.6</zookeeper.version>
<hive.version>1.2.1</hive.version>
<hive-hcatalog.version>1.2.1</hive-hcatalog.version>
<hbase-hadoop2.version>1.1.1</hbase-hadoop2.version>
<curator.version>2.7.1</curator.version>
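Since the codec trouble earlier in this thread came from a transitive dependency dragging in an old commons-codec, pinning the version explicitly is one way to keep it from regressing. This is only a sketch, not something the Kylin pom necessarily needs; the exact version (1.4 here) is my assumption and should match whatever your HDP 2.3 stack ships:

```xml
<!-- Hypothetical pin: force a single commons-codec across the whole build
     so no module can shade an older copy into the job jar.
     The version 1.4 is an assumption; match it to your cluster. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>commons-codec</groupId>
      <artifactId>commons-codec</artifactId>
      <version>1.4</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```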
However, I now get errors during HFile creation, both in the test run and when
I create a new cube from scratch. Any idea what this could be? During the test
run I thought it might be a bad test case, but it also happens when I build a
new cube from scratch.
Parameters:
-conf /kylin/kylin-1.1-incubating-SNAPSHOT/conf/kylin_job_conf.xml
-cubename aggtest
-input /kylin/kylin_metadata/kylin-0669bcc0-2a42-4189-a7c8-6abbb533da8c/aggtest/cuboid/*
-output hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/kylin-0669bcc0-2a42-4189-a7c8-6abbb533da8c/aggtest/hfile
-htablename KYLIN_0PV292NH5B
-jobname Kylin_HFile_Generator_aggtest_Step
Error message:
java.lang.IllegalArgumentException: Can not create a Path from a null string
    at org.apache.hadoop.fs.Path.checkPathArg(Path.java:122)
    at org.apache.hadoop.fs.Path.<init>(Path.java:134)
    at org.apache.hadoop.fs.Path.<init>(Path.java:88)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:591)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:440)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:405)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(HFileOutputFormat.java:91)
    at org.apache.kylin.job.hadoop.cube.CubeHFileJob.run(CubeHFileJob.java:86)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:113)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
result code: 2
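For what it's worth: in the HBase 1.1 source, the configurePartitioner frame near the top of this trace builds the partitions-file path from the `hbase.fs.tmp.dir` setting, so that key resolving to null would produce exactly this "Path from a null string" error when the job configuration is created without the HBase defaults. Purely as a diagnostic sketch (I have not confirmed this is the cause here), setting the key explicitly in kylin_job_conf.xml would rule it out:

```xml
<!-- Diagnostic sketch only: give HFileOutputFormat2 an explicit staging
     directory for its partitions file. The value is an arbitrary HDFS
     path; adjust it to your cluster. -->
<property>
  <name>hbase.fs.tmp.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
```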
On 10/9/15, 1:11 PM, "Benjamin Leonhardi" <[email protected]> wrote:
>Hello Everybody,
>
>We figured out the 1.2 code comes from the first dependency. We will try to
>update it.
>
>Thanks for your help.
>
>Ben
>
>
>
>On 10/6/15, 12:47 AM, "周千昊" <[email protected]> wrote:
>
>>Hi, Benjamin,
>> It is a bug; a JIRA ticket has been created here:
>>https://issues.apache.org/jira/browse/KYLIN-1059
>> It should now be fixed. Since I don't have an HDP 2.3 environment at
>>the moment, it would be appreciated if you could help verify with the
>>latest code (branch 1.x-HBase1.x).
>>
>>Luke Han <[email protected]>于2015年10月5日周一 下午8:35写道:
>>
>>> Hi Benjamin,
>>> Did you generate the binary package yourself with the instructions below,
>>> or just compile the war/jar from Maven?
>>>
>>> http://kylin.incubator.apache.org/development/howto_package.html
>>>
>>> Thanks.
>>>
>>>
>>> Best Regards!
>>> ---------------------
>>>
>>> Luke Han
>>>
>>> On Thu, Oct 1, 2015 at 1:30 PM, Benjamin Leonhardi <
>>> [email protected]> wrote:
>>>
>>> > Hello All,
>>> >
>>> > I am trying to get Kylin to run on the HDP 2.3 sandbox image, using the
>>> > 1.x-HBase1.x branch.
>>> >
>>> > https://github.com/apache/incubator-kylin/tree/1.x-HBase1.x
>>> >
>>> > It compiles fine and the tests run through until they reach the job
>>> > creation. I followed the steps below.
>>> >
>>> > http://kylin.incubator.apache.org/development/dev_env.html
>>> >
>>> > When I run the tests, they reach the job part and then fail because a
>>> > method is missing from commons-codec.jar. The method is
>>> > encodeBase64(byte[], boolean isChunked, boolean urlSafe); it exists in
>>> > commons-codec 1.4+ but not in older versions. I checked, and my
>>> > environment only has newer codec jars.
>>> >
>>> > However, when I build Kylin, it for some reason compiles an older codec
>>> > into kylin-job-1.1-incubating-SNAPSHOT-job.jar. If I unzip that jar, I
>>> > see a Base64 class that only has encodeBase64(byte[], boolean isChunked)
>>> > but not the urlSafe variant (so I suppose codec 1.3 or older). I checked
>>> > the pom.xml, thinking I might find an older codec dependency inside, but
>>> > did not find anything.
>>> >
>>> > I also have to say I am not the best Maven person, so I might be
>>> > overlooking something easy. One question I have is why the build packs
>>> > the codec classes into the -job.jar at all.
>>> >
>>> >
>>> > Any help welcome
>>> >
>>> > Best regards,
>>> >
>>> > Benjamin
>>> >
>>> >
>>> > Starting: Kylin_Fact_Distinct_Columns_test_kylin_cube_without_slr_empty_Step
>>> > L4J [2015-10-01 03:51:24,166][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - append job jar: /root/kylin_build/incubator-kylin/job/../job/target/kylin-job-1.1-incubating-SNAPSHOT-job.jar
>>> > L4J [2015-10-01 03:51:24,169][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - append kylin.hive.dependency: null and kylin.hbase.dependency: null to mapreduce.application.classpath
>>> > L4J [2015-10-01 03:51:24,167][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - append job jar: /root/kylin_build/incubator-kylin/job/../job/target/kylin-job-1.1-incubating-SNAPSHOT-job.jar
>>> > L4J [2015-10-01 03:51:24,170][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - append kylin.hive.dependency: null and kylin.hbase.dependency: null to mapreduce.application.classpath
>>> > L4J [2015-10-01 03:51:24,170][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - Hadoop job classpath is: /tmp/kylin/*,$HADOOP_CONF_DIR,/usr/hdp/2.3.0.0-2530/hbase/lib/hbase-common.jar,/usr/hdp/current/hive-client/conf/,/usr/hdp/2.3.0.0-2530/hive/lib/hive-metastore.jar,/usr/hdp/2.3.0.0-2530/hive/lib/hive-exec.jar,/usr/hdp/2.3.0.0-2530/hive-hcatalog/share/hcatalog/*,$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.3.0.0-2530/hadoop/lib/hadoop-lzo-0.6.0.2.3.0.0-2530.jar:/etc/hadoop/conf/secure
>>> > L4J [2015-10-01 03:51:24,169][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - Hadoop job classpath is: /tmp/kylin/*,$HADOOP_CONF_DIR,/usr/hdp/2.3.0.0-2530/hbase/lib/hbase-common.jar,/usr/hdp/current/hive-client/conf/,/usr/hdp/2.3.0.0-2530/hive/lib/hive-metastore.jar,/usr/hdp/2.3.0.0-2530/hive/lib/hive-exec.jar,/usr/hdp/2.3.0.0-2530/hive-hcatalog/share/hcatalog/*,$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/2.3.0.0-2530/hadoop/lib/hadoop-lzo-0.6.0.2.3.0.0-2530.jar:/etc/hadoop/conf/secure
>>> > L4J [2015-10-01 03:51:24,229][WARN][org.apache.hadoop.hive.conf.HiveConf] - HiveConf of name hive.heapsize does not exist
>>> > L4J [2015-10-01 03:51:24,230][WARN][org.apache.hadoop.hive.conf.HiveConf] - HiveConf of name hive.server2.enable.impersonation does not exist
>>> > L4J [2015-10-01 03:51:24,236][WARN][org.apache.hadoop.hive.conf.HiveConf] - HiveConf of name hive.heapsize does not exist
>>> > L4J [2015-10-01 03:51:24,237][WARN][org.apache.hadoop.hive.conf.HiveConf] - HiveConf of name hive.server2.enable.impersonation does not exist
>>> > L4J [2015-10-01 03:51:24,265][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - tempMetaFileString is : null
>>> > L4J [2015-10-01 03:51:24,266][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob] - tempMetaFileString is : null
>>> > L4J [2015-10-01 03:51:24,267][ERROR][org.apache.kylin.job.execution.AbstractExecutable] - error running Executable
>>> > java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64([BZZ)[B
>>> >     at org.apache.hive.hcatalog.common.HCatUtil.encodeBytes(HCatUtil.java:125)
>>> >     at org.apache.hive.hcatalog.common.HCatUtil.serialize(HCatUtil.java:104)
>>> >     at org.apache.hive.hcatalog.common.HCatUtil.getHiveConf(HCatUtil.java:585)
>>> >     at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:100)
>>> >     at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86)
>>> >     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>>> >     at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>>> >     at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:101)
>>> >     at org.apache.kylin.job.hadoop.cube.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:77)
>>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>> >     at org.apache.kylin.job.common.MapReduceExecutable.doWork(MapReduceExecutable.java:113)
>>> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>> >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
>>> >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
>>> >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >     at java.lang.Thread.run(Thread.java:745)
>>> >
>>>