Hi lk, I ran a test using beeline to connect to Hive, and the "Redistribute Flat Hive Table" step passed successfully. Below is my configuration:
> kylin.source.hive.client=beeline
> kylin.source.hive.beeline-shell=beeline
> kylin.source.hive.beeline-params=-n root --hiveconf hive.security.authorization.sqlstd.confwhitelist.append='mapreduce.job.*|dfs.*' -u 'jdbc:hive2://cdh1.cloudera.com:2181,cdh2.cloudera.com:2181,cdh3.cloudera.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2'

On Thu, Apr 18, 2019 at 3:19 PM lk_hadoop <[email protected]> wrote:

> Hi Chao Long, thanks for your reply. In the first step I can see these logs:
>
> EOL
> beeline -n hive -p hiveadmin --hiveconf hive.security.authorization.sqlstd.confwhitelist.append='mapreduce.job.*|dfs.*' -u jdbc:hive2://"bdp-scm-04:2181,bdp-scm-03:2181,bdp-scm-05:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" --hiveconf hive.merge.mapredfiles=false --hiveconf hive.auto.convert.join=true --hiveconf dfs.replication=2 --hiveconf hive.exec.compress.output=true --hiveconf hive.auto.convert.join.noconditionaltask=true --hiveconf mapreduce.job.split.metainfo.maxsize=-1 --hiveconf hive.merge.mapfiles=false --hiveconf hive.auto.convert.join.noconditionaltask.size=100000000 --hiveconf hive.stats.autogather=true -f /tmp/cfadac57-d586-446b-a798-96a9c37e34b2.hql;ret_code=$?;rm -f /tmp/cfadac57-d586-446b-a798-96a9c37e34b2.hql;exit $ret_code
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/devuser/bdp/env/hbase-1.2.0-cdh5.14.0/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/devuser/bdp/env/hadoop-2.6.0-cdh5.14.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> scan complete in 2ms
> Connecting to jdbc:hive2://bdp-scm-04:2181,bdp-scm-03:2181,bdp-scm-05:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> 19/04/18 10:54:45 [main]: INFO jdbc.HiveConnection: Connected to bdp-scm-06:10000
> Connected to: Apache Hive (version 1.1.0-cdh5.14.0)
> Driver: Hive JDBC (version 1.1.0-cdh5.14.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> 0: jdbc:hive2://bdp-scm-04:2181,bdp-scm-03:21> USE mykylin;
> INFO : Compiling command(queryId=hive_20190418105454_f4e97998-7295-4c33-925f-654624f67c6c): USE mykylin
> INFO : Semantic Analysis Completed
> INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO : Completed compiling command(queryId=hive_20190418105454_f4e97998-7295-4c33-925f-654624f67c6c); Time taken: 0.074 seconds
> INFO : Executing command(queryId=hive_20190418105454_f4e97998-7295-4c33-925f-654624f67c6c): USE mykylin
> INFO : Starting task [Stage-0:DDL] in serial mode
> INFO : Completed executing command(queryId=hive_20190418105454_f4e97998-7295-4c33-925f-654624f67c6c); Time taken: 0.008 seconds
> INFO : OK
> No rows affected (0.122 seconds)
>
> So does that mean the first step also uses beeline? I have also tested this in an SSH client: if the URL is not enclosed in double quotes, Hive cannot parse it correctly.
>
> 2019-04-18
> ------------------------------
> lk_hadoop
> ------------------------------
>
> *From:* Chao Long <[email protected]>
> *Sent:* 2019-04-18 15:08
> *Subject:* Re: why kylin job failed when use beeline with zookeeper
> *To:* "user"<[email protected]>
> *Cc:*
>
> Hi lk,
> The first step uses SSHClient to run the "Create Hive Table" command, so I think it does not use beeline to connect to Hive.
> The "Redistribute Flat Hive Table" step needs to compute the row count of the flat table, so it uses beeline to connect if you have configured it.
> And I see that the ZooKeeper connection string is enclosed in double quotes. Is that the right way?
>
> On Thu, Apr 18, 2019 at 11:04 AM lk_hadoop <[email protected]> wrote:
>
>> Hi all,
>> I'm using kylin-2.6.1-bin-cdh57. When I connect to Hive with beeline:
>> kylin.source.hive.beeline-params=-n hive -p hiveadmin --hiveconf hive.security.authorization.sqlstd.confwhitelist.append='mapreduce.job.*|dfs.*' -u jdbc:hive2://"bdp-scm-04:2181,bdp-scm-03:2181,bdp-scm-05:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>> I get an error at the step "Redistribute Flat Hive Table":
>>
>> java.lang.IllegalArgumentException: Illegal character in path at index 86: hive2://dummyhost:00000/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>>   at java.net.URI.create(URI.java:852)
>>   at org.apache.hive.jdbc.Utils.parseURL(Utils.java:302)
>>   at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:122)
>>   at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
>>   at java.sql.DriverManager.getConnection(DriverManager.java:664)
>>   at java.sql.DriverManager.getConnection(DriverManager.java:208)
>>   at org.apache.kylin.source.hive.BeelineHiveClient.init(BeelineHiveClient.java:72)
>>   at org.apache.kylin.source.hive.BeelineHiveClient.<init>(BeelineHiveClient.java:66)
>>   at org.apache.kylin.source.hive.HiveClientFactory.getHiveClient(HiveClientFactory.java:29)
>>   at org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.computeRowCount(RedistributeFlatHiveTableStep.java:40)
>>   at org.apache.kylin.source.hive.RedistributeFlatHiveTableStep.doWork(RedistributeFlatHiveTableStep.java:91)
>>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:166)
>>   at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>>   at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:166)
>>   at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>   at java.lang.Thread.run(Thread.java:748)
>> Caused by: java.net.URISyntaxException: Illegal character in path at index 86: hive2://dummyhost:00000/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>>   at java.net.URI$Parser.fail(URI.java:2848)
>>
>> I don't know why, because I can pass the first step, which also uses the same JDBC URL.
>>
>> 2019-04-18
>> ------------------------------
>> lk_hadoop
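
For anyone who wants to reproduce the parse failure from the trace outside Kylin, here is a minimal sketch. It only feeds java.net.URI the string shown in the exception message (with and without the trailing double quote); it does not call any Hive or Kylin code. The class name QuotedJdbcUrlDemo and the helper parseError are made up for this illustration. A likely reading of the trace is that a shell strips the quotes before beeline sees the URL, while the JDBC path in the trace receives them literally, but that part is an assumption and not something this sketch proves.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class QuotedJdbcUrlDemo {
    // Path and parameters copied from the exception message in the thread.
    static final String SUFFIX =
        "/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2";

    /** Returns null if java.net.URI accepts the string, otherwise the parser's reason and index. */
    static String parseError(String uri) {
        try {
            new URI(uri);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        // "dummyhost:00000" is the placeholder authority that appears in the stack trace.
        String clean = "hive2://dummyhost:00000" + SUFFIX;
        // Same string with the stray closing double quote left at the end.
        String withQuote = clean + "\"";

        System.out.println("without quote: " + parseError(clean));
        System.out.println("with quote:    " + parseError(withQuote));
    }
}
```

If the quoted variant is rejected here as well, quoting the whole -u value with single quotes, as in the configuration at the top of this mail, or dropping the quotes entirely, is the thing to try.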
