[ https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024842#comment-17024842 ]
Kaige Liu commented on KYLIN-4362:
-----------------------------------

The split-by column is missing from the generated sqoop command: --split-by is passed without an argument, and the boundary query selects min and max of an empty column name. Could you please share your table DDL so we can debug this issue?

> Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
> ------------------------------------------------------------------------------
>
> Key: KYLIN-4362
> URL: https://issues.apache.org/jira/browse/KYLIN-4362
> Project: Kylin
> Issue Type: Bug
> Reporter: Sonu Singh
> Priority: Blocker
> Fix For: v3.0.0
>
>
> MR and Spark jobs are failing on HDP3.1 with the error below:
> -00 execute finished with exception
> java.io.IOException: OS command error exit with return code: 1, error message: Warning: /usr/hdp/3.0.1.0-187/accumulo does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 20/01/27 17:09:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7.3.0.1.0-187
> Missing argument for option: split-by
> The command is:
> /usr/hdp/current/sqoop-client/bin/sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true -Dmapreduce.job.queuename=default --connect "jdbc:vdb://XX.XX.XX.XX:XX/XXXXX" --driver com.XXXX.XX.jdbc.Driver --username XXXXX --password "XXXXXXX" --query "SELECT \`sales\`.\`locationdimensionid\` as \`SALES_LOCATIONDIMENSIONID\` ,\`sales\`.\`storeitemdimensionid\` as \`SALES_STOREITEMDIMENSIONID\` ,\`sales\`.\`basecostperunit\` as \`SALES_BASECOSTPERUNIT\` ,\`sales\`.\`createdby\` as \`SALES_CREATEDBY\` ,\`sales\`.\`updateddate\` as \`SALES_UPDATEDDATE\` FROM \`XXXX\`.\`sales\` \`sales\` WHERE 1=1 AND \$CONDITIONS" --target-dir hdfs://XX-master:8020/apps/XXX/XXX/kylin-4f367799-4993-bb67-da69-a9a147c62a1e/kylin_intermediate_cube_11_27012020_1d0a2dfd_bd66_d3e3_304b_9cd7f2018dbc --split-by --boundary-query "SELECT min(\`\`), max(\`\`) FROM \`XXXXXX\`.\`sales\` " --null-string '\\N' --null-non-string '\\N' --fields-terminated-by '|' --num-mappers 4
>     at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:88)
>     at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:43)
>     at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:54)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:62)
>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:106)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> 2020-01-27 17:09:19,362 INFO [Scheduler 1642300543 Job 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.ExecutableManager:466 : job id:4f367799-4993-bb67-da69-a9a147c62a1e-00 from RUNNING to ERROR
> 2020-01-27 17:09:19,365 ERROR [Scheduler 1642300543 Job 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.AbstractExecutable:173 : error running Executable: CubingJob{id=4f367799-4993-bb67-da69-a9a147c62a1e, name=BUILD CUBE - cube_11_27012020 - FULL_BUILD - UTC 2020-01-27 17:09:00, state=RUNNING}
> 2020-01-27 17:09:19,372 DEBUG [pool-7-thread-1] cachesync.Broadcaster:111 : Servers in the cluster: [localhost:7070]
> 2020-01-27 17:09:19,373 DEBUG [pool-7-thread-1] cachesync.Broadcaster:121 : Announcing new bro