Hi Andrey,

Firstly, thank you for testing our build. I have a few questions:

1. When you set kylin.source.default=16, you said you got "Oops… Failed to take action." Did you see which exception Kylin threw? Could you please show us the error message in kylin.log? Our patch only takes effect when kylin.source.default=16, so the error message Kylin throws when you set it to 8 is not what we care about in this issue/PR. The important thing is what happened when you saw "Oops… Failed to take action."
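Something like this should pull the relevant stack trace out of the backend log (the path assumes a default installation):

    grep -B 5 -A 40 "Failed to take action" $KYLIN_HOME/logs/kylin.log

If that message is not written to the log, please look for the nearest ERROR entries around the time you clicked the button.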
2. If you could provide more detail about your related config, maybe I can find something useful.

----------------
Best wishes,
Xiaoxiang Yu

From: Andrey Molotov <[email protected]>
Date: Thursday, 21 November 2019, 16:19
To: Xiaoxiang Yu <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: Kylin to PostgreSQL Error in Cube build Step 1

Hello, Sir.

I've installed the Kylin binary you provided. I have also prepared the data tables that you used to test your build (https://github.com/apache/kylin/pull/902). If I set the property kylin.source.default=16 and click "Load Table Metadata From Tree", I get the error "Oops… Failed to take action." So I was forced to use kylin.source.default=8. I prepared the model and cube just like you did, but I still get the error on the first step.

My env:
• PostgreSQL 9.5.20
• CDH 5.16.2
• Kylin built from the master branch
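For reference, these are the kylin.properties entries I am switching between (just a sketch of the relevant lines; per your note, 16 is the source ID your patch targets):

    # new JDBC source that the patch targets; with this, "Load Table Metadata From Tree" fails for me
    kylin.source.default=16
    # older JDBC source; tables load, but the Sqoop step then fails as in the log below
    #kylin.source.default=8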
Here is the log:

java.io.IOException: OS command error exit with return code: 1, error message:
Warning: /home/hadoop/sqoop/../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/sqoop/../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-11-21 10:51:09,835 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-11-21 10:51:09,872 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2019-11-21 10:51:09,982 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
2019-11-21 10:51:09,997 INFO manager.SqlManager: Using default fetchSize of 1000
2019-11-21 10:51:09,997 INFO tool.CodeGenTool: Beginning code generation
2019-11-21 10:51:10,443 INFO manager.SqlManager: Executing SQL statement: SELECT `FILM_PLAY`.`AUDIENCE_ID` as `FILM_PLAY_AUDIENCE_ID` ,`FILM_PLAY`.`FILM_ID` as `FILM_PLAY_FILM_ID` ,`FILM_PLAY`.`WATCH_TIME` ,`FILM_PLAY`.`PAYMENT` as `FILM_PLAY_PAYMENT` FROM `SC1`.`FILM_PLAY` as `FILM_PLAY` INNER JOIN `PUBLIC`.`FILM` as `FILM` ON `FILM_PLAY`.`FILM_ID` = `FILM`.`FILM_ID` INNER JOIN `SC2`.`AUDIENCE` as `AUDIENCE` ON `FILM_PLAY`.`AUDIENCE_ID` = `AUDIENCE`.`AUDIENCE_ID` WHERE 1=1 AND (`FILM_PLAY`.`WATCH_TIME` >= '2017-01-01 00:00:00' AND `FILM_PLAY`.`WATCH_TIME` < '2017-12-01 00:00:00') AND (1 = 0)
2019-11-21 10:51:10,454 ERROR manager.SqlManager: Error executing statement: org.postgresql.util.PSQLException: ERROR: syntax error at or near "." Position: 19
org.postgresql.util.PSQLException: ERROR: syntax error at or near "." Position: 19
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:161)
    at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:114)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:777)
    at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:786)
    at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:289)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:260)
    at org.apache.sqoop.manager.SqlManager.getColumnTypesForQuery(SqlManager.java:253)
    at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:336)
    at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1872)
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1671)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
2019-11-21 10:51:10,456 ERROR tool.ImportTool: Import failed: java.io.IOException: No columns to generate for ClassWriter
    at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1677)
    at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:106)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:501)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
The command is: /home/hadoop/sqoop/bin/sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true -Dmapreduce.job.queuename=default --connect "jdbc:postgresql://172.0.0.66:54321/test_db" --driver org.postgresql.Driver --username user --password pass --query "SELECT \`FILM_PLAY\`.\`AUDIENCE_ID\` as \`FILM_PLAY_AUDIENCE_ID\` ,\`FILM_PLAY\`.\`FILM_ID\` as \`FILM_PLAY_FILM_ID\` ,\`FILM_PLAY\`.\`WATCH_TIME\` ,\`FILM_PLAY\`.\`PAYMENT\` as \`FILM_PLAY_PAYMENT\` FROM \`SC1\`.\`FILM_PLAY\` as \`FILM_PLAY\` INNER JOIN \`PUBLIC\`.\`FILM\` as \`FILM\` ON \`FILM_PLAY\`.\`FILM_ID\` = \`FILM\`.\`FILM_ID\` INNER JOIN \`SC2\`.\`AUDIENCE\` as \`AUDIENCE\` ON \`FILM_PLAY\`.\`AUDIENCE_ID\` = \`AUDIENCE\`.\`AUDIENCE_ID\` WHERE 1=1 AND (\`FILM_PLAY\`.\`WATCH_TIME\` >= '2017-01-01 00:00:00' AND \`FILM_PLAY\`.\`WATCH_TIME\` < '2017-12-01 00:00:00') AND \$CONDITIONS" --target-dir hdfs://localhost:9000/kylin/kylin_metadata/kylin-f2a51db2-9e2c-42e5-56eb-3e318176a2f6/kylin_intermediate_git_7ab881cd_589d_0111_d7fb_c15e3cf066c5 --split-by \`FILM_PLAY\`.\`WATCH_TIME\` --boundary-query "SELECT min(\`FILM_PLAY\`.\`WATCH_TIME\`), max(\`FILM_PLAY\`.\`WATCH_TIME\`) FROM SC1.FILM_PLAY as FILM_PLAY WHERE \`FILM_PLAY\`.\`WATCH_TIME\` >= '2017-01-01 00:00:00' AND \`FILM_PLAY\`.\`WATCH_TIME\` < '2017-12-01 00:00:00'" --null-string '' --fields-terminated-by '|' --num-mappers 4
    at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:96)
    at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:50)
    at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:61)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:166)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:166)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

--
Best regards,
Andrey Molotov

On 31 Oct 2019, at 10:32, Andrey Molotov <[email protected]> wrote:

Hello! Thank you for your reply. Yes, could you send me the compiled JDBC driver for PostgreSQL so that I could also test it? Thank you.

—
Best Regards,
Molotov Andrey

On 30 Oct 2019, at 06:35, Xiaoxiang Yu <[email protected]> wrote:

Hi Molotov,

The PR is under review and test, and on my side it is OK. You can check the test screenshots on the page (https://github.com/apache/kylin/pull/902) to see whether it is tested well. If you want to test it in your env, please let me know and I will send the binary to you.

----------------
Best wishes,
Xiaoxiang Yu

On 2019/10/28 15:08, "Andrey Molotov" <[email protected]> wrote:

Hello, thank you for your answer. I pulled the commit you provided and compiled the jar files (two jar files, actually: kylin-source-jdbc-3.0.0-SNAPSHOT.jar and kylin-jdbc-3.0.0-SNAPSHOT.jar). Then, for each of these files, I did the following: renamed it and put it in place of the existing kylin-jdbc-2.6.4.jar in the kylin/lib directory. But unfortunately this did not help me resolve my problem with the backticks in the SQL query. Is there any other way to get a proper query line for PostgreSQL, or maybe I did something wrong? Thanks in advance.

On 16 Oct 2019, at 02:51, "[email protected]" <[email protected]> wrote:

Hi Molotov, because PostgreSQL's syntax and metadata have certain specialities, we need to do some development work. PR https://github.com/apache/kylin/pull/747 is doing this kind of thing; it is in review now.
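To illustrate the quoting difference (a minimal sketch using the table names from the logs in this thread): PostgreSQL only accepts ANSI double-quoted identifiers, while the generated SQL uses MySQL-style backticks:

    -- what Kylin currently emits; PostgreSQL rejects it with a syntax error:
    SELECT `FILM_PLAY`.`FILM_ID` FROM `SC1`.`FILM_PLAY` as `FILM_PLAY`;

    -- the ANSI form PostgreSQL expects (assuming the schema and table were created with these exact names):
    SELECT "FILM_PLAY"."FILM_ID" FROM "SC1"."FILM_PLAY" as "FILM_PLAY";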
On 15 Oct 2019, at 20:54, Andrey Molotov <[email protected]> wrote:

Hello, everyone.

I've set up Kylin to access a PostgreSQL database using JDBC, as described in http://kylin.apache.org/docs/tutorial/setup_jdbc_datasource.html. I've also set kylin.source.default=16 and kylin.source.hive.enable.quote=false in kylin.properties. But when I try to build a cube, I get an error on step #1 (Step Name: Sqoop To Flat Hive Table). My Kylin version is 2.6.4. Here is the log:

java.io.IOException: OS command error exit with return code: 1, error message:
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-10-15 08:40:23,908 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-10-15 08:40:23,936 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2019-10-15 08:40:24,004 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
2019-10-15 08:40:24,017 INFO manager.SqlManager: Using default fetchSize of 1000
2019-10-15 08:40:24,017 INFO tool.CodeGenTool: Beginning code generation
2019-10-15 08:40:24,164 INFO manager.SqlManager: Executing SQL statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", "installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", "installations"."install_datetime" FROM "data"."installations" AS "installations" WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND "installations"."install_datetime" < '2019-01-03') AND (1 = 0)
2019-10-15 08:40:24,176 INFO manager.SqlManager: Executing SQL statement: SELECT "installations"."city" AS "INSTALLATIONS_CITY", "installations"."device_type" AS "INSTALLATIONS_DEVICE_TYPE", "installations"."install_datetime" FROM "data"."installations" AS "installations" WHERE 1 = 1 AND ("installations"."install_datetime" >= '2019-01-01' AND "installations"."install_datetime" < '2019-01-03') AND (1 = 0)
2019-10-15 08:40:24,200 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/hadoop
Note: /tmp/sqoop-hadoop/compile/33bbb7f633bb5f8338ed0a8e1e7ce3cc/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2019-10-15 08:40:25,545 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/33bbb7f633bb5f8338ed0a8e1e7ce3cc/QueryResult.jar
2019-10-15 08:40:25,564 INFO mapreduce.ImportJobBase: Beginning query import.
2019-10-15 08:40:25,565 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-10-15 08:40:25,650 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2019-10-15 08:40:26,327 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2019-10-15 08:40:26,392 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
2019-10-15 08:40:26,640 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1571114338286_0004
2019-10-15 08:40:33,776 INFO db.DBInputFormat: Using read commited transaction isolation
2019-10-15 08:40:33,777 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN("installations"."install_datetime"), MAX("installations"."install_datetime") FROM "data"."installations" AS "installations" WHERE `INSTALLATIONS`.`INSTALL_DATETIME` >= '2019-01-01' AND `INSTALLATIONS`.`INSTALL_DATETIME` < '2019-01-03'
2019-10-15 08:40:33,785 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1571114338286_0004
2019-10-15 08:40:33,789 ERROR tool.ImportTool: Import failed: java.io.IOException: org.postgresql.util.PSQLException: ERROR: syntax error at or near "." Position: 164
    at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:207)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
    at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:200)
    at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:173)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:270)
    at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:748)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:522)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "." Position: 164
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2497)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2233)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:446)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:370)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:311)
    at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:297)
    at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:274)
    at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:225)
    at org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:178)
    ... 22 more
The command is: /opt/sqoop/bin/sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true -Dmapreduce.job.queuename=default --connect "jdbc:postgresql://172.0.0.1:8080/database" --driver org.postgresql.Driver --username test --password test --query "SELECT \"installations\".\"city\" AS \"INSTALLATIONS_CITY\", \"installations\".\"device_type\" AS \"INSTALLATIONS_DEVICE_TYPE\", \"installations\".\"install_datetime\" FROM \"data\".\"installations\" AS \"installations\" WHERE 1 = 1 AND (\"installations\".\"install_datetime\" >= '2019-01-01' AND \"installations\".\"install_datetime\" < '2019-01-03') AND \$CONDITIONS" --target-dir hdfs://localhost:9000/kylin/kylin_metadata/kylin-530cd3bb-f258-6ff8-9c38-9514a2258848/kylin_intermediate_7_5c51bb9d_b1ad_1135_a08f_12139ba64070 --split-by \"installations\".\"install_datetime\" --boundary-query "SELECT MIN(\"installations\".\"install_datetime\"), MAX(\"installations\".\"install_datetime\") FROM \"data\".\"installations\" AS \"installations\" WHERE \`INSTALLATIONS\`.\`INSTALL_DATETIME\` >= '2019-01-01' AND \`INSTALLATIONS\`.\`INSTALL_DATETIME\` < '2019-01-03'" --null-string '' --fields-terminated-by '|' --num-mappers 4
    at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:96)
    at org.apache.kylin.source.jdbc.sqoop.SqoopCmdStep.sqoopFlatHiveTable(SqoopCmdStep.java:59)
    at org.apache.kylin.source.jdbc.sqoop.SqoopCmdStep.doWork(SqoopCmdStep.java:70)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Please help me resolve this problem.
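P.S. Note the mixed quoting in the BoundingValsQuery above: the SELECT/FROM part uses double quotes, but the WHERE clause falls back to upper-cased, backtick-quoted identifiers, which I suspect is exactly what PostgreSQL rejects. A consistently double-quoted boundary query (just a sketch of the SQL I would expect, not what Kylin generates today) would look like:

    SELECT MIN("installations"."install_datetime"), MAX("installations"."install_datetime")
    FROM "data"."installations" AS "installations"
    WHERE "installations"."install_datetime" >= '2019-01-01'
      AND "installations"."install_datetime" < '2019-01-03'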
