[jira] [Commented] (KYLIN-4383) Kylin Integrated Issue with Amazon EMR and AWS Glue in HiveMetaStoreClientFactory.java
[ https://issues.apache.org/jira/browse/KYLIN-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041534#comment-17041534 ]

Kaige Liu commented on KYLIN-4383:
---

[~shtian] You can use this as a workaround until Xiaoxiang fixes it:
https://issues.apache.org/jira/browse/KYLIN-3685?focusedCommentId=17002995=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17002995

> Kylin Integrated Issue with Amazon EMR and AWS Glue in HiveMetaStoreClientFactory.java
> --
>
> Key: KYLIN-4383
> URL: https://issues.apache.org/jira/browse/KYLIN-4383
> Project: Kylin
> Issue Type: Bug
> Components: Integration
> Affects Versions: v3.0.0
> Environment: Amazon EMR 5.29.0 + AWS Glue
> Reporter: Tian Shi
> Assignee: Xiaoxiang Yu
> Priority: Blocker
> Labels: AWS, easyfix
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Following the official docs with [link Install Kylin on AWS EMR|http://kylin.apache.org/docs31/install/kylin_aws_emr.html], when choosing AWS Glue as the Hive metastore, the configuration "kylin.source.hive.metadata-type=gluecatalog" does not take effect.
> Then I cloned the master branch, compiled Kylin from the source code (3.1.0-SNAPSHOT) and deployed it on EMR with Glue. On "Load table from the tree", I encountered the error "cannot create metastore client gluecatalog".

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4079) Concurrent query requests using Query API makes the query execution take too much time
[ https://issues.apache.org/jira/browse/KYLIN-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026078#comment-17026078 ]

Kaige Liu commented on KYLIN-4079:
---

That makes sense. Every query engine has a concurrency threshold. For Kylin, it usually depends on your HBase cluster, the resources (CPU/memory) the Kylin server can get, and the complexity of your queries. This article shows Kylin reaching 90 QPS:
[https://kyligence.io/blog/how-ciscos-big-data-team-improved-apache-kylins-high-concurrent-throughput-by-5x/]

There is always room to increase QPS in Kylin. If you can find the bottleneck in your environment and share more information with the community, that would be great!

> Concurrent query requests using Query API makes the query execution take too much time
> --
>
> Key: KYLIN-4079
> URL: https://issues.apache.org/jira/browse/KYLIN-4079
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Reporter: Gladson Vas
> Priority: Blocker
> Attachments: NormalQueryExecution.jpg, NormalQueryExecution1.jpg, SlowQueryExecution.jpg, SlowQueryExecution1.jpg
>
> Hi,
> When 40 queries are executed on Kylin in parallel (at once), the response time distribution is as below, with min, max and mean response times of 6557, 7580 and 6887 ms respectively.
> Kylin doesn't seem to handle parallel execution of queries.
> !SlowQueryExecution.jpg!
> !SlowQueryExecution1.jpg!
> But when 40 queries are executed over 40 seconds, i.e. one query per second, the response time distribution is as below, with min, max and mean response times of 110, 2820 and 248 ms respectively.
> This seems fine, as the mean response time is less than a second.
> !NormalQueryExecution.jpg!
> !NormalQueryExecution1.jpg!
> There seems to be some problem when queries are fired at Kylin in parallel. Please help.
[jira] [Commented] (KYLIN-4362) Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
[ https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025905#comment-17025905 ]

Kaige Liu commented on KYLIN-4362:
---

Hi [~codingforfun] [~sonuSINGH],

I believe the link weibin posted does not point to the root cause. The missing $ACCUMULO_HOME is only a warning message. The root cause is that Kylin did not fetch the correct split-by column and put an empty parameter in the sqoop command.

[~sonuSINGH] If you can share your table DDL here, that will be very helpful for debugging. By table DDL, I mean the db name, table name, column names and data types. Thanks.

> Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
> --
>
> Key: KYLIN-4362
> URL: https://issues.apache.org/jira/browse/KYLIN-4362
> Project: Kylin
> Issue Type: Bug
> Reporter: Sonu Singh
> Assignee: weibin0516
> Priority: Blocker
> Fix For: v3.0.0
> Attachments: image-2020-01-28-11-39-59-098.png
>
> MR and Spark jobs are failing on HDP 3.1 with the error below:
> -00 execute finished with exception
> java.io.IOException: OS command error exit with return code: 1, error message: Warning: /usr/hdp/3.0.1.0-187/accumulo does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 20/01/27 17:09:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7.3.0.1.0-187 > Missing argument for option: split-by > The command is: > /usr/hdp/current/sqoop-client/bin/sqoop import > -Dorg.apache.sqoop.splitter.allow_text_splitter=true > -Dmapreduce.job.queuename=default --connect "jdbc:vdb:/ > /XX.XX.XX.XX:XX/X" --driver com..XX.jdbc.Driver --username X > --password "XXX" --query "SELECT \`sales\`.\`locationdim ensionid\` as > \`SALES_LOCATIONDIMENSIONID\` ,\`sales\`.\`storeitemdimensionid\` as > \`SALES_STOREITEMDIMENSIONID\` ,\`sales\`.\`basecostperunit\` as \`SALES_ > BASECOSTPERUNIT\` ,\`sales\`.\`createdby\` as \`SALES_CREATEDBY\` > ,\`sales\`.\`updateddate\` as \`SALES_UPDATEDDATE\` FROM \`\`.\`sales\` > \`sale s\` WHERE 1=1 AND \$CONDITIONS" --target-dir > hdfs://XX-master:8020/apps/XXX/XXX/kylin-4f367799-4993-bb67-da69-a9a147c62a1e/kylin_intermediate_cube_11_2701 > 2020_1d0a2dfd_bd66_d3e3_304b_9cd7f2018dbc --split-by --boundary-query > "SELECT min(\`\`), max(\`\`) FROM \`XX\`.\`sales\` " --null-string '\\N' > --n ull-non-string '\\N' --fields-terminated-by '|' --num-mappers 4 > at > org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:88) > at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:43) > at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:54) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:62) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:106) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > 2020-01-27 17:09:19,362 INFO [Scheduler 1642300543 Job > 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.ExecutableManager:466 : > job id:4f367799-4993-bb6 7-da69-a9a147c62a1e-00 from RUNNING to ERROR > 2020-01-27 17:09:19,365 ERROR [Scheduler 1642300543 Job > 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.AbstractExecutable:173 : > error running Executabl e: > CubingJob\{id=4f367799-4993-bb67-da69-a9a147c62a1e, name=BUILD CUBE - > cube_11_27012020 - FULL_BUILD - UTC 2020-01-27 17:09:00, state=RUNNING} > 2020-01-27 17:09:19,372 DEBUG [pool-7-thread-1] cachesync.Broadcaster:111 : > Servers in the cluster: [localhost:7070] > 2020-01-27 17:09:19,373 DEBUG [pool-7-thread-1] cachesync.Broadcaster:121 : > Announcing new bro > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-4362) Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
[ https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024842#comment-17024842 ]

Kaige Liu commented on KYLIN-4362:
---

The split-by column is missing from the generated sqoop command. Could you please share your table DDL so we can debug this issue?

> Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
> --
>
> Key: KYLIN-4362
> URL: https://issues.apache.org/jira/browse/KYLIN-4362
> Project: Kylin
> Issue Type: Bug
> Reporter: Sonu Singh
> Priority: Blocker
> Fix For: v3.0.0
>
> MR and Spark jobs are failing on HDP 3.1 with the error below:
> -00 execute finished with exception
> java.io.IOException: OS command error exit with return code: 1, error message: Warning: /usr/hdp/3.0.1.0-187/accumulo does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 20/01/27 17:09:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7.3.0.1.0-187 > Missing argument for option: split-by > The command is: > /usr/hdp/current/sqoop-client/bin/sqoop import > -Dorg.apache.sqoop.splitter.allow_text_splitter=true > -Dmapreduce.job.queuename=default --connect "jdbc:vdb:/ > /XX.XX.XX.XX:XX/X" --driver com..XX.jdbc.Driver --username X > --password "XXX" --query "SELECT \`sales\`.\`locationdim ensionid\` as > \`SALES_LOCATIONDIMENSIONID\` ,\`sales\`.\`storeitemdimensionid\` as > \`SALES_STOREITEMDIMENSIONID\` ,\`sales\`.\`basecostperunit\` as \`SALES_ > BASECOSTPERUNIT\` ,\`sales\`.\`createdby\` as \`SALES_CREATEDBY\` > ,\`sales\`.\`updateddate\` as \`SALES_UPDATEDDATE\` FROM \`\`.\`sales\` > \`sale s\` WHERE 1=1 AND \$CONDITIONS" --target-dir > hdfs://XX-master:8020/apps/XXX/XXX/kylin-4f367799-4993-bb67-da69-a9a147c62a1e/kylin_intermediate_cube_11_2701 > 2020_1d0a2dfd_bd66_d3e3_304b_9cd7f2018dbc --split-by --boundary-query > "SELECT min(\`\`), max(\`\`) FROM \`XX\`.\`sales\` " --null-string '\\N' > --n ull-non-string '\\N' --fields-terminated-by '|' --num-mappers 4 > at > org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:88) > at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:43) > at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:54) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171) > at > org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:62) > at > org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171) > at > org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:106) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at 
java.lang.Thread.run(Thread.java:745) > 2020-01-27 17:09:19,362 INFO [Scheduler 1642300543 Job > 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.ExecutableManager:466 : > job id:4f367799-4993-bb6 7-da69-a9a147c62a1e-00 from RUNNING to ERROR > 2020-01-27 17:09:19,365 ERROR [Scheduler 1642300543 Job > 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.AbstractExecutable:173 : > error running Executabl e: > CubingJob\{id=4f367799-4993-bb67-da69-a9a147c62a1e, name=BUILD CUBE - > cube_11_27012020 - FULL_BUILD - UTC 2020-01-27 17:09:00, state=RUNNING} > 2020-01-27 17:09:19,372 DEBUG [pool-7-thread-1] cachesync.Broadcaster:111 : > Servers in the cluster: [localhost:7070] > 2020-01-27 17:09:19,373 DEBUG [pool-7-thread-1] cachesync.Broadcaster:121 : > Announcing new bro > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KYLIN-4331) Follow compliant rules in Sonar
Kaige Liu created KYLIN-4331:
-

Summary: Follow compliant rules in Sonar
Key: KYLIN-4331
URL: https://issues.apache.org/jira/browse/KYLIN-4331
Project: Kylin
Issue Type: Improvement
Reporter: Kaige Liu
Assignee: Kaige Liu

When a method in a child class has the same signature as a method in a parent class, it is assumed to be an override. However, that's not the case when:
* the parent class method is {{static}} and the child class method is not.
* the arguments or return types of the child method are in different packages than those of the parent method.
* the parent class method is {{private}}.

Typically, these things are done unintentionally; the private parent class method is overlooked, the {{static}} keyword in the parent declaration is overlooked, or the wrong class is imported in the child. But if the intent is truly for the child class method to be different, then the method should be renamed to prevent confusion.

[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWP9sMDe3e-qcckjAB5V=AWP9sMDe3e-qcckjAB5V]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWcaThjuH5xombRgErVV=AWcaThjuH5xombRgErVV]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWxmA253_Xcr_PhA6-Br=AWxmA253_Xcr_PhA6-Br]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWxmA253_Xcr_PhA6-Bw=AWxmA253_Xcr_PhA6-Bw]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWxmA253_Xcr_PhA6-By=AWxmA253_Xcr_PhA6-By]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWxmA253_Xcr_PhA6-B1=AWxmA253_Xcr_PhA6-B1]
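To make the {{private}} case from KYLIN-4331 concrete, here is a minimal sketch. The class and method names (OverrideDemo, Parent, Child, name, describe) are invented for illustration and are not taken from the Kylin code base.

```java
// Hypothetical classes, for illustration only (not Kylin code).
class OverrideDemo {

    static class Parent {
        // private: not inherited, so a subclass can never override it.
        private String name() {
            return "parent";
        }

        String describe() {
            // Binds to Parent.name(); private methods are not dispatched virtually.
            return name();
        }
    }

    static class Child extends Parent {
        // Same signature, but this is a new, unrelated method.
        // Annotating it with @Override would be a compile-time error.
        String name() {
            return "child";
        }
    }

    static String describeChild() {
        return new Child().describe();
    }

    public static void main(String[] args) {
        System.out.println(describeChild()); // prints "parent"
    }
}
```

A reader who assumes Child.name() is an override would expect "child" here, which is the confusion the rule is meant to prevent. Renaming the child method, or adding @Override wherever an override is really intended, lets the compiler catch the mismatch.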
[jira] [Commented] (KYLIN-3685) AWS Glue Catalog Not Supported
[ https://issues.apache.org/jira/browse/KYLIN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002995#comment-17002995 ]

Kaige Liu commented on KYLIN-3685:
---

Hi [~rjarvis], [~rongneng.wei],

There is a workaround for this issue. You can try the steps below:

1) Use beeline instead of the Hive CLI to connect to the Hive metastore. Change the configuration in kylin.properties:
{quote}
kylin.source.hive.client=beeline
kylin.source.hive.beeline-params=-u jdbc:hive2://ip-172-31-84-101.ec2.internal:1 -n root
{quote}

2) Copy the missing jars:
{quote}
cp /usr/share/aws/hmclient/lib/aws-glue-datacatalog-hive2-client-1.11.0.jar $KYLIN_HOME/ext
cp $KYLIN_HOME/spark/jars/joda-time-2.9.3.jar $KYLIN_HOME/lib
{quote}

I have tried this on AWS EMR 5.28. It works well.

_*Root cause analysis*_

1. Kylin connects to the Hive metastore via HiveMetaStoreClient like this:
{code:java}
private HiveMetaStoreClient getMetaStoreClient() throws Exception {
    if (metaStoreClient == null) {
        metaStoreClient = new HiveMetaStoreClient(hiveConf);
    }
    return metaStoreClient;
}
{code}
This ignores the following configuration in hive-site.xml, because it instantiates the client directly:
{quote}
hive.metastore.client.factory.class
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory
{quote}
After switching to beeline, the client is no longer created by Kylin, and beeline handles this properly.

2. We need to add /usr/share/aws/hmclient/lib/aws-glue-datacatalog-hive2-client-1.11.0.jar to the classpath to avoid the error below:
{quote}
java.lang.RuntimeException: java.io.IOException: MetaException(message:Unable to instantiate a metastore client factory com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory due to: java.lang.ClassNotFoundException: Class com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory not found)
 at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
 at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
 at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
 at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:144)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
 at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
{quote}

3. Why do we need to copy joda-time-2.9.3.jar to $KYLIN_HOME/lib? The AWS Java SDK uses a newer version of joda-time, while HBase brings in an old version of joda-time (< 2.0) shipped with jruby-complete-1.6.8.jar. Putting the new version in $KYLIN_HOME/lib makes it appear before jruby-complete-1.6.8.jar in the classpath. Otherwise, the error below occurs:
{quote}
org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
 at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:194)
 at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kylin.job.exception.ExecuteException: com.google.common.util.concurrent.ExecutionError: java.lang.NoSuchMethodError: org.joda.time.format.DateTimeFormatter.withZoneUTC()Lorg/joda/time/format/DateTimeFormatter;
{quote}

> AWS Glue Catalog Not Supported
> --
>
> Key: KYLIN-3685
> URL: https://issues.apache.org/jira/browse/KYLIN-3685
> Project: Kylin
> Issue Type: Bug
> Components: Integration
> Affects Versions: v2.5.0
> Reporter: Richard Jarvis
> Assignee: Kaige Liu
> Priority: Major
>
> I am trying to use Kylin on AWS (EMR 5.18.0).
> I use AWS Glue as the catalog and as a result Kylin can't find the tables.
> I am able to see the schemas and tables in the GUI because I have set the AWS glue properties in hive-site.xml:
>
> hive.metastore.client.factory.class
>
[jira] [Created] (KYLIN-4311) Fix bugs in Sonar to be compliant
Kaige Liu created KYLIN-4311:
-

Summary: Fix bugs in Sonar to be compliant
Key: KYLIN-4311
URL: https://issues.apache.org/jira/browse/KYLIN-4311
Project: Kylin
Issue Type: Improvement
Reporter: Kaige Liu
Assignee: Kaige Liu

By contract, any implementation of the {{java.util.Iterator.next()}} method should throw a {{NoSuchElementException}} exception when the iteration has no more elements. Any other behavior when the iteration is done could lead to unexpected behavior for users of this {{Iterator}}.

[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWExwNw9ikuHJGLsvan_=AWExwNw9ikuHJGLsvan_]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWExwNxOikuHJGLsvaoZ=AWExwNxOikuHJGLsvaoZ]
[https://sonarcloud.io/project/issues?id=org.apache.kylin%3Akylin=AWExwOHQikuHJGLsvbDO=AWExwOHQikuHJGLsvbDO]
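The Iterator contract described in KYLIN-4311 can be sketched as follows. RangeIterator is a hypothetical class made up for this illustration; it is not one of the Kylin iterators flagged by Sonar.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over the half-open integer range [start, end).
class RangeIterator implements Iterator<Integer> {
    private int next;
    private final int end;

    RangeIterator(int start, int end) {
        this.next = start;
        this.end = end;
    }

    @Override
    public boolean hasNext() {
        return next < end;
    }

    @Override
    public Integer next() {
        // Honor the contract: fail loudly instead of returning null or a stale value.
        if (!hasNext()) {
            throw new NoSuchElementException("range exhausted at " + end);
        }
        return next++;
    }

    public static void main(String[] args) {
        Iterator<Integer> it = new RangeIterator(0, 2);
        System.out.println(it.next()); // prints 0
        System.out.println(it.next()); // prints 1
        try {
            it.next();
        } catch (NoSuchElementException e) {
            System.out.println("exhausted");
        }
    }
}
```

Guarding next() with hasNext() keeps the end-of-iteration check in one place, so callers that skip hasNext() still get a well-defined exception.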
[jira] [Assigned] (KYLIN-3685) AWS Glue Catalog Not Supported
[ https://issues.apache.org/jira/browse/KYLIN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3685: - Assignee: Kaige Liu > AWS Glue Catalog Not Supported > -- > > Key: KYLIN-3685 > URL: https://issues.apache.org/jira/browse/KYLIN-3685 > Project: Kylin > Issue Type: Bug > Components: Integration >Affects Versions: v2.5.0 >Reporter: Richard Jarvis >Assignee: Kaige Liu >Priority: Major > > I am trying to use Kylin on AWS (EMR 5.18.0). > I use AWS Glue as the catalog and as a result Kylin can't find the tables. > I am able to see the schemas and tables in the GUI because I have set the AWS > glue properties in hive-site.xml: > > > hive.metastore.client.factory.class > com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory > > However, the job > org.apache.kylin.source.hive.cardinality.HiveColumnCardinalityJob fails to > find the tables (it's looking in the Hive metadata catalog instead of AWS > Glue). > I think this is because Hive 1.2.1 is too old to support the client factory > class. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KYLIN-3672) Performance is poor when multiple queries occur in short period
[ https://issues.apache.org/jira/browse/KYLIN-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679588#comment-16679588 ]

Kaige Liu commented on KYLIN-3672:
---

Impressive work and cool analysis!

> Performance is poor when multiple queries occur in short period
> ---
>
> Key: KYLIN-3672
> URL: https://issues.apache.org/jira/browse/KYLIN-3672
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.5.0
> Environment: CentOS 6.7, HBase 1.2.0+cdh5.14.2+456
> Reporter: Zongwei Li
> Assignee: Zongwei Li
> Priority: Critical
> Labels: patch, performance
> Fix For: v2.6.0
> Attachments: KYLIN-3672.master.001.patch, TrendChartBeforeFix.png, codeChangedCausedThisBug.png, jstackBeforeBugFix.log
>
> Hi, Kylin Team
> We found a Kylin performance bug during performance tuning for our BI report, which is integrated with Kylin.
>
> +Background+
> Our BI report shows usage data to enterprise customers and provides 15 usage charts on the report page.
> Each chart sends an API request to Kylin with a different SQL statement, so one user triggers 15 API calls (via JDBC) to Kylin.
> At our product scale, each Kylin query node needs to support at least 20 users reviewing the report at the same time.
> This means each Kylin node should be able to handle 15 * 20 = 300 queries per second.
>
> +Performance Report+
> To reduce the network impact, we built the Kylin cluster and the testing machine in the same network as the Hadoop system.
> We used Gatling and JMeter to run several rounds of testing, with the results below.
>
> |Thread|Handled Queries (in 60 seconds)|Handled Queries (per second)|Mean Response Time (ms)|
> |1|773|13|77|
> |15|3245|54|279|
> |25|3844|64|390|
> |50|4912|82|612|
> |75|5405|90|841|
> |100|5436|91|1108|
> |150|5434|91|1688|
>
> The trend chart is as follows:
> !TrendChartBeforeFix.png!
>
> +Conclusion+
> From the trend, when the thread count reaches 75, the handled queries per second peak at 90 and cannot be improved by increasing the thread count.
> Each Kylin query engine can handle 90 queries per second, which means it only supports 90/15 = 6 users reviewing the report page at the same time.
> Even if we set up 3 query nodes, which extends this to 18 users at the same time, this capacity cannot meet our business requirement.
>
> +Analyze+
> From the test results, the response for one thread is fast, but as the thread count increases, the throughput of Kylin does not increase as we expected.
> We did a full code review of the Kylin query engine, used jstack and JProfiler for analysis, and found the root cause of this performance bottleneck.
> It is a regression introduced by a new feature added a year earlier.
> With the bug fixed, one Kylin node can handle 350+ queries per second. We are submitting this bug to contribute the patch to Kylin.
>
> +Jstack Log Analyze+
> We used jstack to capture thread info during performance testing. One of the captures is attached as 'jstackBeforeBugFix.log'.
> From the log, we found the following.
> One thread locked at sun.misc.URLClassPath.getNextLoader. TID is 0x00048007a180
>
> "Query e9c44a2d-6226-ff3b-f984-ce8489107d79-3425" #3425 daemon prio=5 os_prio=0 tid=0x0472b000 nid=0x1433 waiting for monitor entry [0x7f272e40d000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at sun.misc.URLClassPath.getNextLoader(URLClassPath.java:469)
>         - locked <0x00048007a180> (a sun.misc.URLClassPath)
>         at sun.misc.URLClassPath.findResource(URLClassPath.java:214)
>         at java.net.URLClassLoader$2.run(URLClassLoader.java:569)
>         at java.net.URLClassLoader$2.run(URLClassLoader.java:567)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findResource(URLClassLoader.java:566)
>         at java.lang.ClassLoader.getResource(ClassLoader.java:1096)
>         at java.lang.ClassLoader.getResource(ClassLoader.java:1091)
>         at org.apache.catalina.loader.WebappClassLoaderBase.getResource(WebappClassLoaderBase.java:1666)
>         at org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:338)
>
> 43 threads waiting to lock <0x00048007a180>
>
> "Query f1f0bbec-a3f7-04b2-1ac6-fd3e03a0232d-4002" #4002 daemon prio=5 os_prio=0
[jira] [Assigned] (KYLIN-3556) Interned string should not be used as lock object
[ https://issues.apache.org/jira/browse/KYLIN-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaige Liu reassigned KYLIN-3556:
-

Assignee: Kaige Liu

> Interned string should not be used as lock object
> -
>
> Key: KYLIN-3556
> URL: https://issues.apache.org/jira/browse/KYLIN-3556
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.5.0
> Reporter: Ted Yu
> Assignee: Kaige Liu
> Priority: Minor
> Fix For: v2.5.1
>
> In JDBCResourceDAO :
> {code}
> public void execute(Connection connection) throws SQLException {
> synchronized (resPath.intern()) {
> {code}
> Locking on an interned string can cause unexpected locking collisions with other parts of the code.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
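String.intern() returns a JVM-wide canonical instance, so any other code that synchronizes on an equal interned string or string literal ends up sharing the same monitor. One common alternative is a lock registry private to the owning class. The sketch below is only an illustration of that idea under assumed names (PathLocks, lockFor, withLock); it is not the fix that was applied in Kylin.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-path lock registry; names are illustrative, not Kylin APIs.
class PathLocks {
    // Lock objects live only in this map, so no outside code can synchronize on them.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    // Returns the same lock object for equal paths, regardless of String identity.
    Object lockFor(String resPath) {
        return locks.computeIfAbsent(resPath, key -> new Object());
    }

    // Runs the action while holding the lock that belongs to resPath.
    void withLock(String resPath, Runnable action) {
        synchronized (lockFor(resPath)) {
            action.run();
        }
    }
}
```

A caller would wrap the critical section as `pathLocks.withLock(resPath, () -> { ... })`. Note that this simple map never shrinks; if the set of paths is unbounded, a striped lock or an eviction strategy would be a better fit.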
[jira] [Closed] (KYLIN-3344) Add GUI to support RDBMS data source
[ https://issues.apache.org/jira/browse/KYLIN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu closed KYLIN-3344. Resolution: Duplicate > Add GUI to support RDBMS data source > > > Key: KYLIN-3344 > URL: https://issues.apache.org/jira/browse/KYLIN-3344 > Project: Kylin > Issue Type: Bug > Components: Web >Reporter: Kaige Liu >Assignee: Zhixiong Chen >Priority: Minor > > Since Kylin has already supported RDBMS as data source, it should add web GUI > to load RDBMS tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3344) Add GUI to support RDBMS data source
[ https://issues.apache.org/jira/browse/KYLIN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443883#comment-16443883 ]

Kaige Liu commented on KYLIN-3344:
---

Duplicate of KYLIN-3343

> Add GUI to support RDBMS data source
> 
>
> Key: KYLIN-3344
> URL: https://issues.apache.org/jira/browse/KYLIN-3344
> Project: Kylin
> Issue Type: Bug
> Components: Web
> Reporter: Kaige Liu
> Assignee: Zhixiong Chen
> Priority: Minor
>
> Since Kylin already supports RDBMS as a data source, it should add a web GUI to load RDBMS tables.
[jira] [Created] (KYLIN-3344) Add GUI to support RDBMS data source
Kaige Liu created KYLIN-3344: - Summary: Add GUI to support RDBMS data source Key: KYLIN-3344 URL: https://issues.apache.org/jira/browse/KYLIN-3344 Project: Kylin Issue Type: Bug Components: Web Reporter: Kaige Liu Assignee: Zhixiong Chen Since Kylin has already supported RDBMS as data source, it should add web GUI to load RDBMS tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-3259) When a cube is deleted, remove it from the hybrid cube definition
[ https://issues.apache.org/jira/browse/KYLIN-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3259: - Assignee: Kaige Liu > When a cube is deleted, remove it from the hybrid cube definition > - > > Key: KYLIN-3259 > URL: https://issues.apache.org/jira/browse/KYLIN-3259 > Project: Kylin > Issue Type: Improvement > Components: Metadata >Affects Versions: v2.2.0 > Environment: HDP 2.5.6, Kylin 2.2 >Reporter: Vsevolod Ostapenko >Assignee: Kaige Liu >Priority: Major > > When a cube is deleted, its references are not automatically removed from > existing hybrid cube definition. That can lead to errors down the road, if > user (or application) retrieves the list of cubes via REST API call and later > tries to update the hybrid cube. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KYLIN-3187) JDK APIs using the default locale, time zone or character set should be avoided
[ https://issues.apache.org/jira/browse/KYLIN-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaige Liu reassigned KYLIN-3187:
-

Assignee: Kaige Liu

> JDK APIs using the default locale, time zone or character set should be avoided
> ---
>
> Key: KYLIN-3187
> URL: https://issues.apache.org/jira/browse/KYLIN-3187
> Project: Kylin
> Issue Type: Bug
> Components: REST Service
> Reporter: Ted Yu
> Assignee: Kaige Liu
> Priority: Major
> Labels: usability
>
> Here are a few examples:
> {code}
> server-base/src/main/java/org/apache/kylin/rest/service/JobService.java: Calendar calendar = Calendar.getInstance();
> storage-hbase/src/main/java/org/apache/kylin/storage/hbase/util/HbaseStreamingInput.java: Calendar cal = Calendar.getInstance();
> {code}
> Locale should be specified.
> See CALCITE-1667 for related information.
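As a rough illustration of what KYLIN-3187 asks for, the sketch below passes the time zone and locale explicitly instead of relying on JVM defaults. The class ExplicitCalendar and the choice of UTC and Locale.ROOT are assumptions made for this example; the right zone and locale for JobService or HbaseStreamingInput depend on what those call sites intend.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, for illustration only (not Kylin code).
class ExplicitCalendar {

    // Builds a calendar whose fields do not depend on the host's default zone or locale.
    static Calendar utcCalendar(long epochMillis) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.setTimeInMillis(epochMillis);
        return cal;
    }

    // Case conversion is locale-sensitive too; Locale.ROOT avoids surprises
    // such as the Turkish dotless i.
    static String lower(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Calendar epoch = utcCalendar(0L);
        System.out.println(epoch.get(Calendar.YEAR));        // prints 1970
        System.out.println(epoch.get(Calendar.HOUR_OF_DAY)); // prints 0
        System.out.println(lower("TITLE"));                  // prints "title"
    }
}
```

With the no-argument Calendar.getInstance(), the hour printed for the same instant would vary with the server's configured time zone.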
[jira] [Commented] (KYLIN-3135) Fix regular expression bug in SQL comments
[ https://issues.apache.org/jira/browse/KYLIN-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359768#comment-16359768 ]

Kaige Liu commented on KYLIN-3135:
---

As commented above, I will revert the change in [https://github.com/apache/kylin/commit/7d5fb855064e2b81cd3b154cdeeafec4e64f63c9]

Please review the patch [~roger.shi] [~lidong_sjtu]. Thanks.

> Fix regular expression bug in SQL comments
> --
>
> Key: KYLIN-3135
> URL: https://issues.apache.org/jira/browse/KYLIN-3135
> Project: Kylin
> Issue Type: Bug
> Reporter: hahayuan
> Assignee: hahayuan
> Priority: Major
> Fix For: v2.3.0
> Attachments: 0001-KYLIN-3135.patch, 0002-KYLIN-3135-fix-regEx.patch, multi_line_comments.PNG, one_line_comments.PNG
>
> Hi, all.
> Recently I was testing the query function of Kylin. Sometimes I just comment out SQL with /**/ instead of deleting it, because I need to query and compare again.
> I was confused that the result said "No Support Sql", although the query succeeds without the comments.
> For example,
> {code:java}
> /*
> select count(*) from kylin_sales;
> */
> select * from kylin_sales;
> {code}
> So I looked at the code and found that the commentPatterns entry for /**/ was
> {code:java}
> /\\*[^\\*/]*
> {code}
> which is clearly wrong.
> The regular expression [abc] means any single character in abc, such as a or b.
> So [^\\*/] means that neither * nor / may appear.
> But here */ needs to be treated as a two-character sequence, not as separate characters: what must not appear is the string */, not * or / on its own.
> I rewrote the regular expression as
> {code:java}
> /\\*[\\s\\S]*?\\*/
> {code}
> If you think it's necessary to change the old code, please review and replace it.
> Thanks for your time.
[jira] [Updated] (KYLIN-3135) Fix regular expression bug in SQL comments
[ https://issues.apache.org/jira/browse/KYLIN-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3135: - Attachment: 0002-KYLIN-3135-fix-regEx.patch > Fix regular expression bug in SQL comments > -- > > Key: KYLIN-3135 > URL: https://issues.apache.org/jira/browse/KYLIN-3135 > Project: Kylin > Issue Type: Bug >Reporter: hahayuan >Assignee: hahayuan >Priority: Major > Fix For: v2.3.0 > > Attachments: 0001-KYLIN-3135.patch, 0002-KYLIN-3135-fix-regEx.patch, > multi_line_comments.PNG, one_line_comments.PNG > > > Hi,all. > Recently,I was testing query function of kylin, > sometimes I just comment with /**/ instead of delete the sql,cause I need to > query and compare again. > And I was confused that the results says it was "No Support Sql",but it can > query success without comments. > For example, > {code:java} > /* > select count(*) from kylin_sales; > */ > select * from kylin_sales; > {code} > So I view the code and find the commentPatterns of /\**/ was > {code:java} > /\\*[^\\*/]* > {code} > ,clearly it was wrong. > The regular expression of [abc] means any character in abc,such as a or b. > So the [^\\*/] means that * or / can't appear, > But under this circumstances the */ need to be as a string not separated > character. > the */ can't appear not * or / can't appear. > I rewrite the regular expression, > {code:java} > /\\*[\\s\\S]*?\\*/ > {code} > if you think it's necessary to change the old code,please review and replace > it. > Thank for you time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2999: - Attachment: KYLIN-2999-fix-ut.patch > One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > Fix For: v2.3.0 > > Attachments: KYLIN-2999-fix-cube-automigration-1.patch, > KYLIN-2999-fix-ut.patch, KYLIN-2999.patch > > > Currently, the cube migration must be done by Kylin Admin, which will waste > a lot of time for Kylin Admin. So, we should allow use to migrate cube by one > click in web. Of Course, which is configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KYLIN-3135) Fix regular expression bug in SQL comments
[ https://issues.apache.org/jira/browse/KYLIN-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346156#comment-16346156 ] Kaige Liu edited comment on KYLIN-3135 at 1/31/18 2:34 AM: Hi [~hahayuan], you are right. The commit is not correct according to [https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#predef] ||Predefined character classes|| |{{.}}|Any character (may or may not match [line terminators|https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#lt])| |{{\s}}|A whitespace character: {{[ \t\n\x0B\f\r]}}| |{{\S}}|A non-whitespace character: {{[^\s]}}| So *"/\\*.*?*/"* won't match multiple lines comment. was (Author: liukaige): Hi [~hahayuan], you are right. The commit is not correct according to [https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#predef] ||Predefined character classes|| |{{.}}|Any character (may or may not match [line terminators|https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#lt])| |{{\s}}|A whitespace character: {{[ \t\n\x0B\f\r]}}| |{{\S}}|A non-whitespace character: {{[^\s]}}| So "/\\*.*?\\*/" won't match multiple lines comment. > Fix regular expression bug in SQL comments > -- > > Key: KYLIN-3135 > URL: https://issues.apache.org/jira/browse/KYLIN-3135 > Project: Kylin > Issue Type: Bug >Reporter: hahayuan >Assignee: hahayuan >Priority: Major > Fix For: v2.3.0 > > Attachments: 0001-KYLIN-3135.patch, multi_line_comments.PNG, > one_line_comments.PNG > > > Hi,all. > Recently,I was testing query function of kylin, > sometimes I just comment with /**/ instead of delete the sql,cause I need to > query and compare again. > And I was confused that the results says it was "No Support Sql",but it can > query success without comments. 
> For example, > {code:java} > /* > select count(*) from kylin_sales; > */ > select * from kylin_sales; > {code} > So I view the code and find the commentPatterns of /\**/ was > {code:java} > /\\*[^\\*/]* > {code} > ,clearly it was wrong. > The regular expression of [abc] means any character in abc,such as a or b. > So the [^\\*/] means that * or / can't appear, > But under this circumstances the */ need to be as a string not separated > character. > the */ can't appear not * or / can't appear. > I rewrite the regular expression, > {code:java} > /\\*[\\s\\S]*?\\*/ > {code} > if you think it's necessary to change the old code,please review and replace > it. > Thank for you time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
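The three patterns discussed in this thread can be compared side by side. This is a small sketch, not Kylin's actual code: the class name SqlCommentStripDemo and the strip helper are made up for illustration, and only the regular expressions are taken from the messages above.

```java
import java.util.regex.Pattern;

final class SqlCommentStripDemo {
    // Pattern quoted in the issue as the old one: it stops at the first '*' or '/'.
    static final Pattern OLD = Pattern.compile("/\\*[^\\*/]*");
    // '.' does not match line terminators by default, so multi-line comments are missed.
    static final Pattern DOT = Pattern.compile("/\\*.*?\\*/");
    // Pattern proposed in the issue: [\s\S] matches any character, including newlines.
    static final Pattern FIXED = Pattern.compile("/\\*[\\s\\S]*?\\*/");

    // Hypothetical helper: remove every match of the pattern and trim the result.
    static String strip(Pattern p, String sql) {
        return p.matcher(sql).replaceAll("").trim();
    }

    public static void main(String[] args) {
        String sql = "/*\nselect count(*) from kylin_sales;\n*/\nselect * from kylin_sales;";
        System.out.println(strip(OLD, sql));   // leaves a broken remainder of the comment
        System.out.println(strip(DOT, sql));   // leaves the input unchanged
        System.out.println(strip(FIXED, sql)); // leaves only the real query
    }
}
```

Compiling the second pattern with Pattern.DOTALL would also make '.' match line terminators, which is an alternative to the [\s\S] form.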
[jira] [Commented] (KYLIN-3135) Fix regular expression bug in SQL comments
[ https://issues.apache.org/jira/browse/KYLIN-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346156#comment-16346156 ] Kaige Liu commented on KYLIN-3135: --- Hi [~hahayuan], you are right. The commit is not correct according to [https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#predef]
||Predefined character classes||
|{{.}}|Any character (may or may not match [line terminators|https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#lt])|
|{{\s}}|A whitespace character: {{[ \t\n\x0B\f\r]}}|
|{{\S}}|A non-whitespace character: {{[^\s]}}|
So "/\\*.*?\\*/" won't match a multi-line comment.
> Fix regular expression bug in SQL comments > -- > > Key: KYLIN-3135 > URL: https://issues.apache.org/jira/browse/KYLIN-3135 > Project: Kylin > Issue Type: Bug >Reporter: hahayuan >Assignee: hahayuan >Priority: Major > Fix For: v2.3.0 > > Attachments: 0001-KYLIN-3135.patch, multi_line_comments.PNG, > one_line_comments.PNG > > > Hi,all. > Recently,I was testing query function of kylin, > sometimes I just comment with /**/ instead of delete the sql,cause I need to > query and compare again. > And I was confused that the results says it was "No Support Sql",but it can > query success without comments. > For example, > {code:java} > /* > select count(*) from kylin_sales; > */ > select * from kylin_sales; > {code} > So I view the code and find the commentPatterns of /\**/ was > {code:java} > /\\*[^\\*/]* > {code} > ,clearly it was wrong. > The regular expression of [abc] means any character in abc,such as a or b. > So the [^\\*/] means that * or / can't appear, > But under this circumstances the */ need to be as a string not separated > character. > the */ can't appear not * or / can't appear. > I rewrite the regular expression, > {code:java} > /\\*[\\s\\S]*?\\*/ > {code} > if you think it's necessary to change the old code,please review and replace > it. > Thank for you time.
[jira] [Updated] (KYLIN-3204) Potentially unclosed resources in JdbcExplorer#evalQueryMetadata
[ https://issues.apache.org/jira/browse/KYLIN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3204: - Attachment: KYLIN-3204-fix-potentially-unclosed-resources.patch > Potentially unclosed resources in JdbcExplorer#evalQueryMetadata > > > Key: KYLIN-3204 > URL: https://issues.apache.org/jira/browse/KYLIN-3204 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Priority: Major > Attachments: KYLIN-3204-fix-potentially-unclosed-resources.patch > > > {code} > Connection con = SqlUtil.getConnection(dbconf); > DatabaseMetaData dbmd = con.getMetaData(); > ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); > {code} > con and rs should be closed upon return even if there is exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
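To illustrate "closed upon return even if there is exception": a sketch using try-with-resources, which is one common way to get that guarantee. It is not the attached patch. TrackedResource is a made-up stand-in so the example runs without a database; java.sql.Connection and java.sql.ResultSet are AutoCloseable as well, so the same shape applies to con and rs.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOnExceptionDemo {
    // Stand-in for Connection / ResultSet: records when it is closed.
    static final class TrackedResource implements AutoCloseable {
        final String name;
        final List<String> log;

        TrackedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add("closed " + name);
        }
    }

    // Simulates reading metadata; throws midway when failMidway is true.
    static List<String> readMetadata(boolean failMidway) {
        List<String> log = new ArrayList<>();
        try (TrackedResource con = new TrackedResource("con", log);
             TrackedResource rs = new TrackedResource("rs", log)) {
            if (failMidway) {
                throw new IllegalStateException("simulated failure while extracting columns");
            }
            log.add("extracted columns");
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time this block runs.
            log.add("caught " + e.getClass().getSimpleName());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(readMetadata(false));
        System.out.println(readMetadata(true));
    }
}
```

Resources are closed in reverse order of declaration (rs first, then con) on both the normal and the exceptional path, whereas close calls placed after the last statement are skipped when an earlier statement throws.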
[jira] [Assigned] (KYLIN-3204) Potentially unclosed resources in JdbcExplorer#evalQueryMetadata
[ https://issues.apache.org/jira/browse/KYLIN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3204: - Assignee: Kaige Liu > Potentially unclosed resources in JdbcExplorer#evalQueryMetadata > > > Key: KYLIN-3204 > URL: https://issues.apache.org/jira/browse/KYLIN-3204 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Major > Attachments: KYLIN-3204-fix-potentially-unclosed-resources.patch > > > {code} > Connection con = SqlUtil.getConnection(dbconf); > DatabaseMetaData dbmd = con.getMetaData(); > ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); > {code} > con and rs should be closed upon return even if there is exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3204) Potentially unclosed resources in JdbcExplorer#evalQueryMetadata
[ https://issues.apache.org/jira/browse/KYLIN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344716#comment-16344716 ] Kaige Liu commented on KYLIN-3204: --- Patch attached. [~lidong_sjtu], please help review. Thanks. > Potentially unclosed resources in JdbcExplorer#evalQueryMetadata > > > Key: KYLIN-3204 > URL: https://issues.apache.org/jira/browse/KYLIN-3204 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Priority: Major > Attachments: KYLIN-3204-fix-potentially-unclosed-resources.patch > > > {code} > Connection con = SqlUtil.getConnection(dbconf); > DatabaseMetaData dbmd = con.getMetaData(); > ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); > {code} > con and rs should be closed upon return even if there is exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344608#comment-16344608 ] Kaige Liu commented on KYLIN-2999: --- Hi [~kangkaisen] [~yimingliu], I also found that auto-migration does not work as expected. I have submitted a patch to make it work, but it only fixes the blocking issue. Some defects still need to be addressed:
# If the dest env is completely fresh, the migration fails, because the HDFS folder (/$KYLIN_HOME/$metadata) is not created until a job is triggered.
# The job runs in blocking mode; maybe we can add it to the job scheduler.
Sample configuration:
kylin.tool.auto-migrate-cube.enabled=true
kylin.tool.auto-migrate-cube.src-config=/opt/kylintest/apache-kylin-2.3.0-SNAPSHOT-bin/conf/kylin.properties
kylin.tool.auto-migrate-cube.dest-config=/tmp/kylin.properties
(Both configuration files above must be named *kylin.properties*.)
Prerequisite: the dest env must already have a project with the same name as the source cube's project.
Please review the patch. Thanks.
> One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > Fix For: v2.3.0 > > Attachments: KYLIN-2999-fix-cube-automigration-1.patch, > KYLIN-2999.patch > > > Currently, the cube migration must be done by Kylin Admin, which will waste > a lot of time for Kylin Admin. So, we should allow use to migrate cube by one > click in web. Of Course, which is configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344608#comment-16344608 ] Kaige Liu edited comment on KYLIN-2999 at 1/30/18 7:23 AM: Hi [~kangkaisen] [~yimingliu], I also found auto-migration not working as expected. I have submitted a patch to make it work. But it only fixes the blocking issue. There are still some defects need to be improved. # If the dest env is totally fresh, the migration will fail, because the hdfs folder(/$KYLIN_HOME/$metadata) is not created until any jobs is triggered. # The job is running in blocking mode, maybe we can add it to job scheduler. Sample configuration: kylin.tool.auto-migrate-cube.enabled=true kylin.tool.auto-migrate-cube.src-config=/opt/kylintest/apache-kylin-2.3.0-SNAPSHOT-bin/conf/kylin.properties kylin.tool.auto-migrate-cube.dest-config=/tmp/kylin.properties (Above configuration files must be named as *kylin.properties*) Prerequisite: Must have a same project with source cube in dest env. Please review the patch. Thanks. was (Author: liukaige): Hi [~kangkaisen] [~yimingliu], I also found auto-migration not working as expected. I have submitted a patch to make it work. But it only fixes the blocking issue. There are still some defects need to be improved. # If the dest env is totally fresh, the migration will fail, because the hdfs folder(/$KYLIN_HOME/$metadata) is not created until any jobs is triggered. # The job is running in blocking model, maybe we can add it to job scheduler. Sample configuration: kylin.tool.auto-migrate-cube.enabled=true kylin.tool.auto-migrate-cube.src-config=/opt/kylintest/apache-kylin-2.3.0-SNAPSHOT-bin/conf/kylin.properties kylin.tool.auto-migrate-cube.dest-config=/tmp/kylin.properties (Above configuration files must be named as *kylin.properties*) Prerequisite: Must have a same project with source cube in dest env. Please review the patch. Thanks. 
> One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > Fix For: v2.3.0 > > Attachments: KYLIN-2999-fix-cube-automigration-1.patch, > KYLIN-2999.patch > > > Currently, the cube migration must be done by Kylin Admin, which will waste > a lot of time for Kylin Admin. So, we should allow use to migrate cube by one > click in web. Of Course, which is configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-2999) One click migrate cube in web
[ https://issues.apache.org/jira/browse/KYLIN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2999: - Attachment: KYLIN-2999-fix-cube-automigration-1.patch > One click migrate cube in web > - > > Key: KYLIN-2999 > URL: https://issues.apache.org/jira/browse/KYLIN-2999 > Project: Kylin > Issue Type: New Feature > Components: Tools, Build and Test, Web >Reporter: kangkaisen >Assignee: kangkaisen >Priority: Major > Fix For: v2.3.0 > > Attachments: KYLIN-2999-fix-cube-automigration-1.patch, > KYLIN-2999.patch > > > Currently, the cube migration must be done by Kylin Admin, which will waste > a lot of time for Kylin Admin. So, we should allow use to migrate cube by one > click in web. Of Course, which is configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3201) java.lang.RuntimeException: native lz4 library not available
[ https://issues.apache.org/jira/browse/KYLIN-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342919#comment-16342919 ] Kaige Liu commented on KYLIN-3201: --- Try this: Add "*kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=2.6.2.3-1 -Djava.library.path=/usr/hdp/current/hadoop-client/lib/native/*" in kylin.properties. The above example is for HDP 2.6. Modify it according to your Hadoop distribution. > java.lang.RuntimeException: native lz4 library not available > > > Key: KYLIN-3201 > URL: https://issues.apache.org/jira/browse/KYLIN-3201 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.1.0 >Reporter: Keith Chen >Priority: Critical > > When i build cube with spark , the job was failed. > It report some exceptions about lz4, but i do not set lz4 properties. > > 18/01/26 13:18:40 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 > (TID 0, executor 1): java.lang.RuntimeException: native lz4 library not > available > at > org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195) > at > org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176) > at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983) > at > org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878) > at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827) > at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841) > at > org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49) > at > org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64) > at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:252) > at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:251) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at
org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) > at 
org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637) > Caused by: org.apache.spark.SparkException: Job aborted due to stage > failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task > 2.3 in stage 0.0 (TID 9, executor 1): java.lang.RuntimeException: native lz4 > library not available > at >
[jira] [Comment Edited] (KYLIN-3204) Unclosed resources in JdbcExplorer#evalQueryMetadata
[ https://issues.apache.org/jira/browse/KYLIN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341979#comment-16341979 ] Kaige Liu edited comment on KYLIN-3204 at 1/27/18 5:39 AM: Hi [~yuzhih...@gmail.com], I think con and rs will be closed properly. {quote}DatabaseMetaData dbmd = con.getMetaData(); ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); ColumnDesc[] result = extractColumnFromMeta(rs); *DBUtils.closeQuietly(rs);* *DBUtils.closeQuietly(con);* {quote} was (Author: liukaige): Hi [~yuzhih...@gmail.com], I think con and rs will be close properly. {quote}DatabaseMetaData dbmd = con.getMetaData(); ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); ColumnDesc[] result = extractColumnFromMeta(rs); *DBUtils.closeQuietly(rs);* *DBUtils.closeQuietly(con);* {quote} > Unclosed resources in JdbcExplorer#evalQueryMetadata > > > Key: KYLIN-3204 > URL: https://issues.apache.org/jira/browse/KYLIN-3204 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Priority: Major > > {code} > Connection con = SqlUtil.getConnection(dbconf); > DatabaseMetaData dbmd = con.getMetaData(); > ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); > {code} > con and rs should be closed upon return. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3204) Unclosed resources in JdbcExplorer#evalQueryMetadata
[ https://issues.apache.org/jira/browse/KYLIN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341979#comment-16341979 ] Kaige Liu commented on KYLIN-3204: --- Hi [~yuzhih...@gmail.com], I think con and rs will be close properly. {quote}DatabaseMetaData dbmd = con.getMetaData(); ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); ColumnDesc[] result = extractColumnFromMeta(rs); *DBUtils.closeQuietly(rs);* *DBUtils.closeQuietly(con);* {quote} > Unclosed resources in JdbcExplorer#evalQueryMetadata > > > Key: KYLIN-3204 > URL: https://issues.apache.org/jira/browse/KYLIN-3204 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Priority: Major > > {code} > Connection con = SqlUtil.getConnection(dbconf); > DatabaseMetaData dbmd = con.getMetaData(); > ResultSet rs = dbmd.getColumns(null, tmpDatabase, tmpView, null); > {code} > con and rs should be closed upon return. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3201) java.lang.RuntimeException: native lz4 library not available
[ https://issues.apache.org/jira/browse/KYLIN-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340821#comment-16340821 ] Kaige Liu commented on KYLIN-3201: --- Hi [~keithxchen], what's your Hadoop distribution? > java.lang.RuntimeException: native lz4 library not available > > > Key: KYLIN-3201 > URL: https://issues.apache.org/jira/browse/KYLIN-3201 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.1.0 >Reporter: chenwenhao >Priority: Critical > > When i build cube with spark , the job was failed. > it report some exceptions about lz4, but i do not set lz4 properties. > > 18/01/26 13:18:40 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 > (TID 0, executor 1): java.lang.RuntimeException: native lz4 library not > available > at > org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195) > at > org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176) > at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983) > at > org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1878) > at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1827) > at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1841) > at > org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:49) > at > org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:64) > at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:252) > at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:251) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:211) > at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:102) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at
org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96) > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > > java.lang.RuntimeException: error execute > org.apache.kylin.engine.spark.SparkCubingByLayer > at > org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42) > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637) > Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in > stage 0.0 (TID 9, executor 1): java.lang.RuntimeException: native lz4 library > not available > at > org.apache.hadoop.io.compress.Lz4Codec.getDecompressorType(Lz4Codec.java:195) > at > org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:176) > at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1983) > at >
[jira] [Updated] (KYLIN-2893) Missing zero check for totalHitFrequency in CuboidStats ctor
[ https://issues.apache.org/jira/browse/KYLIN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2893: - Attachment: KYLIN-2893.master.001.patch > Missing zero check for totalHitFrequency in CuboidStats ctor > > > Key: KYLIN-2893 > URL: https://issues.apache.org/jira/browse/KYLIN-2893 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Attachments: KYLIN-2893.master.001.patch > > > {code} > if (hitFrequencyMap.get(cuboid) != null) { > tmpCuboidHitProbabilityMap.put(cuboid, unitUncertainProb > + (1 - WEIGHT_FOR_UN_QUERY) * > hitFrequencyMap.get(cuboid) / totalHitFrequency); > {code} > We should check that totalHitFrequency is not zero before performing division. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-2893) Missing zero check for totalHitFrequency in CuboidStats ctor
[ https://issues.apache.org/jira/browse/KYLIN-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340650#comment-16340650 ] Kaige Liu commented on KYLIN-2893: --- Hi [~lidong_sjtu], would you please help review my patch. Thanks. > Missing zero check for totalHitFrequency in CuboidStats ctor > > > Key: KYLIN-2893 > URL: https://issues.apache.org/jira/browse/KYLIN-2893 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Attachments: KYLIN-2893.master.001.patch > > > {code} > if (hitFrequencyMap.get(cuboid) != null) { > tmpCuboidHitProbabilityMap.put(cuboid, unitUncertainProb > + (1 - WEIGHT_FOR_UN_QUERY) * > hitFrequencyMap.get(cuboid) / totalHitFrequency); > {code} > We should check that totalHitFrequency is not zero before performing division. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
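A sketch of the zero check requested above, written as a standalone helper rather than as the attached patch. The method name, the parameter types and the fallback value are assumptions; the snippet in the issue does not show how CuboidStats declares these fields. Note that in Java an integer division by zero throws ArithmeticException, while a floating-point division by zero silently yields NaN or Infinity, so the guard matters in either case.

```java
final class HitProbabilityDemo {
    // Hypothetical stand-in for the expression quoted from the CuboidStats constructor.
    static double hitProbability(double unitUncertainProb, double weightForUnQuery,
                                 long hitFrequency, long totalHitFrequency) {
        if (totalHitFrequency == 0) {
            // No hits recorded at all: fall back to the baseline probability (assumed policy).
            return unitUncertainProb;
        }
        return unitUncertainProb + (1 - weightForUnQuery) * hitFrequency / totalHitFrequency;
    }

    public static void main(String[] args) {
        System.out.println(hitProbability(0.25, 0.5, 1, 2)); // normal case
        System.out.println(hitProbability(0.25, 0.5, 0, 0)); // guarded case, no NaN
    }
}
```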
[jira] [Commented] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339214#comment-16339214 ] Kaige Liu commented on KYLIN-3011: --- Hi [~yaho], any update? > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3011: - Priority: Minor (was: Critical) > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
[ https://issues.apache.org/jira/browse/KYLIN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3196: - Attachment: (was: KYLIN-3196-replace-containsOnly-1.patch) > Replace StringUtils.containsOnly with Regex > --- > > Key: KYLIN-3196 > URL: https://issues.apache.org/jira/browse/KYLIN-3196 > Project: Kylin > Issue Type: Improvement > Components: REST Service >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3196-replace-containsOnly-1.patch > > > Notice that we use StringUtils.contains to validate project/cube/model names. > It's not high efficiency and elegant. > > I did a small test: > {code:java} > public class TempTest { > Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); > @Test > public void test() { > final char[] VALID_MODELNAME = > "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); > String s1 = "abc@"; > System.out.println("Call StringUtils.containsOnly 100 times"); > long start = System.nanoTime(); > for(int i =0; i<100; ++i) { > StringUtils.containsOnly(s1, VALID_MODELNAME); > } > long end = System.nanoTime(); > System.out.println(end - start); > System.out.println("Call Regex match 100 times"); > start = System.nanoTime(); > for(int i =0; i<100; ++i) { > containsByRegex(s1); > } > end = System.nanoTime(); > System.out.println(end - start); > } > private boolean containsByRegex(final String s) { > Matcher matcher = r.matcher(s); > return matcher.find(); > } > }{code} > The result shows: > {code:java} > Call StringUtils.containsOnly 100 times > 4740997 > Call Regex match 100 times > 753182 > {code} > > Conclusion: > Regex is better than StringUtils.containsOnly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
[ https://issues.apache.org/jira/browse/KYLIN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3196: - Attachment: KYLIN-3196-replace-containsOnly-1.patch > Replace StringUtils.containsOnly with Regex > --- > > Key: KYLIN-3196 > URL: https://issues.apache.org/jira/browse/KYLIN-3196 > Project: Kylin > Issue Type: Improvement > Components: REST Service >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3196-replace-containsOnly-1.patch > > > Notice that we use StringUtils.contains to validate project/cube/model names. > It's not high efficiency and elegant. > > I did a small test: > {code:java} > public class TempTest { > Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); > @Test > public void test() { > final char[] VALID_MODELNAME = > "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); > String s1 = "abc@"; > System.out.println("Call StringUtils.containsOnly 100 times"); > long start = System.nanoTime(); > for(int i =0; i<100; ++i) { > StringUtils.containsOnly(s1, VALID_MODELNAME); > } > long end = System.nanoTime(); > System.out.println(end - start); > System.out.println("Call Regex match 100 times"); > start = System.nanoTime(); > for(int i =0; i<100; ++i) { > containsByRegex(s1); > } > end = System.nanoTime(); > System.out.println(end - start); > } > private boolean containsByRegex(final String s) { > Matcher matcher = r.matcher(s); > return matcher.find(); > } > }{code} > The result shows: > {code:java} > Call StringUtils.containsOnly 100 times > 4740997 > Call Regex match 100 times > 753182 > {code} > > Conclusion: > Regex is better than StringUtils.containsOnly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
[ https://issues.apache.org/jira/browse/KYLIN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339173#comment-16339173 ] Kaige Liu commented on KYLIN-3196: --- Patch attached. [~lidong_sjtu] Please help review. Thanks. > Replace StringUtils.containsOnly with Regex > --- > > Key: KYLIN-3196 > URL: https://issues.apache.org/jira/browse/KYLIN-3196 > Project: Kylin > Issue Type: Improvement > Components: REST Service >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3196-replace-containsOnly-1.patch > > > Notice that we use StringUtils.contains to validate project/cube/model names. > It's not high efficiency and elegant. > > I did a small test: > {code:java} > public class TempTest { > Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); > @Test > public void test() { > final char[] VALID_MODELNAME = > "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); > String s1 = "abc@"; > System.out.println("Call StringUtils.containsOnly 100 times"); > long start = System.nanoTime(); > for(int i =0; i<100; ++i) { > StringUtils.containsOnly(s1, VALID_MODELNAME); > } > long end = System.nanoTime(); > System.out.println(end - start); > System.out.println("Call Regex match 100 times"); > start = System.nanoTime(); > for(int i =0; i<100; ++i) { > containsByRegex(s1); > } > end = System.nanoTime(); > System.out.println(end - start); > } > private boolean containsByRegex(final String s) { > Matcher matcher = r.matcher(s); > return matcher.find(); > } > }{code} > The result shows: > {code:java} > Call StringUtils.containsOnly 100 times > 4740997 > Call Regex match 100 times > 753182 > {code} > > Conclusion: > Regex is better than StringUtils.containsOnly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
[ https://issues.apache.org/jira/browse/KYLIN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3196: - Attachment: KYLIN-3196-replace-containsOnly-1.patch > Replace StringUtils.containsOnly with Regex > --- > > Key: KYLIN-3196 > URL: https://issues.apache.org/jira/browse/KYLIN-3196 > Project: Kylin > Issue Type: Improvement > Components: REST Service >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3196-replace-containsOnly-1.patch > > > Notice that we use StringUtils.contains to validate project/cube/model names. > It's not high efficiency and elegant. > > I did a small test: > {code:java} > public class TempTest { > Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); > @Test > public void test() { > final char[] VALID_MODELNAME = > "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); > String s1 = "abc@"; > System.out.println("Call StringUtils.containsOnly 100 times"); > long start = System.nanoTime(); > for(int i =0; i<100; ++i) { > StringUtils.containsOnly(s1, VALID_MODELNAME); > } > long end = System.nanoTime(); > System.out.println(end - start); > System.out.println("Call Regex match 100 times"); > start = System.nanoTime(); > for(int i =0; i<100; ++i) { > containsByRegex(s1); > } > end = System.nanoTime(); > System.out.println(end - start); > } > private boolean containsByRegex(final String s) { > Matcher matcher = r.matcher(s); > return matcher.find(); > } > }{code} > The result shows: > {code:java} > Call StringUtils.containsOnly 100 times > 4740997 > Call Regex match 100 times > 753182 > {code} > > Conclusion: > Regex is better than StringUtils.containsOnly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
[ https://issues.apache.org/jira/browse/KYLIN-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3196: - Description: Notice that we use StringUtils.contains to validate project/cube/model names. It's not high efficiency and elegant. I did a small test: {code:java} public class TempTest { Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); @Test public void test() { final char[] VALID_MODELNAME = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); String s1 = "abc@"; System.out.println("Call StringUtils.containsOnly 100 times"); long start = System.nanoTime(); for(int i =0; i<100; ++i) { StringUtils.containsOnly(s1, VALID_MODELNAME); } long end = System.nanoTime(); System.out.println(end - start); System.out.println("Call Regex match 100 times"); start = System.nanoTime(); for(int i =0; i<100; ++i) { containsByRegex(s1); } end = System.nanoTime(); System.out.println(end - start); } private boolean containsByRegex(final String s) { Matcher matcher = r.matcher(s); return matcher.find(); } }{code} The result shows: {code:java} Call StringUtils.containsOnly 100 times 4740997 Call Regex match 100 times 753182 {code} Conclusion: Regex is better than StringUtils.containsOnly was: Notice that we use StringUtils.contains to validate project/cube/model names. It's not high efficiency and elegant. 
I did a small test: {code:java} public class TempTest { Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); @Test public void test() { final char[] VALID_MODELNAME = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); String s1 = "abc@"; System.out.println("Call StringUtils.containsOnly 100 times"); long start = System.nanoTime(); for(int i =0; i<100; ++i) { StringUtils.containsOnly(s1); } long end = System.nanoTime(); System.out.println(end - start); System.out.println("Call Regex match 100 times"); start = System.nanoTime(); for(int i =0; i<100; ++i) { containsByRegex(s1); } end = System.nanoTime(); System.out.println(end - start); } private boolean containsByRegex(final String s) { Matcher matcher = r.matcher(s); return matcher.find(); } }{code} The result shows: {code:java} Call StringUtils.containsOnly 100 times 4740997 Call Regex match 100 times 753182 {code} Conclusion: Regex is better than StringUtils.containsOnly > Replace StringUtils.containsOnly with Regex > --- > > Key: KYLIN-3196 > URL: https://issues.apache.org/jira/browse/KYLIN-3196 > Project: Kylin > Issue Type: Bug > Components: REST Service >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > > Notice that we use StringUtils.contains to validate project/cube/model names. > It's not high efficiency and elegant. 
> > I did a small test: > {code:java} > public class TempTest { > Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$"); > @Test > public void test() { > final char[] VALID_MODELNAME = > "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray(); > String s1 = "abc@"; > System.out.println("Call StringUtils.containsOnly 100 times"); > long start = System.nanoTime(); > for(int i =0; i<100; ++i) { > StringUtils.containsOnly(s1, VALID_MODELNAME); > } > long end = System.nanoTime(); > System.out.println(end - start); > System.out.println("Call Regex match 100 times"); > start = System.nanoTime(); > for(int i =0; i<100; ++i) { > containsByRegex(s1); > } > end = System.nanoTime(); > System.out.println(end - start); > } > private boolean containsByRegex(final String s) { > Matcher matcher = r.matcher(s); > return matcher.find(); > } > }{code} > The result shows: > {code:java} > Call StringUtils.containsOnly 100 times > 4740997 > Call Regex match 100 times > 753182 > {code} > > Conclusion: > Regex is better than StringUtils.containsOnly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KYLIN-3196) Replace StringUtils.containsOnly with Regex
Kaige Liu created KYLIN-3196: - Summary: Replace StringUtils.containsOnly with Regex Key: KYLIN-3196 URL: https://issues.apache.org/jira/browse/KYLIN-3196 Project: Kylin Issue Type: Bug Components: REST Service Reporter: Kaige Liu Assignee: Kaige Liu Fix For: v2.3.0

We currently use StringUtils.containsOnly to validate project/cube/model names. It is neither efficient nor elegant.

I did a small test:
{code:java}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Commons Lang 3 and JUnit 4 are assumed; the original test omits its imports.
import org.apache.commons.lang3.StringUtils;
import org.junit.Test;

public class TempTest {
    // Compiled once and reused by every containsByRegex call.
    Pattern r = Pattern.compile("^[a-zA-Z0-9_]*$");

    @Test
    public void test() {
        final char[] VALID_MODELNAME = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890_".toCharArray();
        String s1 = "abc@";

        System.out.println("Call StringUtils.containsOnly 100 times");
        long start = System.nanoTime();
        for (int i = 0; i < 100; ++i) {
            StringUtils.containsOnly(s1, VALID_MODELNAME);
        }
        long end = System.nanoTime();
        System.out.println(end - start);

        System.out.println("Call Regex match 100 times");
        start = System.nanoTime();
        for (int i = 0; i < 100; ++i) {
            containsByRegex(s1);
        }
        end = System.nanoTime();
        System.out.println(end - start);
    }

    // Returns true when s consists only of letters, digits, and underscores.
    private boolean containsByRegex(final String s) {
        Matcher matcher = r.matcher(s);
        return matcher.find();
    }
}
{code}
The result shows (elapsed nanoseconds):
{code:java}
Call StringUtils.containsOnly 100 times
4740997
Call Regex match 100 times
753182
{code}
Conclusion: in this test, the regex match was faster than StringUtils.containsOnly.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
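The regex-based validation proposed in KYLIN-3196 could look roughly like the sketch below. It is an illustration only, not the attached patch: the class name `NameValidator` and the method `isValidName` are hypothetical, and the pattern uses `+` rather than the `*` from the issue's test so that an empty name is rejected, which is an assumption about the desired behaviour.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not taken from the Kylin code base. */
final class NameValidator {
    // Compiled once and shared: Pattern instances are immutable and thread-safe.
    private static final Pattern VALID_NAME = Pattern.compile("[a-zA-Z0-9_]+");

    private NameValidator() {
    }

    /** Returns true when the whole name consists of letters, digits, or underscores. */
    static boolean isValidName(String name) {
        // matches() tests the entire input, so no ^ or $ anchors are needed.
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("sales_cube_01"));
        System.out.println(isValidName("abc@"));
    }
}
```

Note that a 100-iteration loop without JVM warm-up, as in the test quoted in the issue, mostly measures class loading and interpretation; a harness such as JMH would give more trustworthy numbers.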
[jira] [Updated] (KYLIN-3193) Prevent users cloning models across projects
[ https://issues.apache.org/jira/browse/KYLIN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3193: - Attachment: (was: KYLIN-3193-forbit-across-prj-1.patch) > Prevent users cloning models across projects > > > Key: KYLIN-3193 > URL: https://issues.apache.org/jira/browse/KYLIN-3193 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3193-forbit-across-prj-2.patch > > > Data sources and tables are now separated by project in Kylin, so > cloning a model across projects leads to table-not-found errors. We should prevent > users from performing this action. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3193) Prevent users cloning models across projects
[ https://issues.apache.org/jira/browse/KYLIN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338885#comment-16338885 ] Kaige Liu commented on KYLIN-3193: --- Patch updated. > Prevent users cloning models across projects > > > Key: KYLIN-3193 > URL: https://issues.apache.org/jira/browse/KYLIN-3193 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3193-forbit-across-prj-1.patch, > KYLIN-3193-forbit-across-prj-2.patch > > > Data sources and tables are now separated by project in Kylin, so > cloning a model across projects leads to table-not-found errors. We should prevent > users from performing this action. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3193) Prevent users cloning models across projects
[ https://issues.apache.org/jira/browse/KYLIN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3193: - Attachment: KYLIN-3193-forbit-across-prj-2.patch > Prevent users cloning models across projects > > > Key: KYLIN-3193 > URL: https://issues.apache.org/jira/browse/KYLIN-3193 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3193-forbit-across-prj-1.patch, > KYLIN-3193-forbit-across-prj-2.patch > > > Data sources and tables are now separated by project in Kylin, so > cloning a model across projects leads to table-not-found errors. We should prevent > users from performing this action. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KYLIN-3193) Prevent users cloning models across projects
[ https://issues.apache.org/jira/browse/KYLIN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338851#comment-16338851 ] Kaige Liu commented on KYLIN-3193: --- Patch attached. [~lidong_sjtu], please help review. Thanks. > Prevent users cloning models across projects > > > Key: KYLIN-3193 > URL: https://issues.apache.org/jira/browse/KYLIN-3193 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3193-forbit-across-prj-1.patch > > > Data sources and tables are now separated by project in Kylin, so > cloning a model across projects leads to table-not-found errors. We should prevent > users from performing this action. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KYLIN-3193) Prevent users cloning models across projects
[ https://issues.apache.org/jira/browse/KYLIN-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3193: - Attachment: KYLIN-3193-forbit-across-prj-1.patch > Prevent users cloning models across projects > > > Key: KYLIN-3193 > URL: https://issues.apache.org/jira/browse/KYLIN-3193 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.2.0 >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3193-forbit-across-prj-1.patch > > > Data sources and tables are now separated by project in Kylin, so > cloning a model across projects leads to table-not-found errors. We should prevent > users from performing this action. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
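The guard requested in KYLIN-3193 amounts to rejecting a clone whose target project differs from the source project. The sketch below shows one way to express that check; it is not the attached patch. `CloneGuard` and `checkSameProject` are hypothetical names, and comparing project names with a case-sensitive `equals` is an assumption.

```java
/** Hypothetical guard for illustration; not taken from the Kylin code base. */
final class CloneGuard {
    private CloneGuard() {
    }

    /** Throws IllegalArgumentException unless both project names are non-null and equal. */
    static void checkSameProject(String sourceProject, String targetProject) {
        if (sourceProject == null || targetProject == null) {
            throw new IllegalArgumentException("Project name must not be null");
        }
        // Tables are scoped to a project, so a model cloned elsewhere would
        // reference tables that the target project cannot resolve.
        if (!sourceProject.equals(targetProject)) {
            throw new IllegalArgumentException(
                    "Cloning across projects is not supported: " + sourceProject + " -> " + targetProject);
        }
    }
}
```

Calling the check before any metadata is copied keeps a rejected request from leaving a half-created model behind.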
[jira] [Comment Edited] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312634#comment-16312634 ] Kaige Liu edited comment on KYLIN-3011 at 1/5/18 7:43 AM: --- Maybe we could change the prefix of intermediate tables from "_*kylin_intermediate_*_" to *kylin_intermediate_metadataName*, so that we could use *metadataName* to filter intermediate tables. But this would break compatibility when upgrading. Any ideas about this? [~Shaofengshi] [~yaho] was (Author: liukaige): Maybe we could change the prefix of intermediate tables from "_*kylin_intermediate_*_" to _*kylin_intermediate_{metadata}*_, so that we could use {metadata} to filter intermediate tables. But this would break compatibility when upgrading. Any ideas about this? [~Shaofengshi] [~yaho] > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312634#comment-16312634 ] Kaige Liu commented on KYLIN-3011: --- Maybe we could change the prefix of intermediate tables from "_*kylin_intermediate_*_" to _*kylin_intermediate_{metadata}*_, so that we could use {metadata} to filter intermediate tables. But this would break compatibility when upgrading. Any ideas about this? [~Shaofengshi] [~yaho] > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
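The prefix idea from the comment above can be sketched as a simple filter over Hive table names. This is only an illustration of the proposal, not StorageCleanupJob's actual logic: the class `IntermediateTableFilter`, the method `ownedBy`, and the trailing underscore after the metadata name are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter for illustration; not taken from the Kylin code base. */
final class IntermediateTableFilter {
    static final String BASE_PREFIX = "kylin_intermediate_";

    private IntermediateTableFilter() {
    }

    /** Returns the tables whose names start with kylin_intermediate_<metadataName>_. */
    static List<String> ownedBy(String metadataName, List<String> tableNames) {
        String prefix = BASE_PREFIX + metadataName + "_";
        List<String> owned = new ArrayList<>();
        for (String table : tableNames) {
            // Only tables carrying this instance's metadata name are cleanup candidates.
            if (table.startsWith(prefix)) {
                owned.add(table);
            }
        }
        return owned;
    }
}
```

A plain prefix match is ambiguous when one metadata name extends another (for example `prod` and `prod_eu`), so a real implementation would need a separator or encoding that cannot occur inside the name itself.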
[jira] [Commented] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312607#comment-16312607 ] Kaige Liu commented on KYLIN-3011: --- Yes, this happens when multiple Kylin instances use the same Hive database to hold intermediate tables. A workaround is separating intermediate tables of different Kylin instances in different hive databases by specifying _*kylin.source.hive.database-for-flat-table*_. > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-2715) StorageCleanupJob removes intermediate hive tables of jobs in progress
[ https://issues.apache.org/jira/browse/KYLIN-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2715: - Assignee: Kaige Liu > StorageCleanupJob removes intermediate hive tables of jobs in progress > -- > > Key: KYLIN-2715 > URL: https://issues.apache.org/jira/browse/KYLIN-2715 > Project: Kylin > Issue Type: Bug >Affects Versions: v2.0.0 >Reporter: Alexander Sterligov >Assignee: Kaige Liu > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-3011) Tool StorageCleanupJob will cleanup other environment's intermediate hive tables which are using
[ https://issues.apache.org/jira/browse/KYLIN-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3011: - Assignee: Kaige Liu > Tool StorageCleanupJob will cleanup other environment's intermediate hive > tables which are using > > > Key: KYLIN-3011 > URL: https://issues.apache.org/jira/browse/KYLIN-3011 > Project: Kylin > Issue Type: Bug >Reporter: Zhong Yanghong >Assignee: Kaige Liu >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2873) relate to KYLIN-1351, no document for configure rdbms as datasource
[ https://issues.apache.org/jira/browse/KYLIN-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310668#comment-16310668 ] Kaige Liu commented on KYLIN-2873: --- I will write a document on configuring an RDBMS as a data source. > relate to KYLIN-1351, no document for configure rdbms as datasource > --- > > Key: KYLIN-2873 > URL: https://issues.apache.org/jira/browse/KYLIN-2873 > Project: Kylin > Issue Type: Improvement > Components: Documentation >Affects Versions: v2.1.0 >Reporter: Maxy >Assignee: Kaige Liu > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-2873) relate to KYLIN-1351, no document for configure rdbms as datasource
[ https://issues.apache.org/jira/browse/KYLIN-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2873: - Assignee: Kaige Liu > relate to KYLIN-1351, no document for configure rdbms as datasource > --- > > Key: KYLIN-2873 > URL: https://issues.apache.org/jira/browse/KYLIN-2873 > Project: Kylin > Issue Type: Improvement > Components: Documentation >Affects Versions: v2.1.0 >Reporter: Maxy >Assignee: Kaige Liu > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3081) Ineffective null check in CubeController#cuboidsExport
[ https://issues.apache.org/jira/browse/KYLIN-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3081: - Attachment: KYLIN-3081-format-code.patch Fixed code style issue. [~Shaofengshi] please review. Thanks. > Ineffective null check in CubeController#cuboidsExport > -- > > Key: KYLIN-3081 > URL: https://issues.apache.org/jira/browse/KYLIN-3081 > Project: Kylin > Issue Type: Bug > Components: Metadata >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Fix For: v2.3.0 > > Attachments: KYLIN-3081-fix-potential-npe-02.patch, > KYLIN-3081-format-code.patch > > > {code} > if (cuboidList == null || cuboidList.isEmpty()) { > logger.warn("Cannot get recommend cuboid list for cube " + > cubeName); > } > if (cuboidList.size() < top) { > logger.info("Only recommend " + cuboidList.size() + " cuboids > less than topn " + top); > } > {code} > cuboidList.size() may result in NPE because the null check above doesn't have > effect. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
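The snippet quoted in KYLIN-3081 logs a warning when the list is null or empty but then falls through to `cuboidList.size()`. A minimal sketch of a guarded version follows; it is not the attached patch, and to stay self-contained it returns the message instead of logging it. `CuboidListGuard` and `describe` are hypothetical names.

```java
import java.util.List;

/** Hypothetical illustration of the guarded check; not taken from the Kylin code base. */
final class CuboidListGuard {
    private CuboidListGuard() {
    }

    /** Builds the status message without ever calling size() on a null list. */
    static String describe(List<Long> cuboidList, int top, String cubeName) {
        if (cuboidList == null || cuboidList.isEmpty()) {
            // Return early so the size() call below can never see a null list.
            return "Cannot get recommended cuboid list for cube " + cubeName;
        }
        if (cuboidList.size() < top) {
            return "Only recommend " + cuboidList.size() + " cuboids, fewer than topn " + top;
        }
        return "Recommend top " + top + " cuboids";
    }
}
```

The essential change is the early exit: a null check only protects later code if control cannot continue past it.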
[jira] [Updated] (KYLIN-1925) Do not allow cross project clone for cube
[ https://issues.apache.org/jira/browse/KYLIN-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-1925: - Attachment: (was: KYLIN-1925-forbid-clone-acrossprojects.patch) > Do not allow cross project clone for cube > - > > Key: KYLIN-1925 > URL: https://issues.apache.org/jira/browse/KYLIN-1925 > Project: Kylin > Issue Type: Improvement > Components: REST Service, Web >Affects Versions: v1.5.3 >Reporter: Zhong,Jason >Assignee: Kaige Liu > Fix For: v2.3.0 > > Attachments: KYLIN-1925-forbid-clone-acrossprojects-02.patch > > > Currently we should only support cloning a cube within one project; cross-project cloning is > not allowed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-1925) Do not allow cross project clone for cube
[ https://issues.apache.org/jira/browse/KYLIN-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-1925: - Attachment: KYLIN-1925-forbid-clone-acrossprojects-02.patch Thanks for the reminder, [~Shaofengshi]. The patch is already split. > Do not allow cross project clone for cube > - > > Key: KYLIN-1925 > URL: https://issues.apache.org/jira/browse/KYLIN-1925 > Project: Kylin > Issue Type: Improvement > Components: REST Service, Web >Affects Versions: v1.5.3 >Reporter: Zhong,Jason >Assignee: Kaige Liu > Fix For: v2.3.0 > > Attachments: KYLIN-1925-forbid-clone-acrossprojects-02.patch, > KYLIN-1925-forbid-clone-acrossprojects.patch > > > Currently we should only support cloning a cube within one project; cross-project cloning is > not allowed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-1925) Do not allow cross project clone for cube
[ https://issues.apache.org/jira/browse/KYLIN-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-1925: - Attachment: KYLIN-1925-forbid-clone-acrossprojects.patch Patch attached. [~Shaofengshi] Please help review. Thanks. It looks like the patch in KYLIN-3081 still has a formatting issue. I fixed it in this patch as well. > Do not allow cross project clone for cube > - > > Key: KYLIN-1925 > URL: https://issues.apache.org/jira/browse/KYLIN-1925 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v1.5.3 >Reporter: Zhong,Jason >Assignee: Kaige Liu > Fix For: Backlog > > Attachments: KYLIN-1925-forbid-clone-acrossprojects.patch > > > Currently we should only support cloning a cube within one project; cross-project cloning is > not allowed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-1925) Do not allow cross project clone for cube
[ https://issues.apache.org/jira/browse/KYLIN-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-1925: - Assignee: Kaige Liu (was: Zhong,Jason) > Do not allow cross project clone for cube > - > > Key: KYLIN-1925 > URL: https://issues.apache.org/jira/browse/KYLIN-1925 > Project: Kylin > Issue Type: Improvement > Components: Web >Affects Versions: v1.5.3 >Reporter: Zhong,Jason >Assignee: Kaige Liu > Fix For: Backlog > > > Currently we should only support cloning a cube within one project; cross-project cloning is > not allowed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3081) Ineffective null check in CubeController#cuboidsExport
[ https://issues.apache.org/jira/browse/KYLIN-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3081: - Attachment: KYLIN-3081-fix-potential-npe-02.patch Ah...sorry about that. Already updated. Thanks [~Shaofengshi] > Ineffective null check in CubeController#cuboidsExport > -- > > Key: KYLIN-3081 > URL: https://issues.apache.org/jira/browse/KYLIN-3081 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Attachments: KYLIN-3081-fix-potential-npe-02.patch, > KYLIN-3081-fix-potential-npe.patch > > > {code} > if (cuboidList == null || cuboidList.isEmpty()) { > logger.warn("Cannot get recommend cuboid list for cube " + > cubeName); > } > if (cuboidList.size() < top) { > logger.info("Only recommend " + cuboidList.size() + " cuboids > less than topn " + top); > } > {code} > cuboidList.size() may result in NPE because the null check above doesn't have > effect. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3146) Response code and exception should be standardised for cube checking
[ https://issues.apache.org/jira/browse/KYLIN-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309084#comment-16309084 ] Kaige Liu commented on KYLIN-3146: --- I think if we narrow the scope to the REST API, 404 might be a better choice for "cube not found". The cube name is part of the URI, so a 404 error gives the client a clear hint. According to [RFC2616|https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html]: {quote}10.4.1 400 Bad Request The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.{quote} {quote}10.4.5 404 Not Found The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.{quote} > Response code and exception should be standardised for cube checking > --- > > Key: KYLIN-3146 > URL: https://issues.apache.org/jira/browse/KYLIN-3146 > Project: Kylin > Issue Type: Improvement >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > > Checking whether a cube exists is a common step in several APIs, but we > return many different responses for the same situation. > Take CubeController as an example. When a cube cannot be found by its > name, some methods respond with *400*, some return *404*, and > others send back *500*. The exception type is not unified either: in the > same example we can find *IllegalArgumentException*, *BadRequestException*, > and *InternalErrorException*. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
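One way to picture the standardisation discussed in KYLIN-3146 is a single lookup helper that always throws the same exception for a missing cube, plus one place that maps exception types to status codes. The sketch below uses only the JDK and is not Kylin's REST layer: `NotFoundException`, `CubeLookup`, `require`, and `statusFor` are all hypothetical names defined here for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical exception defined for this sketch only. */
class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
        super(message);
    }
}

/** Hypothetical lookup helper; cube descriptors are plain strings to keep the sketch small. */
final class CubeLookup {
    private final Map<String, String> cubes = new HashMap<>();

    void register(String name, String descriptor) {
        cubes.put(name, descriptor);
    }

    /** Every caller goes through this method, so "cube not found" always looks the same. */
    String require(String cubeName) {
        String cube = cubes.get(cubeName);
        if (cube == null) {
            throw new NotFoundException("Cannot find cube " + cubeName);
        }
        return cube;
    }

    /** Single mapping from exception type to HTTP status code. */
    static int statusFor(RuntimeException e) {
        if (e instanceof NotFoundException) {
            return 404; // the resource named in the URI does not exist
        }
        if (e instanceof IllegalArgumentException) {
            return 400; // the request itself is malformed
        }
        return 500; // anything else is treated as a server-side failure
    }
}
```

Centralising both the check and the mapping means a controller method cannot accidentally answer the same condition with 400 in one place and 500 in another.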
[jira] [Closed] (KYLIN-3147) Response code and exception should be standardised for cube checking
[ https://issues.apache.org/jira/browse/KYLIN-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu closed KYLIN-3147. Resolution: Duplicate > Response code and exception should be standardised for cube checking > --- > > Key: KYLIN-3147 > URL: https://issues.apache.org/jira/browse/KYLIN-3147 > Project: Kylin > Issue Type: Improvement >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > > Checking whether a cube exists is a common step in several APIs, but we > return many different responses for the same situation. > Take CubeController as an example. When a cube cannot be found by its > name, some methods respond with *400*, some return *404*, and > others send back *500*. The exception type is not unified either: in the > same example we can find *IllegalArgumentException*, *BadRequestException*, > and *InternalErrorException*. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-3147) Response code and exception should be standardised for cube checking
[ https://issues.apache.org/jira/browse/KYLIN-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16309072#comment-16309072 ] Kaige Liu commented on KYLIN-3147: --- Submitted twice due to a network issue. This issue duplicates KYLIN-3146. > Response code and exception should be standardised for cube checking > --- > > Key: KYLIN-3147 > URL: https://issues.apache.org/jira/browse/KYLIN-3147 > Project: Kylin > Issue Type: Improvement >Reporter: Kaige Liu >Assignee: Kaige Liu >Priority: Minor > > Checking whether a cube exists is a common step in several APIs, but we > return many different responses for the same situation. > Take CubeController as an example. When a cube cannot be found by its > name, some methods respond with *400*, some return *404*, and > others send back *500*. The exception type is not unified either: in the > same example we can find *IllegalArgumentException*, *BadRequestException*, > and *InternalErrorException*. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3146) Response code and exception should be standardised for cube checking
Kaige Liu created KYLIN-3146: - Summary: Response code and exception should be standardised for cube checking Key: KYLIN-3146 URL: https://issues.apache.org/jira/browse/KYLIN-3146 Project: Kylin Issue Type: Improvement Reporter: Kaige Liu Assignee: Kaige Liu Priority: Minor Checking whether a cube exists is a common step in several APIs, but we return many different responses for the same situation. Take CubeController as an example. When a cube cannot be found by its name, some methods respond with *400*, some return *404*, and others send back *500*. The exception type is not unified either: in the same example we can find *IllegalArgumentException*, *BadRequestException*, and *InternalErrorException*. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3147) Response code and exception should be standardised for cube checking
Kaige Liu created KYLIN-3147: - Summary: Response code and exception should be standardised for cube checking Key: KYLIN-3147 URL: https://issues.apache.org/jira/browse/KYLIN-3147 Project: Kylin Issue Type: Improvement Reporter: Kaige Liu Assignee: Kaige Liu Priority: Minor Checking whether a cube exists is a common step in several APIs, but the responses for a missing cube are inconsistent. Take CubeController as an example. When a cube cannot be found by name, some methods return *400* as the response code, some return *404*, and others send back a *500*. The exception type is not unified either: in the same controller we can find *IllegalArgumentException*, *BadRequestException*, and *InternalErrorException*. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3081) Ineffective null check in CubeController#cuboidsExport
[ https://issues.apache.org/jira/browse/KYLIN-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3081: - Attachment: KYLIN-3081-fix-potential-npe.patch Patch attached. [~Shaofengshi] [~lidong_sjtu] please help review. Thanks. > Ineffective null check in CubeController#cuboidsExport > -- > > Key: KYLIN-3081 > URL: https://issues.apache.org/jira/browse/KYLIN-3081 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Attachments: KYLIN-3081-fix-potential-npe.patch > > > {code} > if (cuboidList == null || cuboidList.isEmpty()) { > logger.warn("Cannot get recommend cuboid list for cube " + > cubeName); > } > if (cuboidList.size() < top) { > logger.info("Only recommend " + cuboidList.size() + " cuboids > less than topn " + top); > } > {code} > cuboidList.size() may result in NPE because the null check above doesn't have > effect. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
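The quoted snippet logs a warning when the list is null or empty but then keeps going, so `cuboidList.size()` still runs on a null list. A minimal sketch of the usual repair is to return right after the warning. The attached patch is not shown in this thread, so the class and method below (`CuboidListGuard`, `exportableCount`) are hypothetical and only illustrate the control flow.

```java
import java.util.List;

// Hypothetical stand-in for the check in CubeController#cuboidsExport.
final class CuboidListGuard {

    // Returns how many cuboids can be exported: 0 for a null or empty list,
    // otherwise the smaller of the list size and the requested top-n.
    static int exportableCount(List<Long> cuboidList, int top) {
        if (cuboidList == null || cuboidList.isEmpty()) {
            // The warning would be logged here; returning makes the null check effective.
            return 0;
        }
        // Safe: cuboidList is known to be non-null from here on.
        return Math.min(cuboidList.size(), top);
    }
}
```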
[jira] [Assigned] (KYLIN-3095) Use ArrayDeque instead of LinkedList for queue implementation
[ https://issues.apache.org/jira/browse/KYLIN-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3095: - Assignee: Kaige Liu > Use ArrayDeque instead of LinkedList for queue implementation > - > > Key: KYLIN-3095 > URL: https://issues.apache.org/jira/browse/KYLIN-3095 > Project: Kylin > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > > Use ArrayDeque instead of LinkedList for queue implementation where thread > safety is not needed. > https://docs.oracle.com/javase/8/docs/api/index.html?java/util/ArrayDeque.html > {quote} > Resizable-array implementation of the Deque interface. Array deques have no > capacity restrictions; they grow as necessary to support usage. They are not > thread-safe; in the absence of external synchronization, they do not support > concurrent access by multiple threads. Null elements are prohibited. This > class is likely to be faster than Stack when used as a stack, and *faster > than LinkedList when used as a queue.* > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
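As a small illustration of the suggested change, the sketch below uses `ArrayDeque` behind the `Queue` interface where `LinkedList` might have been used before. The class and method names (`QueueSketch`, `drainInOrder`) are made up for the example and do not refer to any Kylin call site.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class QueueSketch {

    // Drains a FIFO queue backed by ArrayDeque and returns the elements in removal order.
    static List<String> drainInOrder(List<String> items) {
        Queue<String> queue = new ArrayDeque<>(items); // previously: new LinkedList<>(items)
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll()); // poll() removes from the head, same contract as LinkedList
        }
        return out;
    }
}
```

As the quoted Javadoc notes, `ArrayDeque` prohibits null elements and is not thread-safe, so a call site that enqueues null or shares the queue across threads is not a drop-in candidate.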
[jira] [Commented] (KYLIN-3132) Add Custom Date Time Partition
[ https://issues.apache.org/jira/browse/KYLIN-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307611#comment-16307611 ] Kaige Liu commented on KYLIN-3132: --- Thanks Vu for reporting this. I will take this JIRA. > Add Custom Date Time Partition > -- > > Key: KYLIN-3132 > URL: https://issues.apache.org/jira/browse/KYLIN-3132 > Project: Kylin > Issue Type: Task > Components: General >Affects Versions: v2.2.0 > Environment: KYLIN WEB >Reporter: vu thanh dat >Assignee: Kaige Liu > Fix For: v2.2.0 > > Attachments: partition_date.bmp > > > Hi all, > I'm using Kylin and I want to add more custom partition formats for Datetime. > Now, Kylin only has: yyyy-MM-dd, yyyyMMdd and yyyy-MM-dd HH:mm:ss. > How can I add a format such as yyyy-MM-dd-HH to the Kylin open source code? > My date is partitioned by an hour string, e.g. 2017-12-26-13. > Best regards, > Thanks! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-3132) Add Custom Date Time Partition
[ https://issues.apache.org/jira/browse/KYLIN-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-3132: - Assignee: Kaige Liu > Add Custom Date Time Partition > -- > > Key: KYLIN-3132 > URL: https://issues.apache.org/jira/browse/KYLIN-3132 > Project: Kylin > Issue Type: Task > Components: General >Affects Versions: v2.2.0 > Environment: KYLIN WEB >Reporter: vu thanh dat >Assignee: Kaige Liu > Fix For: v2.2.0 > > Attachments: partition_date.bmp > > > Hi all, > I'm using Kylin and I want to add more custom partition formats for Datetime. > Now, Kylin only has: yyyy-MM-dd, yyyyMMdd and yyyy-MM-dd HH:mm:ss. > How can I add a format such as yyyy-MM-dd-HH to the Kylin open source code? > My date is partitioned by an hour string, e.g. 2017-12-26-13. > Best regards, > Thanks! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
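To make the request concrete, the sketch below parses an hour-level partition value like the reporter's sample 2017-12-26-13. The pattern `yyyy-MM-dd-HH` is inferred from that sample, and the class `HourPartitionSketch` is hypothetical; it uses plain `java.time` and says nothing about how Kylin itself registers partition formats.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class HourPartitionSketch {

    // Assumed pattern for hour-level partition strings such as 2017-12-26-13.
    static final DateTimeFormatter HOURLY = DateTimeFormatter.ofPattern("yyyy-MM-dd-HH");

    // Full timestamp pattern, used here only to show the parsed value.
    static final DateTimeFormatter FULL = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Parses the partition string; minutes and seconds default to zero.
    static LocalDateTime parseHourPartition(String value) {
        return LocalDateTime.parse(value, HOURLY);
    }

    // Converts an hour-level partition string to a full timestamp string.
    static String toTimestamp(String value) {
        return parseHourPartition(value).format(FULL);
    }
}
```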
[jira] [Comment Edited] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262514#comment-16262514 ] Kaige Liu edited comment on KYLIN-3044 at 11/22/17 1:32 PM: - Sqoop splits data to a couple of parts and import them parallel. I add a property kylin.source.jdbc.sqoop-mapper-num to specify how many splits should be divided. Sqoop would run a mapper for each split. To make each mapper gets even input, split column is chosen by following some rules: 1. Prefer ClusteredBy column 2. Prefer DistributedBy column 3. Prefer Partition date column 4. Prefer Higher cardinality column 5. Prefer numeric column 6. Pick a column at first glance Patch updated. was (Author: liukaige): Sqoop splits data to a couple of parts and import them parallel. I add a property kylin.source.jdbc.sqoop-mapper-num to specify how many splits should be divided. Sqoop would run a mapper for each split. To make each mapper gets even input, split column is chosen following some rules: 1. Prefer ClusteredBy column 2. Prefer DistributedBy column 3. Prefer Partition date column 4. Prefer Higher cardinality column 5. Prefer numeric column 6. Pick a column at first glance Patch updated. > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3044-sqlserver-as-datasource.patch, > KYLIN-3044-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
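The six rules above read as a priority list: try each rule in order and take the first column that matches. The sketch below is one possible reading, not the code from the patch. The `Column` class and its fields are hypothetical, rule 4 is interpreted as "the column with the highest known cardinality", and rule 6 as "fall back to the first column".

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class SplitColumnSketch {

    // Hypothetical column description; the fields are illustrative, not Kylin model classes.
    static final class Column {
        final String name;
        final boolean clusteredBy;
        final boolean distributedBy;
        final boolean partitionDate;
        final boolean numeric;
        final long cardinality; // 0 means "unknown"

        Column(String name, boolean clusteredBy, boolean distributedBy,
               boolean partitionDate, boolean numeric, long cardinality) {
            this.name = name;
            this.clusteredBy = clusteredBy;
            this.distributedBy = distributedBy;
            this.partitionDate = partitionDate;
            this.numeric = numeric;
            this.cardinality = cardinality;
        }
    }

    private static Optional<Column> first(List<Column> columns, Predicate<Column> rule) {
        return columns.stream().filter(rule).findFirst();
    }

    // Applies the rules in priority order and returns the chosen column name.
    static String chooseSplitColumn(List<Column> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("table has no columns");
        }
        Optional<Column> pick = first(columns, c -> c.clusteredBy);             // rule 1
        if (!pick.isPresent()) pick = first(columns, c -> c.distributedBy);     // rule 2
        if (!pick.isPresent()) pick = first(columns, c -> c.partitionDate);     // rule 3
        if (!pick.isPresent()) {                                                // rule 4
            pick = columns.stream()
                    .filter(c -> c.cardinality > 0)
                    .max(Comparator.comparingLong((Column c) -> c.cardinality));
        }
        if (!pick.isPresent()) pick = first(columns, c -> c.numeric);           // rule 5
        return pick.orElse(columns.get(0)).name;                                // rule 6
    }
}
```

The chosen column would then be the one handed to Sqoop as its split column, with `kylin.source.jdbc.sqoop-mapper-num` controlling how many splits are made.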
[jira] [Updated] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3044: - Attachment: (was: KYLIN-3044-sqlserver-as-datasource.patch) > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3044-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3044: - Attachment: KYLIN-3044-sqlserver-as-datasource.patch Sqoop splits data to a couple of parts and import them parallel. I add a property kylin.source.jdbc.sqoop-mapper-num to specify how many splits should be divided. Sqoop would run a mapper for each split. To make each mapper gets even input, split column is chosen following some rules: 1. Prefer ClusteredBy column 2. Prefer DistributedBy column 3. Prefer Partition date column 4. Prefer Higher cardinality column 5. Prefer numeric column 6. Pick a column at first glance Patch updated. > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3044-sqlserver-as-datasource.patch, > KYLIN-3044-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3044: - Attachment: (was: KYLIN-3304-sqlserver-as-datasource.patch) > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3044-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3044: - Attachment: KYLIN-3044-sqlserver-as-datasource.patch Sorry for the wrong JIRA id in commit message and patch file name, already updated. > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3044-sqlserver-as-datasource.patch, > KYLIN-3304-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KYLIN-2987) Skip moving to Trash when drop an intermediate hive table or redistribute a hive table
[ https://issues.apache.org/jira/browse/KYLIN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262163#comment-16262163 ] Kaige Liu commented on KYLIN-2987: --- Hi [~yaho], seems this patch does not work. According to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertOverwrite auto.purge=true only works for MANAGED table, but KYLIN intermediate hive tables are created as EXTERNAL table. > Skip moving to Trash when drop an intermediate hive table or redistribute a > hive table > -- > > Key: KYLIN-2987 > URL: https://issues.apache.org/jira/browse/KYLIN-2987 > Project: Kylin > Issue Type: Improvement >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong >Priority: Trivial > Fix For: v2.3.0 > > Attachments: APACHE-KYLIN-2987.patch > > > At kylin side, we can add auto.purge=true when creating intermediate table. > However, to make ‘auto.purge’ effective for “insert overwrite table”, we > still need one patch for hive. > https://issues.apache.org/jira/browse/HIVE-15880 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262000#comment-16262000 ] Kaige Liu edited comment on KYLIN-3044 at 11/22/17 6:12 AM: - To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=| JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. Add sqoop needs the jdbc driver as well, users should also add jdbc driver to $SQOOP_HOME/lib [~liyang.g...@gmail.com], [~Shaofengshi] please help review my patch. Thanks. was (Author: liukaige): To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=| JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com], [~Shaofengshi] please help review my patch. Thanks. 
> Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3304-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
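Because the comment lists eight properties that have to be present in kylin.properties, a quick pre-flight check can catch a missing key before a build fails. The sketch below is only an illustration with a hypothetical class name (`JdbcSourceConfigCheck`); the key names are copied verbatim from the comment, including the `filed-delimiter` spelling, and nothing here is part of Kylin.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class JdbcSourceConfigCheck {

    // Keys copied as written in the comment above.
    static final String[] REQUIRED_KEYS = {
        "kylin.source.jdbc.connection-url",
        "kylin.source.jdbc.driver",
        "kylin.source.jdbc.dialect",
        "kylin.source.jdbc.user",
        "kylin.source.jdbc.pass",
        "kylin.source.jdbc.sqoop-home",
        "kylin.source.default",
        "kylin.source.jdbc.filed-delimiter"
    };

    // Returns the required keys that are absent or blank in the given properties text.
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```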
[jira] [Updated] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-3044: - Attachment: KYLIN-3304-sqlserver-as-datasource.patch To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: *kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=|* JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com] [~Shaofengshi] please help review my patch. Thanks. > Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3304-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262000#comment-16262000 ] Kaige Liu edited comment on KYLIN-3044 at 11/22/17 5:51 AM: - To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=| JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com], [~Shaofengshi] please help review my patch. Thanks. was (Author: liukaige): To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: *kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=|* JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com], [~Shaofengshi] please help review my patch. Thanks. 
> Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3304-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KYLIN-3044) Support SQL Server as data source
[ https://issues.apache.org/jira/browse/KYLIN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16262000#comment-16262000 ] Kaige Liu edited comment on KYLIN-3044 at 11/22/17 5:51 AM: - To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: *kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=|* JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com], [~Shaofengshi] please help review my patch. Thanks. was (Author: liukaige): To use SQL Server as KYLIN data source, following properties should be added to kylin.properties: *kylin.source.jdbc.connection-url=jdbc:sqlserver://youdbhost:1433;database=sample kylin.source.jdbc.driver=com.microsoft.sqlserver.jdbc.SQLServerDriver kylin.source.jdbc.dialect=mssql kylin.source.jdbc.user=user kylin.source.jdbc.pass=pass kylin.source.jdbc.sqoop-home=/usr/hdp/current/sqoop-client/bin kylin.source.default=8 kylin.source.jdbc.filed-delimiter=|* JDBC driver will not be shipped by KYLIN. Users should add proper driver by themselves. For release package, jdbc driver jar should be added to $KYLIN_HOME/ext For IDE, add jdbc driver jar to class path manually. [~liyang.g...@gmail.com] [~Shaofengshi] please help review my patch. Thanks. 
> Support SQL Server as data source > - > > Key: KYLIN-3044 > URL: https://issues.apache.org/jira/browse/KYLIN-3044 > Project: Kylin > Issue Type: Task >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-3304-sqlserver-as-datasource.patch > > > [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added > Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable > SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-3044) Support SQL Server as data source
Kaige Liu created KYLIN-3044: - Summary: Support SQL Server as data source Key: KYLIN-3044 URL: https://issues.apache.org/jira/browse/KYLIN-3044 Project: Kylin Issue Type: Task Reporter: Kaige Liu Assignee: Kaige Liu [KYLIN-1351|https://issues.apache.org/jira/browse/KYLIN-1351] has added Vertica as data source. Base on the work of KYLIN-1351, I'd like to enable SQL Server as data source of kylin. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (KYLIN-2747) Fail to start kylin if current work directory contains file or directory named "hadoop"
[ https://issues.apache.org/jira/browse/KYLIN-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu closed KYLIN-2747. Resolution: Invalid > Fail to start kylin if current work directory contains file or directory > named "hadoop" > --- > > Key: KYLIN-2747 > URL: https://issues.apache.org/jira/browse/KYLIN-2747 > Project: Kylin > Issue Type: Bug >Reporter: Kaige Liu >Assignee: Kaige Liu > Labels: scope > > OS: ubuntu 14.04 > Reproduce steps: > 1. touch hadoop > 2. bin/kylin.sh start > root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# touch hadoop > root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# bin/kylin.sh start > Retrieving hive dependency... > Retrieving Spark dependency... > Retrieving hbase dependency... > Exception in thread "main" java.io.FileNotFoundException: > /tmp/kylin-env-diff-7420732488608534086.sh.props (No such file or directory) > at java.io.FileInputStream.open0(Native Method) > at java.io.FileInputStream.open(FileInputStream.java:195) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at java.io.FileInputStream.<init>(FileInputStream.java:93) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.readAndDelete(SourceFile:131) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.diffSystemProps(SourceFile:109) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.main(SourceFile:69) > Faild to run io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KYLIN-2747) Fail to start kylin if current work directory contains file or directory named "hadoop"
[ https://issues.apache.org/jira/browse/KYLIN-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2747: - Assignee: Kaige Liu > Fail to start kylin if current work directory contains file or directory > named "hadoop" > --- > > Key: KYLIN-2747 > URL: https://issues.apache.org/jira/browse/KYLIN-2747 > Project: Kylin > Issue Type: Bug >Reporter: Kaige Liu >Assignee: Kaige Liu > > OS: ubuntu 14.04 > Reproduce steps: > 1. touch hadoop > 2. bin/kylin.sh start > root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# touch hadoop > root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# bin/kylin.sh start > Retrieving hive dependency... > Retrieving Spark dependency... > Retrieving hbase dependency... > Exception in thread "main" java.io.FileNotFoundException: > /tmp/kylin-env-diff-7420732488608534086.sh.props (No such file or directory) > at java.io.FileInputStream.open0(Native Method) > at java.io.FileInputStream.open(FileInputStream.java:195) > at java.io.FileInputStream.<init>(FileInputStream.java:138) > at java.io.FileInputStream.<init>(FileInputStream.java:93) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.readAndDelete(SourceFile:131) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.diffSystemProps(SourceFile:109) > at > io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.main(SourceFile:69) > Faild to run io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KYLIN-2747) Fail to start kylin if current work directory contains file or directory named "hadoop"
Kaige Liu created KYLIN-2747: - Summary: Fail to start kylin if current work directory contains file or directory named "hadoop" Key: KYLIN-2747 URL: https://issues.apache.org/jira/browse/KYLIN-2747 Project: Kylin Issue Type: Bug Reporter: Kaige Liu OS: ubuntu 14.04 Reproduce steps: 1. touch hadoop 2. bin/kylin.sh start root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# touch hadoop root@hn0-ambari:~/kap-2.4.0-GA-hbase1.x# bin/kylin.sh start Retrieving hive dependency... Retrieving Spark dependency... Retrieving hbase dependency... Exception in thread "main" java.io.FileNotFoundException: /tmp/kylin-env-diff-7420732488608534086.sh.props (No such file or directory) at java.io.FileInputStream.open0(Native Method) at java.io.FileInputStream.open(FileInputStream.java:195) at java.io.FileInputStream.<init>(FileInputStream.java:138) at java.io.FileInputStream.<init>(FileInputStream.java:93) at io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.readAndDelete(SourceFile:131) at io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.diffSystemProps(SourceFile:109) at io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps.main(SourceFile:69) Faild to run io.kyligence.kap.engine.mr.tool.DumpHadoopSystemProps -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KYLIN-2536) Replace the use of org.codehaus.jackson
[ https://issues.apache.org/jira/browse/KYLIN-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2536: - Attachment: KYLIN-2536-replace-codehaus.patch Please help review this patch. Thanks. > Replace the use of org.codehaus.jackson > --- > > Key: KYLIN-2536 > URL: https://issues.apache.org/jira/browse/KYLIN-2536 > Project: Kylin > Issue Type: Bug >Reporter: Ted Yu >Assignee: Kaige Liu >Priority: Minor > Attachments: KYLIN-2536-replace-codehaus.patch > > > {code} > engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopStatusGetter.java:import > org.codehaus.jackson.JsonNode; > engine-mr/src/main/java/org/apache/kylin/engine/mr/common/HadoopStatusGetter.java:import > org.codehaus.jackson.map.ObjectMapper; > {code} > com.fasterxml.jackson should be used instead -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (KYLIN-2307) Make HBase 1.x the default of master
[ https://issues.apache.org/jira/browse/KYLIN-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885003#comment-15885003 ] Kaige Liu commented on KYLIN-2307: --- [~liyang.g...@gmail.com] Yep, it's done. Thanks yang! > Make HBase 1.x the default of master > > > Key: KYLIN-2307 > URL: https://issues.apache.org/jira/browse/KYLIN-2307 > Project: Kylin > Issue Type: Improvement >Reporter: liyang >Assignee: Kaige Liu > Fix For: v2.0.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (KYLIN-2341) sum(case .. when ..) is not supported
[ https://issues.apache.org/jira/browse/KYLIN-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873684#comment-15873684 ] Kaige Liu edited comment on KYLIN-2341 at 2/19/17 12:42 PM: - [~liyang.g...@gmail.com] Would you please help review this patch? Thanks. was (Author: liukaige): [~liyang.g...@gmail.com]Would you please help review this patch? Thanks. > sum(case .. when ..) is not supported > - > > Key: KYLIN-2341 > URL: https://issues.apache.org/jira/browse/KYLIN-2341 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > Attachments: KYLIN-2341.patch > > > Query like below should either fail, or return correct result. Currently it > returns incorrect result. > {code} > SELECT > sum(case > when lstg_format_name like 'Other%' > then price > else 0 > end) as gmv > > FROM test_kylin_fact > inner JOIN edw.test_cal_dt as test_cal_dt > ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt > inner JOIN test_category_groupings > ON test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id > AND test_kylin_fact.lstg_site_id = test_category_groupings.site_id > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2341) sum(case .. when ..) is not supported
[ https://issues.apache.org/jira/browse/KYLIN-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2341: - Attachment: KYLIN-2341.patch [~liyang.g...@gmail.com]Would you please help review this patch? Thanks. > sum(case .. when ..) is not supported > - > > Key: KYLIN-2341 > URL: https://issues.apache.org/jira/browse/KYLIN-2341 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > Attachments: KYLIN-2341.patch > > > Query like below should either fail, or return correct result. Currently it > returns incorrect result. > {code} > SELECT > sum(case > when lstg_format_name like 'Other%' > then price > else 0 > end) as gmv > > FROM test_kylin_fact > inner JOIN edw.test_cal_dt as test_cal_dt > ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt > inner JOIN test_category_groupings > ON test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id > AND test_kylin_fact.lstg_site_id = test_category_groupings.site_id > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (KYLIN-2341) sum(case .. when ..) is not supported
[ https://issues.apache.org/jira/browse/KYLIN-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2341: - Assignee: Kaige Liu (was: liyang) > sum(case .. when ..) is not supported > - > > Key: KYLIN-2341 > URL: https://issues.apache.org/jira/browse/KYLIN-2341 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > > Query like below should either fail, or return correct result. Currently it > returns incorrect result. > {code} > SELECT > sum(case > when lstg_format_name like 'Other%' > then price > else 0 > end) as gmv > > FROM test_kylin_fact > inner JOIN edw.test_cal_dt as test_cal_dt > ON test_kylin_fact.cal_dt = test_cal_dt.cal_dt > inner JOIN test_category_groupings > ON test_kylin_fact.leaf_categ_id = test_category_groupings.leaf_categ_id > AND test_kylin_fact.lstg_site_id = test_category_groupings.site_id > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2449) Rewrite should not run on OLAPAggregateRel if has no OLAPTable
[ https://issues.apache.org/jira/browse/KYLIN-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2449: - Attachment: KYLIN-2449.patch Patch attached. Please help review. Thanks. > Rewrite should not run on OLAPAggregateRel if has no OLAPTable > -- > > Key: KYLIN-2449 > URL: https://issues.apache.org/jira/browse/KYLIN-2449 > Project: Kylin > Issue Type: Bug >Reporter: Kaige Liu >Assignee: Kaige Liu > Attachments: KYLIN-2449.patch > > > If an OLAPAggregateRel's context does not contain any OLAPTable, there is no need > to rewrite columns. Otherwise an NPE will be thrown, for example: > {code} > Caused by: java.lang.NullPointerException > at > org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteColumn(OLAPAggregateRel.java:217) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:340) > at > org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:259) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPSortRel.implementRewrite(OLAPSortRel.java:83) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:105) > at > org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158) > at > org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:94) > at > org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108) > at > org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92) > at > org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1233) > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:303) > at 
org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:200) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:761) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:617) > at > org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:587) > at > org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:215) > at > org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:594) > at > org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148) > ... 35 more > {code} > Test case: > {code} > SELECT > t1.leaf_categ_id, COUNT(*) AS nums > FROM > (SELECT > leaf_categ_id > FROM > test_kylin_fact > WHERE > lstg_format_name = 'ABIN') t1 > JOIN > (SELECT > leaf_categ_id > FROM > test_kylin_fact f > INNER JOIN test_order o ON f.order_id = o.order_id > WHERE > buyer_id > 100) t2 ON t1.leaf_categ_id = t2.leaf_categ_id > GROUP BY t1.leaf_categ_id > ORDER BY t1.leaf_categ_id > LIMIT 10 > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2407) TPC-H query 20, why this query returns no result?
[ https://issues.apache.org/jira/browse/KYLIN-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2407: - Attachment: KYLIN-2407.patch Patch attached. Please help review. Thanks. > TPC-H query 20, why this query returns no result? > - > > Key: KYLIN-2407 > URL: https://issues.apache.org/jira/browse/KYLIN-2407 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > Attachments: KYLIN-2407.patch > > > Below query returns no result. > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ), > tmp5 as ( > select > ps_suppkey > from > v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = > l_suppkey > where > ps_availqty > sum_quantity > ) > select > s_name, > s_address > from > supplier > where > s_suppkey IN (select ps_suppkey from tmp5) > order by s_name > {code} > While another similar query returns correct result. > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join supplier on ps_suppkey = s_suppkey > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} > Maybe something wrong with the "where ... IN ..." clause? 
[jira] [Created] (KYLIN-2449) Rewrite should not run on OLAPAggregateRel if has no OLAPTable
Kaige Liu created KYLIN-2449:
--------------------------------

             Summary: Rewrite should not run on OLAPAggregateRel if has no OLAPTable
                 Key: KYLIN-2449
                 URL: https://issues.apache.org/jira/browse/KYLIN-2449
             Project: Kylin
          Issue Type: Bug
            Reporter: Kaige Liu
            Assignee: Kaige Liu

If an OLAPAggregateRel's context does not contain any OLAPTable, there is no need to rewrite its columns. If the rewrite runs anyway, an NPE is thrown, for example:
{code}
Caused by: java.lang.NullPointerException
    at org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteColumn(OLAPAggregateRel.java:217)
    at org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:340)
    at org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:259)
    at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
    at org.apache.kylin.query.relnode.OLAPSortRel.implementRewrite(OLAPSortRel.java:83)
    at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
    at org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:105)
    at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
    at org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:94)
    at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108)
    at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92)
    at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1233)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:303)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:200)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:761)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:617)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:587)
    at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:215)
    at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:594)
    at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
    at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
    ... 35 more
{code}

Test case:
{code}
SELECT
    t1.leaf_categ_id, COUNT(*) AS nums
FROM
    (SELECT
        leaf_categ_id
    FROM
        test_kylin_fact
    WHERE
        lstg_format_name = 'ABIN') t1
        JOIN
    (SELECT
        leaf_categ_id
    FROM
        test_kylin_fact f
    INNER JOIN test_order o ON f.order_id = o.order_id
    WHERE
        buyer_id > 100) t2 ON t1.leaf_categ_id = t2.leaf_categ_id
GROUP BY t1.leaf_categ_id
ORDER BY t1.leaf_categ_id
LIMIT 10
{code}
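The guard that KYLIN-2449 asks for can be sketched roughly as follows. This is not the attached KYLIN-2449.patch, whose contents do not appear in this thread. `RewriteContext`, `AggregateNodeSketch`, `firstTableName` and the `.rewritten_measure` suffix are hypothetical stand-ins that only model one property of the real classes: a context may hold no table at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a query context; firstTableName is null when the context holds no table. */
final class RewriteContext {
    final String firstTableName;

    RewriteContext(String firstTableName) {
        this.firstTableName = firstTableName;
    }
}

/** Hypothetical stand-in for an aggregate node that may rewrite its columns. */
final class AggregateNodeSketch {
    private final RewriteContext context;
    final List<String> rewrittenColumns = new ArrayList<>();

    AggregateNodeSketch(RewriteContext context) {
        this.context = context;
    }

    /** Returns true if the rewrite ran, false if it was skipped. */
    boolean implementRewrite() {
        // Guard: without a table there is nothing to rewrite against,
        // so skip the step instead of dereferencing null further down.
        if (context == null || context.firstTableName == null) {
            return false;
        }
        // Illustrative rewrite only; the real column-building logic is not shown in the issue.
        rewrittenColumns.add(context.firstTableName + ".rewritten_measure");
        return true;
    }
}
```

With this shape, a node whose context has no table simply reports that the rewrite was skipped, and the caller can continue visiting the rest of the plan.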
[jira] [Commented] (KYLIN-2407) TPC-H query 20, why this query returns no result?
     [ https://issues.apache.org/jira/browse/KYLIN-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864864#comment-15864864 ]

Kaige Liu commented on KYLIN-2407:
----------------------------------

The IN clause is parsed into an inner join here, so we get a "LOOKUP TABLE" join "SUB QUERY". The lookup table scan should use *executeLookupTableQuery* to run the query, but it was routed to *executeOLAPQuery*.
{code}
OLAPToEnumerableConverter
  EnumerableLimit(fetch=[5])
    EnumerableSort(sort0=[$0], dir0=[ASC])
      EnumerableCalc(expr#0..7=[{inputs}], S_NAME=[$t1], S_ADDRESS=[$t2])
        EnumerableJoin(condition=[=($0, $7)], joinType=[inner]) <- IN clause here
          OLAPTableScan(table=[[TPCH_FLAT_ORC_2, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]]) <--- This is a lookup table
          EnumerableAggregate(group=[{0}])
            EnumerableCalc(expr#0..10=[{inputs}], expr#11=[>($t2, $t9)], PS_SUPPKEY=[$t1], $condition=[$t11])
              EnumerableJoin(condition=[AND(=($0, $8), =($1, $10))], joinType=[inner])
                OLAPTableScan(table=[[TPCH_FLAT_ORC_2, V_PARTSUPP]], fields=[[0, 1, 2, 3, 4, 5, 6, 7]])
                EnumerableCalc(expr#0..2=[{inputs}], expr#3=[0.5], expr#4=[*($t3, $t2)], L_PARTKEY=[$t0], SUM_QUANTITY=[$t4], L_SUPPKEY=[$t1])
                  EnumerableAggregate(group=[{0, 1}], agg#0=[SUM($2)])
                    EnumerableCalc(expr#0..36=[{inputs}], expr#37=['1992-01-01'], expr#38=[>=($t8, $t37)], expr#39=['1995-01-01'], expr#40=[<=($t8, $t39)], expr#41=['CANADA'], expr#42=[=($t28, $t41)], expr#43=['forest%'], expr#44=[LIKE($t31, $t43)], expr#45=[AND($t38, $t40, $t42, $t44)], L_PARTKEY=[$t1], L_SUPPKEY=[$t2], L_QUANTITY=[$t3], $condition=[$t45])
                      OLAPJoinRel(condition=[=($1, $30)], joinType=[inner])
                        OLAPJoinRel(condition=[=($23, $27)], joinType=[inner])
                          OLAPJoinRel(condition=[=($2, $20)], joinType=[inner])
                            OLAPTableScan(table=[[TPCH_FLAT_ORC_2, V_LINEITEM]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
                            OLAPTableScan(table=[[TPCH_FLAT_ORC_2, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]])
                          OLAPTableScan(table=[[TPCH_FLAT_ORC_2, NATION]], fields=[[0, 1, 2]])
                        OLAPTableScan(table=[[TPCH_FLAT_ORC_2, PART]], fields=[[0, 1, 2, 3, 4, 5, 6]])
{code}
The logic below, which chooses the query method, is not correct:
{code}
private String genExecFunc() {
    // if the table to scan is not the fact table of cube, then it's a lookup table
    if (context.hasJoin == false && context.realization.getModel().isLookupTable(tableName)) {
        return "executeLookupTableQuery";
    } else {
        return "executeOLAPQuery";
    }
}
{code}

> TPC-H query 20, why this query returns no result?
> -------------------------------------------------
>
>                 Key: KYLIN-2407
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2407
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: liyang
>            Assignee: Kaige Liu
>
> The query below returns no result.
> {code}
> with tmp3 as (
>     select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
>     from v_lineitem
>     inner join supplier on l_suppkey = s_suppkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join part on l_partkey = p_partkey
>     where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
>     and n_name = 'CANADA'
>     and p_name like 'forest%'
>     group by l_partkey, l_suppkey
> ),
> tmp5 as (
>     select
>         ps_suppkey
>     from
>         v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
>     where
>         ps_availqty > sum_quantity
> )
> select
>     s_name,
>     s_address
> from
>     supplier
> where
>     s_suppkey IN (select ps_suppkey from tmp5)
> order by s_name
> {code}
> A similar query, however, returns the correct result.
> {code}
> with tmp3 as (
>     select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
>     from v_lineitem
>     inner join supplier on l_suppkey = s_suppkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join part on l_partkey = p_partkey
>     where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
>     and n_name = 'CANADA'
>     and p_name like 'forest%'
>     group by l_partkey, l_suppkey
> )
> select
>     s_name,
>     s_address
> from
>     v_partsupp
>     inner join supplier on ps_suppkey = s_suppkey
>     inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
>     ps_availqty > sum_quantity
> group by
>     s_name, s_address
> order by
>     s_name
> {code}
> Maybe something is wrong with the "where ... IN ..." clause?
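To see why the condition in `genExecFunc()` quoted in the KYLIN-2407 comment can send a lookup table scan to the wrong method, it may help to enumerate its decision table. The sketch below is a self-contained model of that one condition only; it is not a fix. `RoutingSketch` and its two boolean parameters are hypothetical stand-ins for `context.hasJoin` and `isLookupTable(tableName)`. Whether `hasJoin` is actually set for the SUPPLIER scan in the plan is not stated in the comment, so treat that as an assumption.

```java
/** Hypothetical, self-contained model of the routing condition quoted in the KYLIN-2407 comment. */
final class RoutingSketch {

    /** Mirrors the quoted condition: only a join-free scan of a lookup table takes the lookup path. */
    static String genExecFunc(boolean hasJoin, boolean isLookupTable) {
        if (!hasJoin && isLookupTable) {
            return "executeLookupTableQuery";
        }
        return "executeOLAPQuery";
    }

    public static void main(String[] args) {
        // Print all four combinations so the fall-through cases are visible.
        for (boolean hasJoin : new boolean[] {false, true}) {
            for (boolean lookup : new boolean[] {false, true}) {
                System.out.println("hasJoin=" + hasJoin + ", isLookupTable=" + lookup
                        + " -> " + genExecFunc(hasJoin, lookup));
            }
        }
    }
}
```

Three of the four combinations fall through to `executeOLAPQuery`, including a lookup table whose context is flagged as having a join. Under that condition, a lookup table joined to a subquery could therefore take the OLAP path.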
[jira] [Assigned] (KYLIN-2407) TPC-H query 20, why this query returns no result?
[ https://issues.apache.org/jira/browse/KYLIN-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2407: - Assignee: Kaige Liu > TPC-H query 20, why this query returns no result? > - > > Key: KYLIN-2407 > URL: https://issues.apache.org/jira/browse/KYLIN-2407 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > > Below query returns no result. > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ), > tmp5 as ( > select > ps_suppkey > from > v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = > l_suppkey > where > ps_availqty > sum_quantity > ) > select > s_name, > s_address > from > supplier > where > s_suppkey IN (select ps_suppkey from tmp5) > order by s_name > {code} > While another similar query returns correct result. > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join supplier on ps_suppkey = s_suppkey > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} > Maybe something wrong with the "where ... IN ..." clause? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2406) TPC-H query 20, can triggers NPE
[ https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2406: - Attachment: KYLIN-2406-fix-NPE.patch > TPC-H query 20, can triggers NPE > > > Key: KYLIN-2406 > URL: https://issues.apache.org/jira/browse/KYLIN-2406 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > Attachments: KYLIN-2406-fix-NPE.patch > > > Below query triggers NPE > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > inner join supplier on ps_suppkey = s_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} > While below query is OK. Only difference being the order of "inner join tmp3" > and "inner join supplier" > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join supplier on ps_suppkey = s_suppkey > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2406) TPC-H query 20, can triggers NPE
     [ https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaige Liu updated KYLIN-2406:
-----------------------------
    Attachment: KYLIN-2406-fix-NPE.patch

For now this patch only prevents the NPE and gives an error hint. A better solution will be provided later in KYLIN-2427.

> TPC-H query 20, can triggers NPE
> --------------------------------
>
>                 Key: KYLIN-2406
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2406
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: liyang
>            Assignee: Kaige Liu
>         Attachments: KYLIN-2406-fix-NPE.patch
>
>
> The query below triggers an NPE.
> {code}
> with tmp3 as (
>     select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
>     from v_lineitem
>     inner join supplier on l_suppkey = s_suppkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join part on l_partkey = p_partkey
>     where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
>     and n_name = 'CANADA'
>     and p_name like 'forest%'
>     group by l_partkey, l_suppkey
> )
> select
>     s_name,
>     s_address
> from
>     v_partsupp
>     inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
>     inner join supplier on ps_suppkey = s_suppkey
> where
>     ps_availqty > sum_quantity
> group by
>     s_name, s_address
> order by
>     s_name
> {code}
> The query below, however, is OK. The only difference is the order of "inner join tmp3" and "inner join supplier".
> {code}
> with tmp3 as (
>     select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
>     from v_lineitem
>     inner join supplier on l_suppkey = s_suppkey
>     inner join nation on s_nationkey = n_nationkey
>     inner join part on l_partkey = p_partkey
>     where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
>     and n_name = 'CANADA'
>     and p_name like 'forest%'
>     group by l_partkey, l_suppkey
> )
> select
>     s_name,
>     s_address
> from
>     v_partsupp
>     inner join supplier on ps_suppkey = s_suppkey
>     inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
> where
>     ps_availqty > sum_quantity
> group by
>     s_name, s_address
> order by
>     s_name
> {code}
[jira] [Assigned] (KYLIN-2427) Auto adjust join order to make query executable
[ https://issues.apache.org/jira/browse/KYLIN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu reassigned KYLIN-2427: - Assignee: Kaige Liu > Auto adjust join order to make query executable > --- > > Key: KYLIN-2427 > URL: https://issues.apache.org/jira/browse/KYLIN-2427 > Project: Kylin > Issue Type: Bug >Reporter: Kaige Liu >Assignee: Kaige Liu > > KYLIN-2406 reports an issue: The order of joins will affect the result of > query. For example, below query leads to "No model found" > Below query triggers NPE > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > inner join supplier on ps_suppkey = s_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} > While below query is OK. 
Only difference being the order of "inner join tmp3" > and "inner join supplier" > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join supplier on ps_suppkey = s_suppkey > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (KYLIN-2427) Auto adjust join order to make query executable
[ https://issues.apache.org/jira/browse/KYLIN-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaige Liu updated KYLIN-2427: - Description: KYLIN-2406 reports an issue: The order of joins will affect the result of query. For example, below query leads to "No model found" Below query triggers NPE {code} with tmp3 as ( select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey from v_lineitem inner join supplier on l_suppkey = s_suppkey inner join nation on s_nationkey = n_nationkey inner join part on l_partkey = p_partkey where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' and n_name = 'CANADA' and p_name like 'forest%' group by l_partkey, l_suppkey ) select s_name, s_address from v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey inner join supplier on ps_suppkey = s_suppkey where ps_availqty > sum_quantity group by s_name, s_address order by s_name {code} While below query is OK. Only difference being the order of "inner join tmp3" and "inner join supplier" {code} with tmp3 as ( select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey from v_lineitem inner join supplier on l_suppkey = s_suppkey inner join nation on s_nationkey = n_nationkey inner join part on l_partkey = p_partkey where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' and n_name = 'CANADA' and p_name like 'forest%' group by l_partkey, l_suppkey ) select s_name, s_address from v_partsupp inner join supplier on ps_suppkey = s_suppkey inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey where ps_availqty > sum_quantity group by s_name, s_address order by s_name {code} was: KYLIN-2406 reports an issue: The order of joins will affect the result of query. 
For example, below query leads to "No model found" Below query triggers NPE {code} with tmp3 as ( select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey from v_lineitem inner join supplier on l_suppkey = s_suppkey inner join nation on s_nationkey = n_nationkey inner join part on l_partkey = p_partkey where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' and n_name = 'CANADA' and p_name like 'forest%' group by l_partkey, l_suppkey ) select s_name, s_address from v_partsupp inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey inner join supplier on ps_suppkey = s_suppkey where ps_availqty > sum_quantity group by s_name, s_address order by s_name {code} While below query is OK. Only difference being the order of "inner join tmp3" and "inner join supplier" {code} with tmp3 as ( select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey from v_lineitem inner join supplier on l_suppkey = s_suppkey inner join nation on s_nationkey = n_nationkey inner join part on l_partkey = p_partkey where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' and n_name = 'CANADA' and p_name like 'forest%' group by l_partkey, l_suppkey ) select s_name, s_address from v_partsupp inner join supplier on ps_suppkey = s_suppkey inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey where ps_availqty > sum_quantity group by s_name, s_address order by s_name {code} But below query is OK. 
{code} with tmp3 as ( select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey from v_lineitem inner join supplier on l_suppkey = s_suppkey inner join nation on s_nationkey = n_nationkey inner join part on l_partkey = p_partkey where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' and n_name = 'CANADA' and p_name like 'forest%' group by l_partkey, l_suppkey ) select s_name, s_address from v_partsupp inner join supplier on ps_suppkey = s_suppkey inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey where ps_availqty > sum_quantity group by s_name, s_address order by s_name {code} > Auto adjust join order to make query executable > --- > > Key: KYLIN-2427 > URL: https://issues.apache.org/jira/browse/KYLIN-2427 > Project: Kylin > Issue Type: Bug >Reporter: Kaige Liu > > KYLIN-2406 reports an issue: The order of joins will affect the result of > query. For example, below query leads to "No model found" > Below query triggers NPE > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01'
[jira] [Created] (KYLIN-2427) Auto adjust join order to make query executable
Kaige Liu created KYLIN-2427:
--------------------------------

             Summary: Auto adjust join order to make query executable
                 Key: KYLIN-2427
                 URL: https://issues.apache.org/jira/browse/KYLIN-2427
             Project: Kylin
          Issue Type: Bug
            Reporter: Kaige Liu

KYLIN-2406 reports an issue: the order of joins affects the result of a query. For example, the query below fails with "No model found" (KYLIN-2406 reported it as triggering an NPE):
{code}
with tmp3 as (
    select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
    from v_lineitem
    inner join supplier on l_suppkey = s_suppkey
    inner join nation on s_nationkey = n_nationkey
    inner join part on l_partkey = p_partkey
    where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
    and n_name = 'CANADA'
    and p_name like 'forest%'
    group by l_partkey, l_suppkey
)
select
    s_name,
    s_address
from
    v_partsupp
    inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
    inner join supplier on ps_suppkey = s_suppkey
where
    ps_availqty > sum_quantity
group by
    s_name, s_address
order by
    s_name
{code}
The query below, however, is OK. The only difference is the order of "inner join tmp3" and "inner join supplier".
{code}
with tmp3 as (
    select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey
    from v_lineitem
    inner join supplier on l_suppkey = s_suppkey
    inner join nation on s_nationkey = n_nationkey
    inner join part on l_partkey = p_partkey
    where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01'
    and n_name = 'CANADA'
    and p_name like 'forest%'
    group by l_partkey, l_suppkey
)
select
    s_name,
    s_address
from
    v_partsupp
    inner join supplier on ps_suppkey = s_suppkey
    inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey
where
    ps_availqty > sum_quantity
group by
    s_name, s_address
order by
    s_name
{code}
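One conceivable heuristic for the reordering that KYLIN-2427 asks for is sketched here. It is not necessarily what Kylin implemented; the issue does not describe an algorithm. The sketch moves inner joins against plain tables ahead of inner joins against subqueries and keeps the relative order inside each group. That reproduces the working order from the issue: `supplier` before `tmp3`. `JoinOrderSketch`, `Join`, `target` and `isSubquery` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class JoinOrderSketch {

    /** Hypothetical description of one inner join in a FROM clause. */
    static final class Join {
        final String target;      // table or subquery alias being joined
        final boolean isSubquery; // true for a derived table such as a CTE

        Join(String target, boolean isSubquery) {
            this.target = target;
            this.isSubquery = isSubquery;
        }
    }

    /**
     * Stable reorder: joins to plain tables first, joins to subqueries after.
     * Only meaningful for inner joins, which can be reordered without
     * changing the result. The input list is left untouched.
     */
    static List<Join> reorder(List<Join> joins) {
        List<Join> tables = new ArrayList<>();
        List<Join> subqueries = new ArrayList<>();
        for (Join j : joins) {
            (j.isSubquery ? subqueries : tables).add(j);
        }
        tables.addAll(subqueries);
        return tables;
    }
}
```

A real implementation would also have to check that each ON condition refers only to tables that are already joined at that point, and it would have to leave outer joins alone.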
[jira] [Updated] (KYLIN-2426) Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
     [ https://issues.apache.org/jira/browse/KYLIN-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaige Liu updated KYLIN-2426:
-----------------------------
    Attachment: KYLIN-2426-fix-hardcode.patch

Patch attached. Please help review. Thanks.

> Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
> ----------------------------------------------------------------------------
>
>                 Key: KYLIN-2426
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2426
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: Kaige Liu
>            Assignee: Kaige Liu
>         Attachments: KYLIN-2426-fix-hardcode.patch
>
>
> ITHDFSResourceStoreTest contains hardcoded values, so the test fails if we are not running the integration tests (IT) in a sandbox.
> {code}
> public void testFullQalifiedName() throws Exception {
>     String oldUrl = kylinConfig.getMetadataUrl();
>     String path = "hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/metadata_test2";
>     kylinConfig.setProperty("kylin.metadata.url", path + "@hdfs");
>     HDFSResourceStore store = new HDFSResourceStore(kylinConfig);
>     ResourceStoreTest.testAStore(store);
>     kylinConfig.setProperty("kylin.metadata.url", oldUrl);
>     assertTrue(fs.exists(new Path(path)));
> }
> {code}
[jira] [Created] (KYLIN-2426) Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
Kaige Liu created KYLIN-2426:
--------------------------------

             Summary: Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
                 Key: KYLIN-2426
                 URL: https://issues.apache.org/jira/browse/KYLIN-2426
             Project: Kylin
          Issue Type: Bug
            Reporter: Kaige Liu

ITHDFSResourceStoreTest contains hardcoded values, so the test fails if we are not running the integration tests (IT) in a sandbox.
{code}
public void testFullQalifiedName() throws Exception {
    String oldUrl = kylinConfig.getMetadataUrl();
    String path = "hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/metadata_test2";
    kylinConfig.setProperty("kylin.metadata.url", path + "@hdfs");
    HDFSResourceStore store = new HDFSResourceStore(kylinConfig);
    ResourceStoreTest.testAStore(store);
    kylinConfig.setProperty("kylin.metadata.url", oldUrl);
    assertTrue(fs.exists(new Path(path)));
}
{code}
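One way to avoid the hardcoded host in the KYLIN-2426 test is to build the fully qualified path from whatever filesystem URI the test environment reports. The sketch below shows only that path arithmetic, using `java.net.URI` so that it runs without Hadoop on the classpath. It is not the attached patch. `QualifiedPathSketch`, `qualify` and the example host are hypothetical. In the real test the base URI would presumably come from the Hadoop `FileSystem` already held in `fs`, which is an assumption about the surrounding test class.

```java
import java.net.URI;

final class QualifiedPathSketch {

    /**
     * Joins a filesystem URI (for example the cluster's default FS) with an
     * absolute path, so that no host name has to appear in the test source.
     */
    static String qualify(URI defaultFs, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("Expected an absolute path but got: " + absolutePath);
        }
        // An absolute path resolved against a hierarchical URI keeps the
        // scheme and authority of the base and replaces only the path.
        return defaultFs.resolve(absolutePath).toString();
    }

    public static void main(String[] args) {
        // Hypothetical host; a real test would read the URI from the environment.
        URI fsUri = URI.create("hdfs://namenode.example.com:8020");
        System.out.println(qualify(fsUri, "/kylin/kylin_metadata/metadata_test2"));
    }
}
```

The test could then set `kylin.metadata.url` to the qualified path plus `"@hdfs"` as before, with the sandbox host name no longer baked in.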
[jira] [Assigned] (KYLIN-2426) Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
     [ https://issues.apache.org/jira/browse/KYLIN-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kaige Liu reassigned KYLIN-2426:
--------------------------------
    Assignee: Kaige Liu

> Tests will fail if env not satisfy hardcoded path in ITHDFSResourceStoreTest
> ----------------------------------------------------------------------------
>
>                 Key: KYLIN-2426
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2426
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: Kaige Liu
>            Assignee: Kaige Liu
>
> ITHDFSResourceStoreTest contains hardcoded values, so the test fails if we are not running the integration tests (IT) in a sandbox.
> {code}
> public void testFullQalifiedName() throws Exception {
>     String oldUrl = kylinConfig.getMetadataUrl();
>     String path = "hdfs://sandbox.hortonworks.com:8020/kylin/kylin_metadata/metadata_test2";
>     kylinConfig.setProperty("kylin.metadata.url", path + "@hdfs");
>     HDFSResourceStore store = new HDFSResourceStore(kylinConfig);
>     ResourceStoreTest.testAStore(store);
>     kylinConfig.setProperty("kylin.metadata.url", oldUrl);
>     assertTrue(fs.exists(new Path(path)));
> }
> {code}
[jira] [Commented] (KYLIN-2406) TPC-H query 20, can triggers NPE
     [ https://issues.apache.org/jira/browse/KYLIN-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851216#comment-15851216 ]

Kaige Liu commented on KYLIN-2406:
----------------------------------

The query above produces this execution plan:
{code}
OLAPToEnumerableConverter
  EnumerableCalc(expr#0..17=[{inputs}], expr#18=[>($t2, $t9)], S_NAME=[$t12], S_ADDRESS=[$t13], $condition=[$t18])
    EnumerableJoin(condition=[=($1, $11)], joinType=[inner])
      EnumerableJoin(condition=[AND(=($0, $8), =($1, $10))], joinType=[inner])
        *OLAPTableScan(table=[[TPCH_FLAT_ORC_2, V_PARTSUPP]], fields=[[0, 1, 2, 3, 4, 5, 6, 7]])*
        EnumerableCalc(expr#0..2=[{inputs}], expr#3=[0.5], expr#4=[*($t3, $t2)], L_PARTKEY=[$t0], SUM_QUANTITY=[$t4], L_SUPPKEY=[$t1])
          EnumerableAggregate(group=[{0, 1}], agg#0=[SUM($2)])
            EnumerableCalc(expr#0..36=[{inputs}], expr#37=['1992-01-01'], expr#38=[>=($t8, $t37)], expr#39=['1995-01-01'], expr#40=[<=($t8, $t39)], expr#41=['CANADA'], expr#42=[=($t28, $t41)], expr#43=['forest%'], expr#44=[LIKE($t31, $t43)], expr#45=[AND($t38, $t40, $t42, $t44)], L_PARTKEY=[$t1], L_SUPPKEY=[$t2], L_QUANTITY=[$t3], $condition=[$t45])
              OLAPJoinRel(condition=[=($1, $30)], joinType=[inner])
                OLAPJoinRel(condition=[=($23, $27)], joinType=[inner])
                  OLAPJoinRel(condition=[=($2, $20)], joinType=[inner])
                    OLAPTableScan(table=[[TPCH_FLAT_ORC_2, V_LINEITEM]], fields=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
                    OLAPTableScan(table=[[TPCH_FLAT_ORC_2, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]])
                  OLAPTableScan(table=[[TPCH_FLAT_ORC_2, NATION]], fields=[[0, 1, 2]])
                OLAPTableScan(table=[[TPCH_FLAT_ORC_2, PART]], fields=[[0, 1, 2, 3, 4, 5, 6]])
      *OLAPTableScan(table=[[TPCH_FLAT_ORC_2, SUPPLIER]], fields=[[0, 1, 2, 3, 4, 5, 6]])*
{code}
The two OLAPTableScans marked with asterisks are in the same OLAPContext, but they do not match any join relation defined in the model. We need to give a clear error message here.

> TPC-H query 20, can triggers NPE > > > Key: KYLIN-2406 > URL: https://issues.apache.org/jira/browse/KYLIN-2406 > Project: Kylin > Issue Type: Bug >Reporter: liyang >Assignee: Kaige Liu > > Below query triggers NPE > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > inner join supplier on ps_suppkey = s_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} > While below query is OK. Only difference being the order of "inner join tmp3" > and "inner join supplier" > {code} > with tmp3 as ( > select l_partkey, 0.5 * sum(l_quantity) as sum_quantity, l_suppkey > from v_lineitem > inner join supplier on l_suppkey = s_suppkey > inner join nation on s_nationkey = n_nationkey > inner join part on l_partkey = p_partkey > where l_shipdate >= '1992-01-01' and l_shipdate <= '1995-01-01' > and n_name = 'CANADA' > and p_name like 'forest%' > group by l_partkey, l_suppkey > ) > select > s_name, > s_address > from > v_partsupp > inner join supplier on ps_suppkey = s_suppkey > inner join tmp3 on ps_partkey = l_partkey and ps_suppkey = l_suppkey > where > ps_availqty > sum_quantity > group by > s_name, s_address > order by > s_name > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
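The "clear error message" requested in the KYLIN-2406 comment could look roughly like the check below: before two scans are treated as one joined context, the pair is looked up among the joins the model declares, and a descriptive exception is raised if it is missing. This is an illustration, not the attached KYLIN-2406-fix-NPE.patch. `JoinCheckSketch`, `declareJoin`, `requireJoin` and the message wording are hypothetical, and treating a join as an unordered pair of table names is an assumption.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical registry of the table pairs that a model declares as joined. */
final class JoinCheckSketch {
    private final Set<String> modelJoins = new HashSet<>();

    /** Records that the model joins the two tables, in either order. */
    void declareJoin(String left, String right) {
        modelJoins.add(key(left, right));
    }

    /** Fails with a readable message instead of letting a later lookup hit null. */
    void requireJoin(String left, String right) {
        if (!modelJoins.contains(key(left, right))) {
            throw new IllegalStateException("No join between " + left + " and " + right
                    + " is defined in the model; check the join order of the query.");
        }
    }

    /** Order-independent key, so (A, B) and (B, A) denote the same join. */
    private static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }
}
```

A caller would invoke `requireJoin` for each pair of scans that end up in the same context, so that the user sees which tables are involved rather than a bare NullPointerException.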