[jira] [Created] (KYLIN-4802) “Build N-Dimension Cuboid” execute twice when using DistributedScheduler
WangSheng created KYLIN-4802:

Summary: “Build N-Dimension Cuboid” execute twice when using DistributedScheduler
Key: KYLIN-4802
URL: https://issues.apache.org/jira/browse/KYLIN-4802
Project: Kylin
Issue Type: Bug
Affects Versions: v2.6.6
Reporter: WangSheng
Attachments: kylin01.png, kylin02.png, kylin03.png

I ran into a problem when using DistributedScheduler on a two-node cluster. My current cluster version is 2.6.6.

When executing the "Build N-Dimension Cuboid : level 4" step, I found that both nodes submitted an MR job for this step. One node submitted its MR job first and then went on to execute the following steps. While it was executing the "Convert Cuboid Data to HFile" step, the other node submitted an MR job for the "Build N-Dimension Cuboid : level 4" step again.

This caused data to go missing when the file was generated. After the job completed, queries on level 4 returned empty results.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
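The symptom described above is two nodes both running the same job step. As a minimal sketch of the general guard against that, and not Kylin's actual DistributedScheduler code, a step can be claimed atomically before it is submitted. `StepClaimRegistry`, `tryClaim`, and the job and node ids are hypothetical names, and the in-memory map only stands in for a store shared by both nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: these names are illustrative, not Kylin classes.
// The in-memory map stands in for a store that both nodes can see.
class StepClaimRegistry {
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /** Returns true only if nodeId is (or becomes) the single owner of the step. */
    boolean tryClaim(String jobId, String stepName, String nodeId) {
        String key = jobId + "/" + stepName;
        // putIfAbsent is atomic: it returns null only for the caller that inserted the key.
        String existing = owners.putIfAbsent(key, nodeId);
        return existing == null || existing.equals(nodeId);
    }

    public static void main(String[] args) {
        StepClaimRegistry registry = new StepClaimRegistry();
        String step = "Build N-Dimension Cuboid : level 4";
        System.out.println(registry.tryClaim("job-1", step, "node-A")); // true
        System.out.println(registry.tryClaim("job-1", step, "node-B")); // false
    }
}
```

With a guard of this kind, the second node would skip the submission instead of launching a second MR job for the same step.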
[jira] [Created] (KYLIN-4060) "Garbage Collection on HDFS" step failed because of hdfs path not exists
WangSheng created KYLIN-4060:

Summary: "Garbage Collection on HDFS" step failed because of hdfs path not exists
Key: KYLIN-4060
URL: https://issues.apache.org/jira/browse/KYLIN-4060
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: v2.4.1
Reporter: WangSheng

We recently found a bug in the last job step, "Garbage Collection on HDFS", when using a streaming cube. The problem is shown below:
{code:java}
Drop HDFS path on FileSystem: "hdfs://kylin-cluster"
HDFS path /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test/fact_distinct_columns not exists.
File /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test does not exist.
{code}
When I checked the code and the log, I found that the main cause is:
# A build job was submitted first, and at the "Update Cube Info" step its segment became "READY";
# Then Kylin automatically submitted a merge job that included the segment from step 1. The merge job finished quickly and deleted the HDFS paths of its input segments;
# After the merge job finished, the build job continued with "Hive Cleanup" and "Garbage Collection on HBase", and failed at the last step because the HDFS path had been deleted in step 2.

Our version is 2.4.x, and I'm not sure whether this bug has been fixed in the latest 2.6.x version. If not, please assign this Jira to me, thanks!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
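The race in the three steps above ends with a cleanup step failing on a path that another job has already removed. One common way to make such a step safe to run is to treat a missing path as "already cleaned". The sketch below only illustrates that idea: it uses the local filesystem through `java.nio.file` as a stand-in for HDFS, and `TolerantCleanup` and `dropIfExists` are hypothetical names, not Kylin code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: local filesystem used as a stand-in for HDFS.
class TolerantCleanup {
    /** Deletes the path if present; returns false instead of failing when it is already gone. */
    static boolean dropIfExists(Path path) throws IOException {
        // Files.deleteIfExists does not throw when the path is missing.
        return Files.deleteIfExists(path);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cube_test");
        System.out.println(dropIfExists(dir)); // first call removes the empty directory
        System.out.println(dropIfExists(dir)); // second call finds nothing and does not fail
    }
}
```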
[jira] [Created] (KYLIN-3091) A problem about retention rate analyze
WangSheng created KYLIN-3091:

Summary: A problem about retention rate analyze
Key: KYLIN-3091
URL: https://issues.apache.org/jira/browse/KYLIN-3091
Project: Kylin
Issue Type: Bug
Components: Query Engine
Affects Versions: v2.0.0
Environment: hbase 0.98.8-hadoop2
Reporter: WangSheng
Assignee: liyang

I found that Kylin supports a retention rate analysis function, so I ran some tests on it. The following SQL executed successfully:
{code:java}
select city, version,
       intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday,
       intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday
from visit_log
where dt in ('20161014', '20161015', '20161016')
group by city, version
{code}
However, other SQL statements such as these failed:
{code:java}
select city,
       intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday
from visit_log
where dt in ('20161014', '20161015')
group by city, version

select city, version,
       intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as retention_twoday
from visit_log
where dt in ('20161014', '20161015', '20161016')
group by city, version
{code}
This means I cannot use just one intersect_count UDAF in a SQL statement; it needs at least two intersect_count calls.
My kylin version is kylin 2.0.0-hbase 0.98.8, and here is the error log:
{code:java}
Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.apache.kylin.query.relnode.ColumnRowType.getColumnByIndex(ColumnRowType.java:49)
    at org.apache.kylin.query.relnode.OLAPAggregateRel.fillbackOptimizedColumn(OLAPAggregateRel.java:396)
    at org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:347)
    at org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:283)
    at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
    at org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:107)
    at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
    at org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:100)
    at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108)
    at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92)
    at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1248)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:306)
    at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:203)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:776)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632)
    at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602)
    at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
    at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:595)
    at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
    at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
{code}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
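For readers unfamiliar with the retention queries in this report, the sketch below shows what they appear to compute, assuming intersect_count(uuid, dt, array[...]) counts the distinct uuid values that occur on every listed dt value. It is a plain in-memory illustration, not Kylin's bitmap-based implementation. `RetentionSketch` is a hypothetical name and the ids u1 to u5 are made-up sample data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the assumed intersect_count semantics.
class RetentionSketch {
    /** Counts the ids that appear on every one of the listed days. */
    static int intersectCount(Map<String, Set<String>> idsByDay, List<String> days) {
        Set<String> result = null;
        for (String day : days) {
            Set<String> ids = idsByDay.getOrDefault(day, Collections.<String>emptySet());
            if (result == null) {
                result = new HashSet<>(ids); // start from the first day's ids
            } else {
                result.retainAll(ids);       // keep only ids also seen on this day
            }
        }
        return result == null ? 0 : result.size();
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<String, Set<String>> idsByDay = new HashMap<>();
        idsByDay.put("20161014", new HashSet<>(Arrays.asList("u1", "u2", "u3")));
        idsByDay.put("20161015", new HashSet<>(Arrays.asList("u2", "u3", "u4")));
        idsByDay.put("20161016", new HashSet<>(Arrays.asList("u3", "u5")));

        // retention_oneday: ids seen on both 20161014 and 20161015
        System.out.println(intersectCount(idsByDay, Arrays.asList("20161014", "20161015"))); // 2
        // retention_twoday: ids seen on all three days
        System.out.println(intersectCount(idsByDay, Arrays.asList("20161014", "20161015", "20161016"))); // 1
    }
}
```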
[jira] [Created] (KYLIN-2399) CubeSegmentScanner generated inaccurate
WangSheng created KYLIN-2399:

Summary: CubeSegmentScanner generated inaccurate
Key: KYLIN-2399
URL: https://issues.apache.org/jira/browse/KYLIN-2399
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
Fix For: Future

My project has three segments: 2016060100_2016060200, 2016060200_2016060300, 2016060300_2016060400

When I used a filter condition like this:
day>='2016-06-01' and day<'2016-06-02'
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests was empty!

When I changed the filter condition to this:
day>='2016-06-01' and day<='2016-06-02'
Kylin still generated three CubeSegmentScanners, but the last CubeSegmentScanner's GTScanRequest was empty!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
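The report implies that a filter should touch only the segments whose time range it overlaps. A minimal sketch of that pruning rule follows, assuming each segment name encodes a half-open range [start, end) at hour granularity (so 2016060100_2016060200 covers 2016-06-01 only) and that `day` is a date column. `SegmentPruningSketch` and its methods are hypothetical names; this is not how Kylin builds CubeSegmentScanner or GTScanRequest.

```java
import java.time.LocalDate;

// Hypothetical sketch of overlap-based segment pruning.
class SegmentPruningSketch {
    // The three segments from the report, read as [start, end) date pairs (an assumption).
    static final String[][] SEGMENTS = {
        {"2016-06-01", "2016-06-02"},
        {"2016-06-02", "2016-06-03"},
        {"2016-06-03", "2016-06-04"},
    };

    /** Two half-open ranges overlap when each one starts before the other ends. */
    static boolean overlaps(LocalDate segStart, LocalDate segEnd,
                            LocalDate fromInclusive, LocalDate toExclusive) {
        return segStart.isBefore(toExclusive) && fromInclusive.isBefore(segEnd);
    }

    /** Counts the segments a filter [fromInclusive, toExclusive) would need to scan. */
    static int countScanned(String fromInclusive, String toExclusive) {
        LocalDate from = LocalDate.parse(fromInclusive);
        LocalDate to = LocalDate.parse(toExclusive);
        int count = 0;
        for (String[] seg : SEGMENTS) {
            if (overlaps(LocalDate.parse(seg[0]), LocalDate.parse(seg[1]), from, to)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // day >= '2016-06-01' and day < '2016-06-02'
        System.out.println(countScanned("2016-06-01", "2016-06-02")); // 1
        // day >= '2016-06-01' and day <= '2016-06-02', i.e. exclusive upper bound 2016-06-03
        System.out.println(countScanned("2016-06-01", "2016-06-03")); // 2
    }
}
```

Under these assumptions the first filter needs one segment and the second needs two, whereas the report observed three scanners in both cases.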
[jira] [Created] (KYLIN-2398) CubeSegmentScanner generated inaccurate
WangSheng created KYLIN-2398:

Summary: CubeSegmentScanner generated inaccurate
Key: KYLIN-2398
URL: https://issues.apache.org/jira/browse/KYLIN-2398
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
Fix For: Future

My project has three segments: 2016060100_2016060200, 2016060200_2016060300, 2016060300_2016060400

When I used a filter condition like this:
day>='2016-06-01' and day<'2016-06-02'
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests was empty!

When I changed the filter condition to this:
day>='2016-06-01' and day<='2016-06-02'
Kylin still generated three CubeSegmentScanners, but the last CubeSegmentScanner's GTScanRequest was empty!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-2397) CubeSegmentScanner generated inaccurate
WangSheng created KYLIN-2397:

Summary: CubeSegmentScanner generated inaccurate
Key: KYLIN-2397
URL: https://issues.apache.org/jira/browse/KYLIN-2397
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
Fix For: Future

My project has three segments: 2016060100_2016060200, 2016060200_2016060300, 2016060300_2016060400

When I used a filter condition like this:
day>='2016-06-01' and day<'2016-06-02'
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests was empty!

When I changed the filter condition to this:
day>='2016-06-01' and day<='2016-06-02'
Kylin still generated three CubeSegmentScanners, but the last CubeSegmentScanner's GTScanRequest was empty!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-2264) Date error when use new streaming cube in Kylin1.6.0
WangSheng created KYLIN-2264:

Summary: Date error when use new streaming cube in Kylin1.6.0
Key: KYLIN-2264
URL: https://issues.apache.org/jira/browse/KYLIN-2264
Project: Kylin
Issue Type: Bug
Components: streaming, Web
Affects Versions: v1.6.0
Environment: Debian 3.2.54-2 x86_64 GNU/Linux
Reporter: WangSheng
Assignee: Zhong,Jason

I installed Kylin 1.6.0 and built a streaming cube successfully, but I found two problems that I did not encounter in Kylin 1.5.*.

First, the segments' start/end times displayed on Kylin Web are 8 hours earlier than my PC's time, while the streaming cube's Last Build Time and Create Time displayed on Kylin Web match my PC's time. Maybe something goes wrong when Kylin Web converts the segments' start/end timestamps into dates, but I'm not sure.

Second, I ran a SQL query against the streaming cube, and the time-related columns in the results, such as "HOUR_START" and "MINUTE_START", are all 8 hours earlier than my PC's time. By remote debugging I found that the timestamps of these columns read from HBase are correct, so I guess something goes wrong when the Kylin server converts these timestamps into dates.

By the way, the only change I made was setting "kylin.rest.timezone=GM+8" in the file "kylin.properties", and my PC's date is the same as my server's date.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
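One thing worth checking for an 8-hour shift like this is the time zone ID itself. `java.util.TimeZone.getTimeZone` silently returns GMT for an ID it cannot parse, and "GM+8" (as quoted in the report) is not a valid ID, while "GMT+8" is. The sketch below demonstrates only that JDK behavior. Whether Kylin resolves `kylin.rest.timezone` through this call, and whether "GM+8" is the real configured value or a typo in the report, are assumptions; `TimeZoneIdCheck` is a hypothetical helper name.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper showing how the JDK treats an unrecognized time zone ID.
class TimeZoneIdCheck {
    /** Raw UTC offset, in hours, of the zone the JDK resolves for the given ID. */
    static int offsetHours(String zoneId) {
        // Unknown IDs do not raise an error; getTimeZone falls back to GMT.
        return TimeZone.getTimeZone(zoneId).getRawOffset() / 3_600_000;
    }

    /** Formats an epoch timestamp in the zone resolved for the given ID. */
    static String format(long epochMillis, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long epoch = 0L; // 1970-01-01 00:00:00 UTC
        System.out.println(offsetHours("GM+8") + " -> " + format(epoch, "GM+8"));   // 0 -> 1970-01-01 00:00:00
        System.out.println(offsetHours("GMT+8") + " -> " + format(epoch, "GMT+8")); // 8 -> 1970-01-01 08:00:00
    }
}
```

If the configured value really is "GM+8", timestamps formatted this way would come out 8 hours behind a GMT+8 wall clock, which matches the symptom described, though the report alone does not confirm that this is the cause.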