[jira] [Created] (KYLIN-4802) “Build N-Dimension Cuboid” execute twice when using DistributedScheduler

2020-10-30 Thread WangSheng (Jira)
WangSheng created KYLIN-4802:


 Summary: “Build N-Dimension Cuboid” execute twice when using 
DistributedScheduler
 Key: KYLIN-4802
 URL: https://issues.apache.org/jira/browse/KYLIN-4802
 Project: Kylin
  Issue Type: Bug
Affects Versions: v2.6.6
Reporter: WangSheng
 Attachments: kylin01.png, kylin02.png, kylin03.png

I ran into a problem when using DistributedScheduler on a two-node cluster 
running version 2.6.6. While executing the "Build N-Dimension Cuboid : level 4" 
step, I found that the step submitted an MR job on both nodes. One node 
submitted the MR job first and then moved on to the following steps. While it 
was executing the "Convert Cuboid Data to HFile" step, the other node submitted 
an MR job for the "Build N-Dimension Cuboid : level 4" step again. This caused 
data to go missing when the file was generated, and after the job completed, 
queries on level 4 returned empty results.
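
The report does not identify the cause, so the following is only a minimal sketch of the invariant that appears to be violated: a step should be claimed by exactly one node before it submits an MR job. The class `StepClaimSketch`, its methods, and the in-memory map standing in for a store shared by both nodes are all hypothetical and are not Kylin's DistributedScheduler code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical illustration only; not taken from Kylin. */
final class StepClaimSketch {
    // Stand-in for a store visible to every node (assumption: some shared lock service).
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    /** Atomically claims a step; only the first caller for a given step gets true. */
    boolean tryClaim(String stepId, String nodeId) {
        return owners.putIfAbsent(stepId, nodeId) == null;
    }

    /** Returns the node that holds the claim, or null if the step is unclaimed. */
    String ownerOf(String stepId) {
        return owners.get(stepId);
    }

    public static void main(String[] args) {
        StepClaimSketch store = new StepClaimSketch();
        String step = "Build N-Dimension Cuboid : level 4";
        System.out.println(store.tryClaim(step, "node-1")); // true: node-1 may submit the MR job
        System.out.println(store.tryClaim(step, "node-2")); // false: node-2 must not submit again
    }
}
```

In the scenario described above, both nodes behaved as if their claim had succeeded.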



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KYLIN-4060) "Garbage Collection on HDFS" step failed because of hdfs path not exists

2019-06-27 Thread WangSheng (JIRA)
WangSheng created KYLIN-4060:


 Summary: "Garbage Collection on HDFS" step failed because of hdfs 
path not exists
 Key: KYLIN-4060
 URL: https://issues.apache.org/jira/browse/KYLIN-4060
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: v2.4.1
Reporter: WangSheng


We recently found a bug in the last job step, "Garbage Collection on HDFS", 
when using a streaming cube. The problem is as follows:

 
{code:java}
Drop HDFS path on FileSystem: "hdfs://kylin-cluster" 
HDFS path /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test/fact_distinct_columns not exists.

File /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test does not exist.
{code}
After checking the code and logs, I found that the main cause is:

 
 # A build job was submitted first, and at the "Update Cube Info" step its 
segment became "READY";
 # Kylin then automatically submitted a merge job that included the segment 
from step 1. The merge job finished quickly and deleted the HDFS paths of its 
input segments;
 # After the merge job finished, the build job continued with "Hive Cleanup" 
and "Garbage Collection on HBase", then failed at the last step because the 
HDFS path had already been deleted in step 2.

Our version is 2.4.x, and I'm not sure whether this bug has been fixed in the 
latest 2.6.x version. If not, please assign this Jira to me, thanks!
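
One possible direction, sketched under assumptions, is to make the cleanup tolerant of a path that another job has already removed. The sketch uses the local filesystem through `java.nio.file` as a stand-in for HDFS. The class `TolerantCleanupSketch` and its methods are hypothetical, and whether the real step should ignore a missing path is a decision for the maintainers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; the local filesystem stands in for HDFS. */
final class TolerantCleanupSketch {
    /** Deletes the path if present; returns false instead of failing when it is already gone. */
    static boolean dropIfExists(Path path) {
        try {
            return Files.deleteIfExists(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Drops the same empty directory twice, as the merge job and then the build job would. */
    static boolean[] demo() {
        try {
            Path segmentDir = Files.createTempDirectory("kylin-gc-sketch");
            boolean droppedByMerge = dropIfExists(segmentDir);
            boolean droppedByBuild = dropIfExists(segmentDir);
            return new boolean[] {droppedByMerge, droppedByBuild};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]); // true false
    }
}
```

The second call reports that nothing was deleted rather than raising an error, which is the behaviour the build job would need in the sequence described above.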





[jira] [Created] (KYLIN-3091) A problem about retention rate analyze

2017-12-08 Thread WangSheng (JIRA)
WangSheng created KYLIN-3091:


 Summary: A problem about retention rate analyze
 Key: KYLIN-3091
 URL: https://issues.apache.org/jira/browse/KYLIN-3091
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: v2.0.0
 Environment: hbase 0.98.8-hadoop2
Reporter: WangSheng
Assignee: liyang


I found that Kylin supports retention rate analysis, so I ran some tests on 
this function. The following SQL executed successfully:
{code:java}
select city, version,
intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday,
intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as 
retention_twoday
from visit_log
where dt in ('20161014', '20161015', '20161016')
group by city, version
{code}
But other SQL statements, like these, failed:
{code:java}
select city,
intersect_count(uuid, dt, array['20161014', '20161015']) as retention_oneday
from visit_log 
where dt in ('20161014', '20161015') 
group by city, version

select city, version,
intersect_count(uuid, dt, array['20161014', '20161015', '20161016']) as 
retention_twoday
from visit_log 
where dt in ('20161014', '20161015', '20161016') 
group by city, version
{code}
This means I cannot use just one intersect_count UDAF in a SQL statement; at 
least two intersect_count calls are needed. My Kylin version is 2.0.0 with 
HBase 0.98.8, and here is the error log:
{code:java}
Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.kylin.query.relnode.ColumnRowType.getColumnByIndex(ColumnRowType.java:49)
at org.apache.kylin.query.relnode.OLAPAggregateRel.fillbackOptimizedColumn(OLAPAggregateRel.java:396)
at org.apache.kylin.query.relnode.OLAPAggregateRel.buildRewriteFieldsAndMetricsColumns(OLAPAggregateRel.java:347)
at org.apache.kylin.query.relnode.OLAPAggregateRel.implementRewrite(OLAPAggregateRel.java:283)
at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
at org.apache.kylin.query.relnode.OLAPLimitRel.implementRewrite(OLAPLimitRel.java:107)
at org.apache.kylin.query.relnode.OLAPRel$RewriteImplementor.visitChild(OLAPRel.java:158)
at org.apache.kylin.query.relnode.OLAPToEnumerableConverter.implement(OLAPToEnumerableConverter.java:100)
at org.apache.calcite.adapter.enumerable.EnumerableRelImplementor.implementRoot(EnumerableRelImplementor.java:108)
at org.apache.calcite.adapter.enumerable.EnumerableInterpretable.toBindable(EnumerableInterpretable.java:92)
at org.apache.calcite.prepare.CalcitePrepareImpl$CalcitePreparingStmt.implement(CalcitePrepareImpl.java:1248)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:306)
at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:203)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:776)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632)
at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602)
at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:595)
at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:615)
at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:148)
{code}
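
For readers unfamiliar with the function, here is a minimal sketch of what intersect_count(uuid, dt, array[...]) is expected to compute: the number of distinct uuid values present on every listed dt value. The class `IntersectCountSketch`, its methods, and the sample visitors are hypothetical; this is plain set intersection, not Kylin's bitmap-based implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not Kylin's implementation. */
final class IntersectCountSketch {
    /** Counts the uuids that appear on every one of the given days. */
    static int intersectCount(Map<String, Set<String>> uuidsByDay, List<String> days) {
        Set<String> result = null;
        for (String day : days) {
            Set<String> visitors = uuidsByDay.getOrDefault(day, Collections.<String>emptySet());
            if (result == null) {
                result = new HashSet<>(visitors); // first day seeds the running intersection
            } else {
                result.retainAll(visitors);       // keep only uuids also seen on this day
            }
        }
        return result == null ? 0 : result.size();
    }

    /** Made-up sample data keyed by dt. */
    static Map<String, Set<String>> sampleVisits() {
        Map<String, Set<String>> visits = new HashMap<>();
        visits.put("20161014", new HashSet<>(Arrays.asList("u1", "u2", "u3")));
        visits.put("20161015", new HashSet<>(Arrays.asList("u2", "u3", "u4")));
        visits.put("20161016", new HashSet<>(Arrays.asList("u3", "u5")));
        return visits;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> visits = sampleVisits();
        // retention_oneday: uuids seen on both 20161014 and 20161015
        System.out.println(intersectCount(visits, Arrays.asList("20161014", "20161015"))); // 2
        // retention_twoday: uuids seen on all three days
        System.out.println(intersectCount(visits, Arrays.asList("20161014", "20161015", "20161016"))); // 1
    }
}
```

Each call is independent of the other, so a query with a single intersect_count should be answerable on its own.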





[jira] [Created] (KYLIN-2399) CubeSegmentScanner generated inaccurate

2017-01-15 Thread WangSheng (JIRA)
WangSheng created KYLIN-2399:


 Summary: CubeSegmentScanner generated inaccurate
 Key: KYLIN-2399
 URL: https://issues.apache.org/jira/browse/KYLIN-2399
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
 Fix For: Future


My project has three segments:
2016060100_2016060200,
2016060200_2016060300,
2016060300_2016060400

When I used a filter condition like this: day>='2016-06-01' and day<'2016-06-02',
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests 
was empty!

When I changed the filter condition to this: day>='2016-06-01' and 
day<='2016-06-02',
Kylin still generated three CubeSegmentScanners, but the last 
CubeSegmentScanner's GTScanRequest was empty!
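
As a rough sketch of the pruning one might expect, assume each segment covers a half-open day range [start, end), which is my reading of the segment names above. Under that assumption the first filter touches one segment and the second touches two. The class `SegmentPruneSketch` and its methods are hypothetical and not taken from Kylin.

```java
import java.time.LocalDate;

/** Hypothetical illustration only; assumes segments are half-open ranges [start, end). */
final class SegmentPruneSketch {
    // The three segments from the report, as {start, end} pairs.
    static final LocalDate[][] SEGMENTS = {
        {LocalDate.of(2016, 6, 1), LocalDate.of(2016, 6, 2)},
        {LocalDate.of(2016, 6, 2), LocalDate.of(2016, 6, 3)},
        {LocalDate.of(2016, 6, 3), LocalDate.of(2016, 6, 4)},
    };

    /** True if the segment can contain a day with day >= lo and day < hi (or day <= hi). */
    static boolean overlaps(LocalDate segStart, LocalDate segEnd,
                            LocalDate lo, LocalDate hi, boolean hiInclusive) {
        boolean startsBeforeFilterEnds = hiInclusive ? !segStart.isAfter(hi) : segStart.isBefore(hi);
        boolean endsAfterFilterStarts = segEnd.isAfter(lo);
        return startsBeforeFilterEnds && endsAfterFilterStarts;
    }

    /** Counts the segments that would need a scanner for the given filter. */
    static int scannersNeeded(LocalDate lo, LocalDate hi, boolean hiInclusive) {
        int count = 0;
        for (LocalDate[] segment : SEGMENTS) {
            if (overlaps(segment[0], segment[1], lo, hi, hiInclusive)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        LocalDate lo = LocalDate.of(2016, 6, 1);
        LocalDate hi = LocalDate.of(2016, 6, 2);
        System.out.println(scannersNeeded(lo, hi, false)); // 1 for day < '2016-06-02'
        System.out.println(scannersNeeded(lo, hi, true));  // 2 for day <= '2016-06-02'
    }
}
```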





[jira] [Created] (KYLIN-2398) CubeSegmentScanner generated inaccurate

2017-01-15 Thread WangSheng (JIRA)
WangSheng created KYLIN-2398:


 Summary: CubeSegmentScanner generated inaccurate
 Key: KYLIN-2398
 URL: https://issues.apache.org/jira/browse/KYLIN-2398
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
 Fix For: Future


My project has three segments:
2016060100_2016060200,
2016060200_2016060300,
2016060300_2016060400

When I used a filter condition like this: day>='2016-06-01' and day<'2016-06-02',
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests 
was empty!

When I changed the filter condition to this: day>='2016-06-01' and 
day<='2016-06-02',
Kylin still generated three CubeSegmentScanners, but the last 
CubeSegmentScanner's GTScanRequest was empty!





[jira] [Created] (KYLIN-2397) CubeSegmentScanner generated inaccurate

2017-01-15 Thread WangSheng (JIRA)
WangSheng created KYLIN-2397:


 Summary: CubeSegmentScanner generated inaccurate
 Key: KYLIN-2397
 URL: https://issues.apache.org/jira/browse/KYLIN-2397
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: v1.5.4.1
Reporter: WangSheng
Assignee: liyang
 Fix For: Future


My project has three segments:
2016060100_2016060200,
2016060200_2016060300,
2016060300_2016060400

When I used a filter condition like this: day>='2016-06-01' and day<'2016-06-02',
Kylin generated three CubeSegmentScanners, and none of their GTScanRequests 
was empty!

When I changed the filter condition to this: day>='2016-06-01' and 
day<='2016-06-02',
Kylin still generated three CubeSegmentScanners, but the last 
CubeSegmentScanner's GTScanRequest was empty!





[jira] [Created] (KYLIN-2264) Date error when use new streaming cube in Kylin1.6.0

2016-12-08 Thread WangSheng (JIRA)
WangSheng created KYLIN-2264:


 Summary: Date error when use new streaming cube in Kylin1.6.0
 Key: KYLIN-2264
 URL: https://issues.apache.org/jira/browse/KYLIN-2264
 Project: Kylin
  Issue Type: Bug
  Components: streaming, Web 
Affects Versions: v1.6.0
 Environment: Debian 3.2.54-2 x86_64 GNU/Linux
Reporter: WangSheng
Assignee: Zhong,Jason


I installed Kylin 1.6.0 and built a streaming cube successfully, but I found 
two problems that I did not encounter in Kylin 1.5.*.

First, the segments' start/end times displayed on Kylin Web are 8 hours 
earlier than my PC time, but the streaming cube's Last Build Time and Create 
Time displayed on Kylin Web match my PC time. Maybe something goes wrong when 
Kylin Web converts the segments' start/end timestamps into dates, but I'm not 
sure.

Second, I ran SQL queries against the streaming cube, but the records' 
time-related columns such as "HOUR_START" and "MINUTE_START" are all 8 hours 
earlier than my PC time. Through remote debugging I found that the timestamps 
of these columns in HBase are correct, so I guess something goes wrong when the 
Kylin server converts these timestamps into dates.

By the way, the only setting I changed is "kylin.rest.timezone=GM+8" in the 
file "kylin.properties", and my PC time is the same as my server time.
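
One thing worth checking, sketched below under assumptions: if the configured value is literally "GM+8" rather than "GMT+8", and if the value is resolved with Java's TimeZone.getTimeZone (I have not confirmed that Kylin does this), the ID is not understood and Java silently falls back to GMT. That would account for an 8-hour shift. The class `TimeZoneIdSketch` and its method are hypothetical; the fallback itself is documented JDK behaviour.

```java
import java.util.TimeZone;

/** Hypothetical illustration only; shows how the JDK treats the two IDs. */
final class TimeZoneIdSketch {
    /** Returns the raw UTC offset, in whole hours, that the JDK resolves for the given ID. */
    static int offsetHours(String zoneId) {
        // getTimeZone returns the GMT zone when the ID cannot be understood.
        return TimeZone.getTimeZone(zoneId).getRawOffset() / 3_600_000;
    }

    public static void main(String[] args) {
        System.out.println(offsetHours("GMT+8")); // 8
        System.out.println(offsetHours("GM+8"));  // 0, unrecognized ID falls back to GMT
    }
}
```

If "GM+8" above is only a typo in this message and the file really contains "GMT+8", this explanation does not apply.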


