[jira] [Updated] (KYLIN-2731) Introduce checkpoint executable

2017-08-24 Thread Zhong Yanghong (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-2731:
--
Affects Version/s: (was: v2.0.0)
   v2.2.0

> Introduce checkpoint executable
> ---
>
> Key: KYLIN-2731
> URL: https://issues.apache.org/jira/browse/KYLIN-2731
> Project: Kylin
>  Issue Type: Sub-task
>Affects Versions: v2.2.0
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KYLIN-2732) Introduce base cuboid as a new input for cubing job

2017-08-24 Thread Zhong Yanghong (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Yanghong updated KYLIN-2732:
--
Affects Version/s: (was: v2.0.0)
   v2.2.0

> Introduce base cuboid as a new input for cubing job
> ---
>
> Key: KYLIN-2732
> URL: https://issues.apache.org/jira/browse/KYLIN-2732
> Project: Kylin
>  Issue Type: Sub-task
>Affects Versions: v2.2.0
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>






[jira] [Commented] (KYLIN-2731) Introduce checkpoint executable

2017-08-24 Thread Zhong Yanghong (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16141110#comment-16141110
 ] 

Zhong Yanghong commented on KYLIN-2731:
---

https://github.com/apache/kylin/commit/7603a5aafa64e990e3f8a40c267f835d274aef61

> Introduce checkpoint executable
> ---
>
> Key: KYLIN-2731
> URL: https://issues.apache.org/jira/browse/KYLIN-2731
> Project: Kylin
>  Issue Type: Sub-task
>Affects Versions: v2.2.0
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>






[jira] [Updated] (KYLIN-2794) MultipleDictionaryValueEnumerator should output values in sorted order

2017-08-24 Thread 翟玉勇 (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

翟玉勇 updated KYLIN-2794:
---
Priority: Critical  (was: Minor)

> MultipleDictionaryValueEnumerator should output values in sorted order
> --
>
> Key: KYLIN-2794
> URL: https://issues.apache.org/jira/browse/KYLIN-2794
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.0.0
> Environment: hadoop hadoop-2.6.0-cdh5.8.2   hive 2.1 hbase 0.98
>Reporter: 翟玉勇
>Assignee: Dong Li
>Priority: Critical
>
> {code}
> 2017-08-18 14:17:48,828 ERROR [pool-11-thread-1] 
> threadpool.DistributedScheduler:188 : ExecuteException 
> job:8d031b5f-2d3f-445f-a62b-7bc560d919ea in server: **
> org.apache.kylin.job.exception.ExecuteException: 
> org.apache.kylin.job.exception.ExecuteException: 
> java.lang.IllegalStateException: Invalid input data. Unordered data cannot be 
> split into multi trees
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:134)
>   at 
> org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:185)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kylin.job.exception.ExecuteException: 
> java.lang.IllegalStateException: Invalid input data. Unordered data cannot be 
> split into multi trees
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:134)
>   at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>   ... 4 more
> Caused by: java.lang.IllegalStateException: Invalid input data. Unordered 
> data cannot be split into multi trees
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(TrieDictionaryForestBuilder.java:92)
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.addValue(TrieDictionaryForestBuilder.java:78)
>   at 
> org.apache.kylin.dict.DictionaryGenerator$StringTrieDictForestBuilder.addValue(DictionaryGenerator.java:212)
>   at 
> org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:79)
>   at 
> org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:64)
>   at 
> org.apache.kylin.dict.DictionaryGenerator.mergeDictionaries(DictionaryGenerator.java:104)
>   at 
> org.apache.kylin.dict.DictionaryManager.mergeDictionary(DictionaryManager.java:267)
>   at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.mergeDictionaries(MergeDictionaryStep.java:146)
>   at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.makeDictForNewSegment(MergeDictionaryStep.java:136)
>   at 
> org.apache.kylin.engine.mr.steps.MergeDictionaryStep.doWork(MergeDictionaryStep.java:68)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
>   ... 6 more
> {code}
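The trace above shows TrieDictionaryForestBuilder rejecting values that arrive out of order during a dictionary merge. As a rough illustration of what "output values in sorted order" could mean when several already-sorted sources are combined, here is a minimal k-way merge sketch. The names SortedMergeSketch and mergeSorted are hypothetical and are not Kylin's MultipleDictionaryValueEnumerator; the sketch also assumes plain String values compared with String.compareTo, whereas real dictionaries may compare encoded bytes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: merge several individually sorted value lists
// into one globally sorted, duplicate-free list.
final class SortedMergeSketch {

    // One pending value together with the iterator it was read from.
    private static final class Head implements Comparable<Head> {
        final String value;
        final Iterator<String> rest;

        Head(String value, Iterator<String> rest) {
            this.value = value;
            this.rest = rest;
        }

        @Override
        public int compareTo(Head other) {
            return value.compareTo(other.value);
        }
    }

    static List<String> mergeSorted(List<List<String>> sortedSources) {
        PriorityQueue<Head> heap = new PriorityQueue<>();
        // Seed the heap with the first value of every non-empty source.
        for (List<String> source : sortedSources) {
            Iterator<String> it = source.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Head smallest = heap.poll();
            // Equal values are adjacent in merged order, so comparing with
            // the last emitted value is enough to drop duplicates.
            if (out.isEmpty() || !out.get(out.size() - 1).equals(smallest.value)) {
                out.add(smallest.value);
            }
            if (smallest.rest.hasNext()) {
                heap.add(new Head(smallest.rest.next(), smallest.rest));
            }
        }
        return out;
    }
}
```

Simply concatenating the sources would only be sorted if the sources did not overlap; the heap keeps the output ordered regardless of how the value ranges interleave.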





[jira] [Updated] (KYLIN-2813) kylin jdbc with mybatis

2017-08-24 Thread JunQiangZhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JunQiangZhang updated KYLIN-2813:
-
Description: 
mapping file:



AND a.username=#{userName}


AND a.apmtype=#{apmType}


AND a.os=#{os}


AND a.appversion=#{appVersion}

ORDER BY a.recordtime DESC



rowCount and offSet are passed in as bound parameters. When I run a query through this mapping,
it fails with:
2017-08-25 11:09:14,739 ERROR [Query c31e532a-e6c6-4af6-b5c8-7015a6082326-526] 
service.QueryService:382 : Exception when execute sql
java.sql.SQLException: Error while preparing statement [SELECT a.username, 
a.apmtype, a.appversion,
  a.os, a.recordtime
  FROM HOLMES.SRC_LOG_YANXUAN_APM AS a
  WHERE a.recorddate=?
  AND a.appid='11'
 
 
 
AND a.apmtype=?
 
 
AND a.os=?
 
 
ORDER BY a.recordtime DESC
 
  LIMIT ?
  OFFSET ?]
at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:203)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:185)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:86)
at 
org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:165)
at 
org.apache.kylin.rest.service.QueryService.execute(QueryService.java:551)
at 
org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:466)
at 
org.apache.kylin.rest.service.QueryService.query(QueryService.java:153)
at 
org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:357)
at 
org.apache.kylin.rest.controller.QueryController.prepareQuery(QueryController.java:76)

at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: parse failed: Encountered "LIMIT ?" at 
line 17, column 11.
Was expecting one of:
 
"LIMIT"  ...
"LIMIT"  ...
"LIMIT"  ...
"LIMIT" "ALL" ...
"OFFSET" ...
"FETCH" ...
"," ...
"NULLS" ...

at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:750)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:196)
... 79 more
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "LIMIT 
?" at line 17, column 11.
Was expecting one of:
 
... 83 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered 
"LIMIT ?" at line 17, column 11.
Was expecting one of:
 
"LIMIT"  ...
"LIMIT"  ...
"LIMIT"  ...
"LIMIT" "ALL" ...
"OFFSET" ...
"FETCH" ...
"," ...
"NULLS" ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:21455)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:21278)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:843)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:185)

It seems the statement cannot get past conn.prepareStatement in QueryService.execute.

If I set LIMIT and OFFSET to constant values, it passes.
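Building on the observation that constant LIMIT and OFFSET values pass, one possible workaround is to render the two paging values as integer literals while leaving the other parameters bound. The sketch below is only an illustration under that assumption; LimitInlineSketch and withPaging are hypothetical names and not part of Kylin, Calcite, or MyBatis. In a MyBatis mapping the same effect is usually obtained with ${...} string substitution instead of #{...} parameter binding, which is only safe when the values are known to be integers.

```java
// Hypothetical sketch: append LIMIT/OFFSET as literals instead of "?" placeholders.
final class LimitInlineSketch {

    static String withPaging(String baseSql, int rowCount, int offset) {
        // Taking int arguments means no arbitrary text can be injected into
        // the SQL; negative values are rejected because they are not valid paging.
        if (rowCount < 0 || offset < 0) {
            throw new IllegalArgumentException("rowCount and offset must be non-negative");
        }
        return baseSql + " LIMIT " + rowCount + " OFFSET " + offset;
    }
}
```

For example, withPaging("SELECT a.os FROM t AS a ORDER BY a.recordtime DESC", 10, 20) yields a statement ending in "LIMIT 10 OFFSET 20", which can then be prepared with only the remaining "?" parameters.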



[jira] [Created] (KYLIN-2813) kylin jdbc with mybatis

2017-08-24 Thread JunQiangZhang (JIRA)
JunQiangZhang created KYLIN-2813:


 Summary: kylin jdbc with mybatis
 Key: KYLIN-2813
 URL: https://issues.apache.org/jira/browse/KYLIN-2813
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: v2.0.0
 Environment: kylin version:2.0.0
org.mybatis.spring.boot:1.3.0
Reporter: JunQiangZhang
Assignee: liyang


mapping file:



AND a.username=#{userName}


AND a.apmtype=#{apmType}


AND a.os=#{os}


AND a.appversion=#{appVersion}

ORDER BY a.recordtime DESC



rowCount and offSet are passed in as bound parameters. When I run a query through this mapping,
it fails with:
2017-08-25 11:09:14,739 ERROR [Query c31e532a-e6c6-4af6-b5c8-7015a6082326-526] 
service.QueryService:382 : Exception when execute sql
java.sql.SQLException: Error while preparing statement [SELECT a.username, 
a.apmtype, a.appversion,
  a.os, a.recordtime
  FROM HOLMES.SRC_LOG_YANXUAN_APM AS a
  WHERE a.recorddate=?
  AND a.appid='11'
 
 
 
AND a.apmtype=?
 
 
AND a.os=?
 
 
ORDER BY a.recordtime DESC
 
  LIMIT ?
  OFFSET ?]
at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:203)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:185)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:86)
at 
org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:165)
at 
org.apache.kylin.rest.service.QueryService.execute(QueryService.java:551)
at 
org.apache.kylin.rest.service.QueryService.queryWithSqlMassage(QueryService.java:466)
at 
org.apache.kylin.rest.service.QueryService.query(QueryService.java:153)
at 
org.apache.kylin.rest.service.QueryService.doQueryWithCache(QueryService.java:357)
at 
org.apache.kylin.rest.controller.QueryController.prepareQuery(QueryController.java:76)

at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: parse failed: Encountered "LIMIT ?" at 
line 17, column 11.
Was expecting one of:
 
"LIMIT"  ...
"LIMIT"  ...
"LIMIT"  ...
"LIMIT" "ALL" ...
"OFFSET" ...
"FETCH" ...
"," ...
"NULLS" ...

at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:750)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:632)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:602)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
at 
org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:196)
... 79 more
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "LIMIT 
?" at line 17, column 11.
Was expecting one of:
 
... 83 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered 
"LIMIT ?" at line 17, column 11.
Was expecting one of:
 
"LIMIT"  ...
"LIMIT"  ...
"LIMIT"  ...
"LIMIT" "ALL" ...
"OFFSET" ...
"FETCH" ...
"," ...
"NULLS" ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:21455)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:21278)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:843)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:185)

It seems the statement cannot get past conn.prepareStatement in QueryService.execute.







[jira] [Assigned] (KYLIN-2811) Refine Spark Cubing to reduce serialization's overhead

2017-08-24 Thread Wang Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Cheng reassigned KYLIN-2811:
-

Assignee: Wang Cheng

> Refine Spark Cubing to reduce serialization's overhead
> --
>
> Key: KYLIN-2811
> URL: https://issues.apache.org/jira/browse/KYLIN-2811
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Wang Cheng
>Assignee: Wang Cheng
>Priority: Minor
>
> In Spark Cubing, many variables are defined in the driver and used in closures,
> which causes extra serialization overhead.
> Meanwhile, remove the method that reads KylinConfig from HDFS.





[jira] [Created] (KYLIN-2812) Save to wrong database when loading Kafka Topic

2017-08-24 Thread Billy Liu (JIRA)
Billy Liu created KYLIN-2812:


 Summary: Save to wrong database when loading Kafka Topic 
 Key: KYLIN-2812
 URL: https://issues.apache.org/jira/browse/KYLIN-2812
 Project: Kylin
  Issue Type: Bug
Reporter: Billy Liu
Assignee: Billy Liu
Priority: Minor
 Fix For: v2.2.0


When loading a Kafka topic, the user can select the destination database name.
Currently, all topics are saved into the DEFAULT database, which is not expected.
There is a bug in reading the create table parameter.





[jira] [Created] (KYLIN-2811) Refine Spark Cubing to reduce serialization's overhead

2017-08-24 Thread Wang Cheng (JIRA)
Wang Cheng created KYLIN-2811:
-

 Summary: Refine Spark Cubing to reduce serialization's overhead
 Key: KYLIN-2811
 URL: https://issues.apache.org/jira/browse/KYLIN-2811
 Project: Kylin
  Issue Type: Improvement
Reporter: Wang Cheng
Priority: Minor


In Spark Cubing, many variables are defined in the driver and used in closures,
which causes extra serialization overhead.

Meanwhile, remove the method that reads KylinConfig from HDFS.





[jira] [Commented] (KYLIN-2286) global snapshot table for one cube

2017-08-24 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139845#comment-16139845
 ] 

Billy Liu commented on KYLIN-2286:
--

I think this is a Slowly Changing Dimension (SCD) topic. For more info, see
http://datawarehouse4u.info/SCD-Slowly-Changing-Dimensions.html
The issue becomes more complicated when dealing with both normal dimensions and
derived dimensions in a lookup table.

To support Type 0 (the passive method), use a normal dimension in the lookup table.
To support Type 1 (overwriting the old value), use a derived dimension in the lookup
table.

Supporting Type 0 with a derived dimension would use a global snapshot table, as
proposed in this issue.
This feature is welcome; it would make Kylin more flexible.

> global snapshot table for one cube 
> ---
>
> Key: KYLIN-2286
> URL: https://issues.apache.org/jira/browse/KYLIN-2286
> Project: Kylin
>  Issue Type: Improvement
>Reporter: fengYu
>Assignee: fengYu
>
> In the current version, Kylin builds a snapshot table for each segment, and the
> snapshots are isolated from each other within the same cube, even though some
> segments share the same snapshot table storage.
> In some scenarios, we need a global snapshot table for one cube. For example, we
> have a cube with a snapshot table where ID is the PK. On the first day, the table
> looks like:
> id name
> 1   A
> 2   B
> 3   C
> The query 'select name, count(1) from fact join dimension group by name' returns:
> A xx
> B xx
> C xx
> The next day (a new segment), the lookup table is modified and looks like:
> id name
> 1   A
> 2   D
> 3   E
> The same query returns:
> A xx
> B xx
> C xx
> D xx
> E xx
> However, B and D share the same ID, as do C and E, and we need the newest result.
> So a global snapshot table, shared by all segments and always holding the newest
> values, is needed.
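The difference described above can be reproduced with a small, self-contained sketch. It is a toy model only and says nothing about how Kylin stores snapshots: SnapshotSketch, Segment and countByName are hypothetical names, each segment is assumed to hold fact row counts keyed by dimension id plus the lookup snapshot taken when it was built, and the "global" snapshot is assumed to be simply the one from the newest segment.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy model of per-segment versus global lookup snapshots.
final class SnapshotSketch {

    static final class Segment {
        final Map<Integer, Long> factCountsById; // fact rows per dimension id
        final Map<Integer, String> snapshot;     // id -> name at build time

        Segment(Map<Integer, Long> factCountsById, Map<Integer, String> snapshot) {
            this.factCountsById = factCountsById;
            this.snapshot = snapshot;
        }
    }

    // Emulates "select name, count(1) ... group by name" across all segments.
    static Map<String, Long> countByName(List<Segment> segments, boolean useGlobalSnapshot) {
        Map<String, Long> result = new TreeMap<>();
        if (segments.isEmpty()) {
            return result;
        }
        // Assumption: the newest segment carries the newest lookup values.
        Map<Integer, String> newest = segments.get(segments.size() - 1).snapshot;
        for (Segment segment : segments) {
            Map<Integer, String> lookup = useGlobalSnapshot ? newest : segment.snapshot;
            for (Map.Entry<Integer, Long> fact : segment.factCountsById.entrySet()) {
                String name = lookup.get(fact.getKey());
                if (name != null) { // inner-join semantics: skip ids with no lookup row
                    result.merge(name, fact.getValue(), Long::sum);
                }
            }
        }
        return result;
    }
}
```

With the two days of data from the description, per-segment snapshots produce groups A, B, C, D and E, while the global snapshot produces only A, D and E, with the older segment's rows for ids 2 and 3 counted under the new names.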





[jira] [Commented] (KYLIN-2810) Kylin UDF support

2017-08-24 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139740#comment-16139740
 ] 

Billy Liu commented on KYLIN-2810:
--

Thanks [~feng_xiao_yu], UDF extensions are very welcome.

> Kylin UDF support
> -
>
> Key: KYLIN-2810
> URL: https://issues.apache.org/jira/browse/KYLIN-2810
> Project: Kylin
>  Issue Type: New Feature
>Reporter: fengYu
>Assignee: fengYu
>
> Kylin does not support some functions that Calcite does not support. May I contribute
> some UDFs to Kylin? That way, some of our BI tools could use Kylin everywhere.





[jira] [Assigned] (KYLIN-2810) Kylin UDF support

2017-08-24 Thread Billy Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billy Liu reassigned KYLIN-2810:


  Assignee: fengYu
Issue Type: New Feature  (was: Bug)

> Kylin UDF support
> -
>
> Key: KYLIN-2810
> URL: https://issues.apache.org/jira/browse/KYLIN-2810
> Project: Kylin
>  Issue Type: New Feature
>Reporter: fengYu
>Assignee: fengYu
>
> Kylin does not support some functions that Calcite does not support. May I contribute
> some UDFs to Kylin? That way, some of our BI tools could use Kylin everywhere.





[jira] [Created] (KYLIN-2810) Kylin UDF support

2017-08-24 Thread fengYu (JIRA)
fengYu created KYLIN-2810:
-

 Summary: Kylin UDF support
 Key: KYLIN-2810
 URL: https://issues.apache.org/jira/browse/KYLIN-2810
 Project: Kylin
  Issue Type: Bug
Reporter: fengYu


Kylin does not support some functions that Calcite does not support. May I contribute
some UDFs to Kylin? That way, some of our BI tools could use Kylin everywhere.





[jira] [Commented] (KYLIN-2286) global snapshot table for one cube

2017-08-24 Thread fengYu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139716#comment-16139716
 ] 

fengYu commented on KYLIN-2286:
---

What do you think about this feature? If it meets a need, I can share our
implementation. Thanks a lot.

> global snapshot table for one cube 
> ---
>
> Key: KYLIN-2286
> URL: https://issues.apache.org/jira/browse/KYLIN-2286
> Project: Kylin
>  Issue Type: Improvement
>Reporter: fengYu
>Assignee: fengYu
>





[jira] [Closed] (KYLIN-1890) support hbase table prefix configurable

2017-08-24 Thread fengYu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fengYu closed KYLIN-1890.
-
Resolution: Won't Fix

> support hbase table prefix configurable
> ---
>
> Key: KYLIN-1890
> URL: https://issues.apache.org/jira/browse/KYLIN-1890
> Project: Kylin
>  Issue Type: Improvement
>  Components: General
>Affects Versions: v1.5.2
>Reporter: fengYu
>Assignee: fengYu
> Attachments: 
> 0001-KYLIN-1890-support-hbase-table-prefix-configurable.patch
>
>
> Sometimes we need to deploy two Kylin environments on the same HBase. I want to
> make the HBase table name prefix configurable for two reasons:
> 1. Different Kylin environments will generate the same table names.
> 2. Cleaning up invalid HTables for one environment will delete all tables
> belonging to the other environment.
> Having different Kylin environments use different namespaces is also acceptable.





[jira] [Closed] (KYLIN-1172) kylin support multi-hive on different hadoop cluster

2017-08-24 Thread fengYu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fengYu closed KYLIN-1172.
-
Resolution: Won't Fix

> kylin support multi-hive on different hadoop cluster
> 
>
> Key: KYLIN-1172
> URL: https://issues.apache.org/jira/browse/KYLIN-1172
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v1.0
>Reporter: fengYu
>Assignee: fengYu
> Attachments: 0001-kerberos.patch, 
> 0001-support-more-hives-depend-on-different-hadoop-add-co.patch, 
> 0002-hadoop-jar-files.patch, 
> 0003-git-common-package-part-patch-KYLIN-1172.patch, 
> 0004-git-cube-package-part-patch-KYLIN-1172.patch, 
> 0005-git-metadata-package-part-patch-KYLIN-1172.patch, 
> 0006-git-server-package-part-patch-KYLIN-1172.patch, 
> 0007-dictionary-package-part-patch-KYLIN-1172.patch, 
> 0008-job-package-part-patch-KYLIN-1172.patch
>
>
> Hi, I recently modified Kylin to support multiple Hives on different Hadoop
> clusters and use them as input sources for Kylin. We did this for the
> following reasons:
> 1. We have more than one Hadoop cluster and many Hives depend on them (a product
> may have its own Hive). We cannot migrate those Hives into one and don't want to
> deploy one Kylin for every Hive source.
> 2. Our Hadoop clusters are deployed in different DCs, and we need to support them
> in one Kylin instance.
> 3. Source data in Hive is much smaller than the HFiles, so copying the source files
> across DCs is more efficient (the fact distinct column job and the base
> cuboid job need to take data in Hive as input). So we deploy HBase and Hadoop
> in one DC (separated into different HDFS instances).
> So we divide the data flow into 3 parts: Hive is the input source, Hadoop does the
> computing, which generates many temporary files, and HBase is the output. After
> cube building, queries on Kylin interact only with HBase. Therefore, what
> we need to solve is how to build cubes based on different Hives and Hadoops.
> Our method is summarized below:
> 1. Deploy the Hives and Hadoops. Before starting Kylin, the user should deploy all
> Hives and Hadoops, and make sure Hive SQL can be run in ./hive and every HDFS can
> be accessed with the 'hadoop  fs  ' command (add more nameservices in hdfs-site.xml).
> 2. Divide the Hives into two groups: the Hive used when Kylin starts (we call it the
> default one) and the others, which are additional. We allocate a name to every
> Hive (the default one's name is null). For simplicity, we just add a config property
> that gives the root directory of all Hive clients; every Hive client is a
> directory whose name is the Hive name (the default one does not need to be located there).
> 3. Attach exactly one Hive to each project. When creating a project, you
> specify a Hive name, and from it we can find the Hive client (including the
> hive command and config files).
> 4. When loading a table in a project, find the hive-site.xml and create a
> HiveClient using this config file.
> 5. HCatInputFormat cannot be used as the inputFormat in FactDistinctColumnsJob, so
> we changed the job to take the intermediate Hive table location as the input file
> and changed FactDistinctColumnsMapper. HiveColumnCardinalityJob will fail if
> an additional Hive is used.
> 6. Because we need to run MR in one Hadoop cluster with input or output located
> on another HDFS, we set the input location to the real name node address
> instead of the name service (this is a config property too).
> That is all we did. I think it makes it easier to manage more
> than one Hive and Hadoop. We have applied it in our environment and it works well.
> I hope it can help other people...
> Patch uploaded. Notes:
> 1. Add two config properties.
> 2. Add hivename to projectInstance and persist the cube's projectName in
> HBase.
> 3. Create HiveClient with a hive-site.xml file, or use the default one on the
> Kylin classpath.
> 4. Modify two Hadoop jobs, FactDistinctColumnsJob and CuboidJob: take the
> intermediate table name as input and change it to the table location in run().
> 5. Translate the nameservice to the master name node when accessing data located in
> another Hadoop, if necessary.
> The patch is based on 1.0-incubating, and we applied patches KYLIN-1014, KYLIN-1021
> and KYLIN-957 in that order.





[jira] [Commented] (KYLIN-2804) Unchecked return value from Input#read() in PercentileCounterSerializer

2017-08-24 Thread zhengdong (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139705#comment-16139705
 ] 

zhengdong commented on KYLIN-2804:
--

Not surprisingly, the input.read(buffer) method cannot guarantee that the number of
bytes read exactly equals the buffer length.
However, the byte array underlying the input instance was filled in the
PercentileCounterSerializer.write method, so its length should equal the
buffer length.

> Unchecked return value from Input#read() in PercentileCounterSerializer
> ---
>
> Key: KYLIN-2804
> URL: https://issues.apache.org/jira/browse/KYLIN-2804
> Project: Kylin
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: zhengdong
>Priority: Minor
>
> {code}
> byte[] buffer = new byte[length];
> input.read(buffer);
> {code}
> length bytes are allocated.
> The return value from read() should be checked against length.
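A common way to check the return value is to loop until the requested number of bytes has arrived and fail loudly on a short stream. The sketch below shows that pattern with a plain java.io.InputStream, which is an assumption made to keep it self-contained; the code in PercentileCounterSerializer uses a different Input type, and ReadFullySketch and readExactly are hypothetical names.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch: read exactly `length` bytes or throw.
final class ReadFullySketch {

    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int filled = 0;
        while (filled < length) {
            // read() may return fewer bytes than requested, so keep asking
            // for the remainder until the buffer is full.
            int n = in.read(buffer, filled, length - filled);
            if (n < 0) {
                throw new EOFException("expected " + length + " bytes but got " + filled);
            }
            filled += n;
        }
        return buffer;
    }
}
```

For a plain InputStream, java.io.DataInputStream.readFully offers the same guarantee out of the box.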



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KYLIN-2809) Support operator "+" as string concat operator

2017-08-24 Thread Roger Shi (JIRA)
Roger Shi created KYLIN-2809:


 Summary: Support operator "+" as string concat operator
 Key: KYLIN-2809
 URL: https://issues.apache.org/jira/browse/KYLIN-2809
 Project: Kylin
  Issue Type: Improvement
Reporter: Roger Shi


Tableau only supports "+" as the string concat operator. Supporting it will improve
Tableau compatibility.


