[jira] [Created] (KYLIN-4749) Add isNeedMaterialize() for TableDesc

2020-09-04 Thread Zhong Yanghong (Jira)
Zhong Yanghong created KYLIN-4749:
-

 Summary: Add isNeedMaterialize() for TableDesc
 Key: KYLIN-4749
 URL: https://issues.apache.org/jira/browse/KYLIN-4749
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhong Yanghong
Assignee: Zhong Yanghong






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kylin] RupengWang opened a new pull request #1398: KYLIN-4748 Optimize metadata for debug on local

2020-09-04 Thread GitBox


RupengWang opened a new pull request #1398:
URL: https://github.com/apache/kylin/pull/1398


   ## Proposed changes
   
   Describe the big picture of your changes here to communicate to the 
maintainers why we should accept this pull request. If it fixes a bug or 
resolves a feature request, be sure to link to that issue.
   
   ## Types of changes
   
   What types of changes does your code introduce to Kylin?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing 
functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after 
creating the PR. If you're unsure about any of them, don't hesitate to ask. 
We're here to help! This is simply a reminder of what we are going to look for 
before merging your code._
   
   - [ ] I have created an issue on [Kylin's 
jira](https://issues.apache.org/jira/browse/KYLIN), and have described the 
bug/feature there in detail
   - [ ] Commit messages in my PR start with the related jira ID, like 
"KYLIN- Make Kylin project open-source"
   - [ ] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature 
works
   - [ ] If this change needs a document change, I will prepare another PR 
against the `document` branch
   - [ ] Any dependent changes have been merged
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
user@kylin or dev@kylin by explaining why you chose the solution you did and 
what alternatives you considered, etc...
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (KYLIN-4748) Optimize metadata for debug on local

2020-09-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190632#comment-17190632
 ] 

ASF GitHub Bot commented on KYLIN-4748:
---

RupengWang opened a new pull request #1398:
URL: https://github.com/apache/kylin/pull/1398




> Optimize metadata for debug on local
> 
>
> Key: KYLIN-4748
> URL: https://issues.apache.org/jira/browse/KYLIN-4748
> Project: Kylin
>  Issue Type: Improvement
>Reporter: wangrupeng
>Assignee: wangrupeng
>Priority: Minor
>
> * Add count distinct and percentile measures
>  * Add a new column KYLIN_SALES.ITEM_ID for count distinct
>  * Set SELLER_ID as the shard-by column
>  * Add the cube configuration 
> *kylin.storage.columnar.shard-countdistinct-rowcount=1000* so the file pruner 
> can prune by shard





[GitHub] [kylin] coveralls edited a comment on pull request #1392: KYLIN-4743 NPE when some kafka partition's offset is not exists in checkpoint

2020-09-04 Thread GitBox


coveralls edited a comment on pull request #1392:
URL: https://github.com/apache/kylin/pull/1392#issuecomment-686450407


   ## Pull Request Test Coverage Report for [Build 
6337](https://coveralls.io/builds/33230516)
   
   * **0** of **6**   **(0.0%)**  changed or added relevant lines in **1** file 
are covered.
   * **7** unchanged lines in **4** files lost coverage.
   * Overall coverage increased (+**0.005%**) to **28.059%**
   
   ---
   
   | Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
   | :-- | --: | --: | --: |
   | [stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/consumer/KafkaConnector.java](https://coveralls.io/builds/33230516/source?filename=stream-source-kafka%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstream%2Fsource%2Fkafka%2Fconsumer%2FKafkaConnector.java#L87) | 0 | 6 | 0.0% |
   
   
   | Files with Coverage Reduction | New Missed Lines | % |
   | :-- | --: | --: |
   | [stream-source-kafka/src/main/java/org/apache/kylin/stream/source/kafka/consumer/KafkaConnector.java](https://coveralls.io/builds/33230516/source?filename=stream-source-kafka%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fstream%2Fsource%2Fkafka%2Fconsumer%2FKafkaConnector.java#L96) | 1 | 0% |
   | [tool/src/main/java/org/apache/kylin/tool/query/ProbabilityGenerator.java](https://coveralls.io/builds/33230516/source?filename=tool%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Ftool%2Fquery%2FProbabilityGenerator.java#L50) | 1 | 78.95% |
   | [core-job/src/main/java/org/apache/kylin/job/impl/threadpool/DefaultScheduler.java](https://coveralls.io/builds/33230516/source?filename=core-job%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fjob%2Fimpl%2Fthreadpool%2FDefaultScheduler.java#L194) | 2 | 80.23% |
   | [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/33230516/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L449) | 3 | 78.42% |
   
   
   | Totals | [![Coverage Status](https://coveralls.io/builds/33230516/badge)](https://coveralls.io/builds/33230516) |
   | :-- | --: |
   | Change from base [Build 6322](https://coveralls.io/builds/33179651): | 0.005% |
   | Covered Lines: | 26247 |
   | Relevant Lines: | 93542 |
   
   ---
   # 💛  - [Coveralls](https://coveralls.io)
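
   The overall figure in the report can be checked from its own totals. The
   following is a minimal sketch, assuming coverage is simply covered lines
   divided by relevant lines; the class and method names are illustrative
   and are not part of Coveralls or Kylin.

```java
import java.util.Locale;

public class CoverageCheck {
    // Percentage of relevant lines that are covered by tests.
    static double coveragePercent(long coveredLines, long relevantLines) {
        if (relevantLines <= 0) {
            throw new IllegalArgumentException("relevantLines must be positive");
        }
        return 100.0 * coveredLines / relevantLines;
    }

    public static void main(String[] args) {
        // Totals quoted in the report: 26247 covered of 93542 relevant lines.
        double pct = coveragePercent(26247, 93542);
        // Prints 28.059%, matching the overall coverage stated in the report.
        System.out.println(String.format(Locale.ROOT, "%.3f%%", pct));
    }
}
```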
   







[jira] [Commented] (KYLIN-4743) NPE when some kafka partition's offset is not exists in checkpoint

2020-09-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190629#comment-17190629
 ] 

ASF GitHub Bot commented on KYLIN-4743:
---

coveralls edited a comment on pull request #1392:
URL: https://github.com/apache/kylin/pull/1392#issuecomment-686450407




> NPE when some kafka partition's offset is not exists in checkpoint
> --
>
> Key: KYLIN-4743
> URL: https://issues.apache.org/jira/browse/KYLIN-4743
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Affects Versions: v3.0.0, v3.1.0
> Environment: Centos 7.4.1
> hbase 1.2.4
> hive 2.0.1
> hadoop 2.7.2
>Reporter: GuKe
>Assignee: GuKe
>Priority: Major
> Fix For: Future
>
> Attachments: WX20200903-153117.png
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> We use a streaming cube to aggregate data from Kafka.
> If the data in a Kafka topic is unevenly distributed across partitions (one 
> or more partitions are empty), the cube's checkpoint will contain an empty 
> offset for those partitions.
> If the receiver node is then restarted for any reason, it fails to initialize 
> the Kafka connector because the offset is null.





[jira] [Created] (KYLIN-4748) Optimize metadata for debug on local

2020-09-04 Thread wangrupeng (Jira)
wangrupeng created KYLIN-4748:
-

 Summary: Optimize metadata for debug on local
 Key: KYLIN-4748
 URL: https://issues.apache.org/jira/browse/KYLIN-4748
 Project: Kylin
  Issue Type: Improvement
Reporter: wangrupeng
Assignee: wangrupeng


* Add count distinct and percentile measures
 * Add a new column KYLIN_SALES.ITEM_ID for count distinct
 * Set SELLER_ID as the shard-by column
 * Add the cube configuration 
*kylin.storage.columnar.shard-countdistinct-rowcount=1000* so the file pruner 
can prune by shard





[GitHub] [kylin] hit-lacus merged pull request #1397: KYLIN-4719 Refine kylin-defaults.properties for parquet Storage

2020-09-04 Thread GitBox


hit-lacus merged pull request #1397:
URL: https://github.com/apache/kylin/pull/1397


   







[jira] [Commented] (KYLIN-4719) Refine kylin-defaults.properties for parquet Storage

2020-09-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190595#comment-17190595
 ] 

ASF GitHub Bot commented on KYLIN-4719:
---

hit-lacus merged pull request #1397:
URL: https://github.com/apache/kylin/pull/1397


   





> Refine kylin-defaults.properties for parquet Storage
> 
>
> Key: KYLIN-4719
> URL: https://issues.apache.org/jira/browse/KYLIN-4719
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Environment 
>Reporter: Xiaoxiang Yu
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: v4.0.0-alpha
>
>
> Many properties are no longer used. We should:
>  # remove some of them, such as the Flink engine and JDBC source settings
>  # add new properties for the new implementation
>  ## sparder context
>  ## global dictionary
>  ## others
>  
> {code:java}
> // 
>  FLINK ENGINE CONFIGS ###
> #
> ### Flink conf (default is in flink/conf/flink-conf.yaml)
> #kylin.engine.flink-conf.jobmanager.heap.size=2G
> #kylin.engine.flink-conf.taskmanager.heap.size=4G
> #kylin.engine.flink-conf.taskmanager.numberOfTaskSlots=1
> #kylin.engine.flink-conf.taskmanager.memory.preallocate=false
> #kylin.engine.flink-conf.job.parallelism=1
> #kylin.engine.flink-conf.program.enableObjectReuse=false
> #kylin.engine.flink-conf.yarn.queue=
> #kylin.engine.flink-conf.yarn.nodelabel=
> #
>  QUERY PUSH DOWN ###
> #
> ##kylin.query.pushdown.runner-class-name=org.apache.kylin.query.pushdown.PushDownRunnerSparkImpl
> #
> ##kylin.query.pushdown.update-enabled=false
> #
>  JDBC Data Source
> ##kylin.source.jdbc.connection-url=
> ##kylin.source.jdbc.driver=
> ##kylin.source.jdbc.dialect=
> ##kylin.source.jdbc.user=
> ##kylin.source.jdbc.pass=
> ##kylin.source.jdbc.sqoop-home=
> ##kylin.source.jdbc.filed-delimiter=|
> #
>  Livy with Kylin
> ##kylin.engine.livy-conf.livy-enabled=false
> ##kylin.engine.livy-conf.livy-url=http://LivyHost:8998
> ##kylin.engine.livy-conf.livy-key.file=hdfs:///path-to-kylin-job-jar
> ##kylin.engine.livy-conf.livy-arr.jars=hdfs:///path-to-hadoop-dependency-jar
> {code}
>  
>  
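
Every entry quoted above is prefixed with `#`, so it is inert as written. The sketch below shows this with `java.util.Properties`, under the assumption that the defaults file is read as a standard Java properties file; the class name `CommentedDefaults` and the key `kylin.example.active-key` are hypothetical and exist only for contrast.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class CommentedDefaults {
    // Parses properties text; lines whose first non-blank character is '#'
    // (or '!') are treated as comments and produce no entry.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String snippet = String.join("\n",
                "#kylin.engine.flink-conf.job.parallelism=1",
                "##kylin.query.pushdown.update-enabled=false",
                "kylin.example.active-key=true"); // hypothetical key, for contrast
        Properties props = parse(snippet);
        // Only the uncommented key is loaded.
        System.out.println(props.stringPropertyNames()); // prints [kylin.example.active-key]
    }
}
```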





[jira] [Commented] (KYLIN-4719) Refine kylin-defaults.properties for parquet Storage

2020-09-04 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190594#comment-17190594
 ] 

ASF subversion and git services commented on KYLIN-4719:


Commit 3405ab2609da1fe8e95c5e0fb9603d429c3e81b6 in kylin's branch 
refs/heads/kylin-on-parquet-v2 from Zhichao Zhang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=3405ab2 ]

KYLIN-4719 Refine kylin-defaults.properties for parquet Storage







[jira] [Updated] (KYLIN-4540) Some field values will become null after UNION ALL

2020-09-04 Thread Yaqian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yaqian Zhang updated KYLIN-4540:

Fix Version/s: (was: v3.1.1)
   Future

> Some field values will become null after UNION ALL
> --
>
> Key: KYLIN-4540
> URL: https://issues.apache.org/jira/browse/KYLIN-4540
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v3.0.2
>Reporter: Yaqian Zhang
>Assignee: Yaqian Zhang
>Priority: Major
> Fix For: Future
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> It can be reproduced in learn_kylin.
> see: 
> https://lists.apache.org/thread.html/r5b17ec62b08dbfb82e1db597e957a51ace210d95a6e1ba2b217496df%40%3Cuser.kylin.apache.org%3E


