
[jira] [Commented] (KYLIN-4481) Project-level ACL lookups not working for non-admin SAML-federated users

2020-07-31 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168993#comment-17168993
 ] 

nichunen commented on KYLIN-4481:
-

I merged this change, but I now agree with Yanghong. I suggest reverting 
the change and fixing the issue in a more reasonable way. 

> Project-level ACL lookups not working for non-admin SAML-federated users
> 
>
> Key: KYLIN-4481
> URL: https://issues.apache.org/jira/browse/KYLIN-4481
> Project: Kylin
>  Issue Type: Bug
>  Components: Security
>Affects Versions: v2.6.5, v3.0.1
>Reporter: Rafael Felix Correa
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: v3.1.0, v3.0.2, v2.6.6
>
>
> Steps to reproduce:
>  * Set up Kylin with SAML as described in 
> [http://kylin.apache.org/docs/howto/howto_ldap_and_sso.html]. 
> kylin.properties:
> {code:java}
> kylin.security.profile=saml
> kylin.security.acl.admin-role=Kylin_Admins
> kylin.security.ldap.connection-server=ldap://openldap:389
> kylin.security.ldap.connection-username=cn=admin,dc=example,dc=org
> # set kylin.security.ldap.connection-password appropriately
> kylin.security.ldap.user-search-base=ou=people,dc=example,dc=org
> kylin.security.ldap.user-search-pattern=(uid={0})
> kylin.security.ldap.user-group-search-base=ou=groups,dc=example,dc=org
> kylin.security.saml.context-context-path=/kylin
> kylin.security.saml.context-scheme=https
> kylin.security.saml.context-server-name=kylin.validdomain.com
> kylin.security.saml.context-server-port=443
> kylin.security.saml.metadata-entity-base-url=https://kylin.validdomain.com/kylin{code}
>  * on the LDAP server, make sure you have the following objects in place: 
> {code:java}
> # example.user, people, example.org
> dn: uid=example.user,ou=people,dc=example,dc=org
> objectClass: top
> objectClass: account
> objectClass: posixAccount
> objectClass: shadowAccount
> gidNumber: 1
> uidNumber: 5000
> cn: Does not matter
> homeDirectory: /home/doesntmatter
> uid: example.user{code}
>
> {code:java}
> # Kylin_Users, groups, example.org
> dn: cn=Kylin_Users,ou=groups,dc=example,dc=org
> objectClass: top
> objectClass: groupOfNames
> cn: Kylin_Users
> member: uid=example.user,ou=people,dc=example,dc=org{code}
>  * as an ADMIN, create a sample project in kylin and grant QUERY, MANAGEMENT 
> or OPERATION access to example.user.
>  * Now try logging into kylin.validdomain.com's Web UI as 
> example.u...@validdomain.com.
> Expected result:
>  * example.user is logged in and is able to select the project from the dropdown 
> box at the top left corner and navigate through its properties.
> Actual result:
>  * example.user is logged in, but no projects are listed in the dropdown box, 
> as if the user had no permissions in any project.
>  
> With pure-LDAP installations (no SAML), this configuration works as expected.
>  
> Worth noting: 
> [https://github.com/apache/kylin/blob/kylin-3.0.1/server-base/src/main/java/org/apache/kylin/rest/security/SAMLUserDetailsService.java#L40-L54]
>  splits the username at the '@' character before performing LDAP lookups. However, after 
> manually editing kylin_metadata and appending @validdomain.com to the 
> corresponding object under /acls, the lookup works as it should and the 
> non-admin user can access the sample project.
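
The mismatch described in the report can be illustrated with a small sketch. This is not Kylin's code: the class `SamlNameSketch`, its methods, the in-memory ACL map, and the user `alice@example.org` are all hypothetical. It only assumes, as the report suggests, that the LDAP lookup uses the part of the name before '@' while the project ACL check is made with the full SAML name.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Kylin code: shows how a name mismatch can hide projects.
class SamlNameSketch {

    // Assumed behaviour: the LDAP lookup uses only the part before '@'.
    static String ldapLookupName(String samlName) {
        int at = samlName.indexOf('@');
        return at < 0 ? samlName : samlName.substring(0, at);
    }

    // Hypothetical ACL store: project name -> principals that were granted access.
    static boolean hasAccess(Map<String, Set<String>> acls, String project, String principal) {
        Set<String> principals = acls.get(project);
        return principals != null && principals.contains(principal);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> acls = new HashMap<>();
        Set<String> granted = new HashSet<>();
        granted.add("alice"); // the admin granted access to the short name
        acls.put("sample_project", granted);

        String samlName = "alice@example.org"; // hypothetical federated login name

        // Checking with the full SAML name misses the grant...
        System.out.println(hasAccess(acls, "sample_project", samlName));
        // ...while checking with the normalized name finds it.
        System.out.println(hasAccess(acls, "sample_project", ldapLookupName(samlName)));
    }
}
```

Under these assumptions, either the grant or the check has to be normalized to the same form of the name; which side Kylin should normalize is exactly what the comment above leaves open.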



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (KYLIN-5003) Fail to package Kylin due to legancy front end dependencies

2021-06-01 Thread nichunen (Jira)

nichunen created KYLIN-5003:
---

 Summary: Fail to package Kylin due to legancy front end 
dependencies
 Key: KYLIN-5003
 URL: https://issues.apache.org/jira/browse/KYLIN-5003
 Project: Kylin
  Issue Type: Bug
  Components: Web 
Affects Versions: v3.1.2
Reporter: nichunen
Assignee: nichunen
 Fix For: v3.1.3


I tried to package Kylin with build/script/package.sh, and it failed with this error:
{quote}bower angular-unstable#1.1.5   ECMDERR Failed to 
execute "git ls-remote --tags --heads 
https://github.com/johannestroeger/bower-angular-unstable.git", exit code of 
#128 remote: Repository not found. fatal: repository 
'https://github.com/johannestroeger/bower-angular-unstable.git/' not 
found{quote}

Also, the dependency "phantomjs" can no longer be installed with "npm install 
phantomjs".




[jira] [Updated] (KYLIN-5003) Fail to package Kylin due to legacy front end dependencies

2021-06-01 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-5003:

Summary: Fail to package Kylin due to legacy front end dependencies  (was: 
Fail to package Kylin due to legancy front end dependencies)

> Fail to package Kylin due to legacy front end dependencies
> --
>
> Key: KYLIN-5003
> URL: https://issues.apache.org/jira/browse/KYLIN-5003
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Affects Versions: v3.1.2
>Reporter: nichunen
>Assignee: nichunen
>Priority: Major
> Fix For: v3.1.3
>
>
> I tried to package Kylin with build/script/package.sh, and it failed with 
> this error:
> {quote}bower angular-unstable#1.1.5   ECMDERR Failed 
> to execute "git ls-remote --tags --heads 
> https://github.com/johannestroeger/bower-angular-unstable.git", exit code of 
> #128 remote: Repository not found. fatal: repository 
> 'https://github.com/johannestroeger/bower-angular-unstable.git/' not 
> found{quote}
> Also, the dependency "phantomjs" can no longer be installed with "npm install 
> phantomjs".




[jira] [Assigned] (KYLIN-4206) Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, the hive table cannot be found

2020-01-05 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen reassigned KYLIN-4206:
---

Assignee: nichunen  (was: rongchuan.jin)

> Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, 
> the hive table cannot be found
> ---
>
> Key: KYLIN-4206
> URL: https://issues.apache.org/jira/browse/KYLIN-4206
> Project: Kylin
>  Issue Type: Bug
>  Components: Environment 
>Affects Versions: v2.6.4
> Environment: EMR 5.23(hadoop 2.8.5\HBase 1.4.9\hive 2.3.4\Spark 
> 2.4.0\Tez 0.9.1\HCatalog 2.3.4\Zookeeper 3.4.13)
> kylin 2.6.4
>Reporter: rongneng.wei
>Assignee: nichunen
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: kylin.properties, kylin_hive_conf.xml, kylin_job_conf.xml
>
>
> Hi,
>    I built Kylin on EMR 5.23. The Kylin version is 2.6.4. When building the 
> cube, the Hive table cannot be found. The detailed error information is as 
> follows:
> java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
>  at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
>  at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> On EMR, the Hive metadata is shared through Glue, and the Metastore URL is 
> configured in hive-site.xml:
> <property>
>  <name>hive.metastore.uris</name>
>  <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
>  <description>JDBC connect string for a JDBC metastore</description>
> </property>
> <property>
>  <name>hive.metastore.client.factory.class</name>
>  <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> When I use Hive's own metadata instead, that is, without Glue metadata 
> sharing, the above exception does not occur. To do so, I comment out the 
> following configuration.
> 
> However, EMR relies on the shared metadata, so without metadata sharing I 
> cannot query the other Hive tables built by the cluster.
> The configuration files are in the attachments. Please help me solve 
> this problem. Thank you.
> Best regards.




[jira] [Assigned] (KYLIN-4206) Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, the hive table cannot be found

2020-01-05 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen reassigned KYLIN-4206:
---

Assignee: rongchuan.jin

> Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, 
> the hive table cannot be found
> ---
>
> Key: KYLIN-4206
> URL: https://issues.apache.org/jira/browse/KYLIN-4206
> Project: Kylin
>  Issue Type: Bug
>  Components: Environment 
>Affects Versions: v2.6.4
> Environment: EMR 5.23(hadoop 2.8.5\HBase 1.4.9\hive 2.3.4\Spark 
> 2.4.0\Tez 0.9.1\HCatalog 2.3.4\Zookeeper 3.4.13)
> kylin 2.6.4
>Reporter: rongneng.wei
>Assignee: rongchuan.jin
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: kylin.properties, kylin_hive_conf.xml, kylin_job_conf.xml
>
>
> Hi,
>    I built Kylin on EMR 5.23. The Kylin version is 2.6.4. When building the 
> cube, the Hive table cannot be found. The detailed error information is as 
> follows:
> java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
>  at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
>  at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> On EMR, the Hive metadata is shared through Glue, and the Metastore URL is 
> configured in hive-site.xml:
> <property>
>  <name>hive.metastore.uris</name>
>  <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
>  <description>JDBC connect string for a JDBC metastore</description>
> </property>
> <property>
>  <name>hive.metastore.client.factory.class</name>
>  <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> When I use Hive's own metadata instead, that is, without Glue metadata 
> sharing, the above exception does not occur. To do so, I comment out the 
> following configuration.
> 
> However, EMR relies on the shared metadata, so without metadata sharing I 
> cannot query the other Hive tables built by the cluster.
> The configuration files are in the attachments. Please help me solve 
> this problem. Thank you.
> Best regards.




[jira] [Commented] (KYLIN-4327) TOPN Comparator may violate its general contract

2020-01-06 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008817#comment-17008817
 ] 

nichunen commented on KYLIN-4327:
-

[~Yifei_Wu94] Would you please describe the issue in detail and explain how 
you fixed it?

> TOPN Comparator may  violate its general contract
> -
>
> Key: KYLIN-4327
> URL: https://issues.apache.org/jira/browse/KYLIN-4327
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Yifei Wu
>Assignee: Yifei Wu
>Priority: Minor
>





[jira] [Assigned] (KYLIN-4206) Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, the hive table cannot be found

2020-01-07 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen reassigned KYLIN-4206:
---

Assignee: rongneng.wei  (was: nichunen)

> Build kylin on EMR 5.23. The kylin version is 2.6.4. When building the cube, 
> the hive table cannot be found
> ---
>
> Key: KYLIN-4206
> URL: https://issues.apache.org/jira/browse/KYLIN-4206
> Project: Kylin
>  Issue Type: Bug
>  Components: Environment 
>Affects Versions: v2.6.4
> Environment: EMR 5.23(hadoop 2.8.5\HBase 1.4.9\hive 2.3.4\Spark 
> 2.4.0\Tez 0.9.1\HCatalog 2.3.4\Zookeeper 3.4.13)
> kylin 2.6.4
>Reporter: rongneng.wei
>Assignee: rongneng.wei
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: kylin.properties, kylin_hive_conf.xml, kylin_job_conf.xml
>
>
> Hi,
>    I built Kylin on EMR 5.23. The Kylin version is 2.6.4. When building the 
> cube, the Hive table cannot be found. The detailed error information is as 
> follows:
> java.lang.RuntimeException: java.io.IOException: NoSuchObjectException(message:kylin_flat_db_test1.kylin_intermediate_kylin_sales_cube_4e93b31d_3be2_c9e8_55de_a9814f63c4ba table not found)
>  at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:83)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:126)
>  at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:104)
>  at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
>  at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> On EMR, the Hive metadata is shared through Glue, and the Metastore URL is 
> configured in hive-site.xml:
> <property>
>  <name>hive.metastore.uris</name>
>  <value>thrift://ip-172-40-15-164.ec2.internal:9083</value>
>  <description>JDBC connect string for a JDBC metastore</description>
> </property>
> <property>
>  <name>hive.metastore.client.factory.class</name>
>  <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
> When I use Hive's own metadata instead, that is, without Glue metadata 
> sharing, the above exception does not occur. To do so, I comment out the 
> following configuration.
> 
> However, EMR relies on the shared metadata, so without metadata sharing I 
> cannot query the other Hive tables built by the cluster.
> The configuration files are in the attachments. Please help me solve 
> this problem. Thank you.
> Best regards.




[jira] [Commented] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010248#comment-17010248
 ] 

nichunen commented on KYLIN-4321:
-

[~codingforfun] Do you have any data comparing the "Create fact distinct 
columns" step on MR versus Spark?

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>





[jira] [Commented] (KYLIN-4330) use nrt streaming build for kafka data, can we use filter function when i set model desinger

2020-01-07 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010305#comment-17010305
 ] 

nichunen commented on KYLIN-4330:
-

[~kangkingkang] Thanks for reporting this. I'll check it.

> use nrt streaming build for kafka data, can we use filter function when i set 
> model desinger 
> -
>
> Key: KYLIN-4330
> URL: https://issues.apache.org/jira/browse/KYLIN-4330
> Project: Kylin
>  Issue Type: New Feature
>  Components: NRT Streaming
>Affects Versions: v2.6.4
> Environment: Alibaba Cloud, CentOS 7, Hadoop 2.8.5 
>Reporter: kangkang
>Priority: Major
>  Labels: FIlter, model
> Fix For: v2.6.4
>
> Attachments: 4561578452903_.pic_hd.jpg
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> When I use the NRT streaming build for Kafka data, can we use a filter function 
> when I set up the model in the model designer? 




[jira] [Commented] (KYLIN-4333) Build Server OOM

2020-01-10 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012695#comment-17012695
 ] 

nichunen commented on KYLIN-4333:
-

(y)

> Build Server OOM
> 
>
> Key: KYLIN-4333
> URL: https://issues.apache.org/jira/browse/KYLIN-4333
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine, Query Engine
>Affects Versions: v3.0.0, v3.0.0-beta, v3.0.0-alpha2
>Reporter: wangxiaojing
>Priority: Major
>
> Kylin 3 frequently runs into full GC or even OOM; the build server hits OOM 
> almost every 4 days. The query server also has full GCs, but less 
> frequently than the build server.
> We have about 2000 cubes on this Kylin cluster (3 build servers and 3 query 
> servers, 32 GB of memory per server).
> MAT analysis of the heap dump suggests there may be a memory leak.




[jira] [Updated] (KYLIN-4322) Cost–benefit of compression HBase result

2020-01-14 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4322:

Fix Version/s: v3.1.0

> Cost–benefit of compression HBase result
> 
>
> Key: KYLIN-4322
> URL: https://issues.apache.org/jira/browse/KYLIN-4322
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Priority: Major
> Fix For: v3.1.0
>
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than 
> 200M, it takes more than 10s to compress the data.
> We can see this in HBase's log:
> ||Size||avg rate||min rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> Notice:
>  # rate: compressed data size / original data size
>  # When the source data size is < 1M, the compressed data may be larger than the 
> source data, so the table (row 1) only counts cases where the compressed data is 
> smaller than the source data.
>  # In our environment, 65% of the compressed results (<1M) are larger than the source data.
> When the source data is smaller than 10M, the data transmission latency is 
> acceptable. When the data is larger than 100M, it takes a long time to 
> compress.
>  
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by 
> default.
>  
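
The "rate" in the table (compressed size / original size) and the compression time can be measured with a few lines of Java. The sketch below is only an illustration under stated assumptions: the class `CompressRateSketch` and its methods are hypothetical, and it uses `java.util.zip.Deflater` from the JDK, which may not match the compression path Kylin's HBase endpoint actually uses.

```java
import java.util.zip.Deflater;

// Hypothetical sketch; Kylin's own compression code may differ.
class CompressRateSketch {

    // Compresses the input with the JDK Deflater and returns the compressed length in bytes.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // rate = compressed size / original size; the input must not be empty.
    static double rate(byte[] input) {
        return (double) compressedSize(input) / input.length;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MiB of zeros, a best case for compression
        long start = System.nanoTime();
        double r = rate(data);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;
        System.out.println("rate=" + r + ", time=" + elapsedMicros + "us");
    }
}
```

A rate above 1.0 means the compressed result is larger than the source, which is the case the notice describes for small results: even a one-byte input grows, because the zlib framing alone takes several bytes.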




[jira] [Resolved] (KYLIN-4322) Cost–benefit of compression HBase result

2020-01-14 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen resolved KYLIN-4322.
-
Resolution: Fixed

> Cost–benefit of compression HBase result
> 
>
> Key: KYLIN-4322
> URL: https://issues.apache.org/jira/browse/KYLIN-4322
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Priority: Major
> Fix For: v3.1.0
>
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than 
> 200M, it takes more than 10s to compress the data.
> We can see this in HBase's log:
> ||Size||avg rate||min rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> Notice:
>  # rate: compressed data size / original data size
>  # When the source data size is < 1M, the compressed data may be larger than the 
> source data, so the table (row 1) only counts cases where the compressed data is 
> smaller than the source data.
>  # In our environment, 65% of the compressed results (<1M) are larger than the source data.
> When the source data is smaller than 10M, the data transmission latency is 
> acceptable. When the data is larger than 100M, it takes a long time to 
> compress.
>  
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by 
> default.
>  




[jira] [Assigned] (KYLIN-4322) Cost–benefit of compression HBase result

2020-01-14 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen reassigned KYLIN-4322:
---

Assignee: ZhouKang

> Cost–benefit of compression HBase result
> 
>
> Key: KYLIN-4322
> URL: https://issues.apache.org/jira/browse/KYLIN-4322
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Assignee: ZhouKang
>Priority: Major
> Fix For: v3.1.0
>
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than 
> 200M, it takes more than 10s to compress the data.
> We can see this in HBase's log:
> ||Size||avg rate||min rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> Notice:
>  # rate: compressed data size / original data size
>  # When the source data size is < 1M, the compressed data may be larger than the 
> source data, so the table (row 1) only counts cases where the compressed data is 
> smaller than the source data.
>  # In our environment, 65% of the compressed results (<1M) are larger than the source data.
> When the source data is smaller than 10M, the data transmission latency is 
> acceptable. When the data is larger than 100M, it takes a long time to 
> compress.
>  
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by 
> default.
>  




[jira] [Updated] (KYLIN-4333) Build Server OOM

2020-01-14 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4333:

Fix Version/s: v3.1.0

> Build Server OOM
> 
>
> Key: KYLIN-4333
> URL: https://issues.apache.org/jira/browse/KYLIN-4333
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine, Query Engine
>Affects Versions: v3.0.0, v3.0.0-beta, v3.0.0-alpha2
>Reporter: wangxiaojing
>Assignee: wangxiaojing
>Priority: Major
> Fix For: v3.1.0
>
>
> Kylin 3 frequently runs into full GC or even OOM; the build server hits OOM 
> almost every 4 days. The query server also has full GCs, but less 
> frequently than the build server.
> We have about 2000 cubes on this Kylin cluster (3 build servers and 3 query 
> servers, 32 GB of memory per server).
> MAT analysis of the heap dump suggests there may be a memory leak.




[jira] [Resolved] (KYLIN-4333) Build Server OOM

2020-01-14 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen resolved KYLIN-4333.
-
Resolution: Fixed

[~wangxiaojing] PR merged, thanks very much

> Build Server OOM
> 
>
> Key: KYLIN-4333
> URL: https://issues.apache.org/jira/browse/KYLIN-4333
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine, Query Engine
>Affects Versions: v3.0.0, v3.0.0-beta, v3.0.0-alpha2
>Reporter: wangxiaojing
>Assignee: wangxiaojing
>Priority: Major
> Fix For: v3.1.0
>
>
> Kylin 3 frequently runs into full GC or even OOM; the build server hits OOM 
> almost every 4 days. The query server also has full GCs, but less 
> frequently than the build server.
> We have about 2000 cubes on this Kylin cluster (3 build servers and 3 query 
> servers, 32 GB of memory per server).
> MAT analysis of the heap dump suggests there may be a memory leak.




[jira] [Resolved] (KYLIN-4327) TOPN Comparator may violate its general contract

2020-01-15 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen resolved KYLIN-4327.
-
Resolution: Fixed

> TOPN Comparator may  violate its general contract
> -
>
> Key: KYLIN-4327
> URL: https://issues.apache.org/jira/browse/KYLIN-4327
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Yifei Wu
>Assignee: Yifei Wu
>Priority: Minor
> Fix For: v2.6.5, v3.1.0
>
>
> In the current TopN implementation, each top(k) result is kept with its count 
> stored as a double value, like this:
> ```
> public class Counter<T> implements Serializable {
>     ...
>     protected T item;
>     protected double count;
>     ...
> }
> ```
> But its Comparator uses "==" directly to compare the counts, which may 
> cause the error "*violate its general contract*" when it is called.
> ```
> private static final Comparator<Counter> ASC_COMPARATOR = new Comparator<Counter>() {
>     @Override
>     public int compare(Counter o1, Counter o2) {
>         return o1.getCount() > o2.getCount() ? 1 : o1.getCount() == o2.getCount() ? 0 : -1;
>     }
> };
> ```
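
One way the quoted comparator can break the `Comparator` contract is when a count is NaN: both `>` and `==` are then false, so the comparator returns -1 in both directions. The sketch below contrasts that with `Double.compare`, which defines a total order over doubles. It is an illustration only: `CounterSketch` is a hypothetical stand-in for the `Counter` class above, and whether the fix merged for this issue uses `Double.compare` is not stated in this thread.

```java
import java.util.Comparator;

// Hypothetical stand-in for the Counter class quoted above; not Kylin's actual class.
class CounterSketch {
    final String item;
    final double count;

    CounterSketch(String item, double count) {
        this.item = item;
        this.count = count;
    }

    double getCount() {
        return count;
    }

    // Same logic as the quoted comparator: inconsistent when a count is NaN.
    static final Comparator<CounterSketch> NAIVE_COMPARATOR =
            (o1, o2) -> o1.getCount() > o2.getCount() ? 1
                    : o1.getCount() == o2.getCount() ? 0 : -1;

    // Double.compare gives a total order (NaN sorts after every other value).
    static final Comparator<CounterSketch> ASC_COMPARATOR =
            (o1, o2) -> Double.compare(o1.getCount(), o2.getCount());

    public static void main(String[] args) {
        CounterSketch nan = new CounterSketch("a", Double.NaN);
        CounterSketch one = new CounterSketch("b", 1.0);

        // The naive comparator claims a < b and b < a at the same time.
        System.out.println(NAIVE_COMPARATOR.compare(nan, one));
        System.out.println(NAIVE_COMPARATOR.compare(one, nan));

        // Double.compare keeps the two directions consistent.
        System.out.println(ASC_COMPARATOR.compare(nan, one));
        System.out.println(ASC_COMPARATOR.compare(one, nan));
    }
}
```

Sorting with an inconsistent comparator is what can make `Collections.sort` or `Arrays.sort` throw "Comparison method violates its general contract!".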




[jira] [Updated] (KYLIN-4327) TOPN Comparator may violate its general contract

2020-01-15 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4327:

Fix Version/s: v3.1.0
   v2.6.5

> TOPN Comparator may  violate its general contract
> -
>
> Key: KYLIN-4327
> URL: https://issues.apache.org/jira/browse/KYLIN-4327
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Yifei Wu
>Assignee: Yifei Wu
>Priority: Minor
> Fix For: v2.6.5, v3.1.0
>
>
> In the current TopN implementation, each top(k) result is kept with its count 
> stored as a double value, like this:
> ```
> public class Counter<T> implements Serializable {
>     ...
>     protected T item;
>     protected double count;
>     ...
> }
> ```
> But its Comparator uses "==" directly to compare the counts, which may 
> cause the error "*violate its general contract*" when it is called.
> ```
> private static final Comparator<Counter> ASC_COMPARATOR = new Comparator<Counter>() {
>     @Override
>     public int compare(Counter o1, Counter o2) {
>         return o1.getCount() > o2.getCount() ? 1 : o1.getCount() == o2.getCount() ? 0 : -1;
>     }
> };
> ```




[jira] [Resolved] (KYLIN-4293) Backport HBASE-22887 to Kylin HFileOutputFormat3

2020-01-16 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen resolved KYLIN-4293.
-
Resolution: Fixed

> Backport HBASE-22887 to Kylin HFileOutputFormat3
> 
>
> Key: KYLIN-4293
> URL: https://issues.apache.org/jira/browse/KYLIN-4293
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Shao Feng Shi
>Assignee: langdamao
>Priority: Major
> Fix For: v2.6.5, v3.1.0
>
>
> Because Kylin forked HBase's HFileOutputFormat2 as HFileOutputFormat3, this 
> bugfix needs to be applied in Kylin as well:
> https://issues.apache.org/jira/browse/HBASE-22887
>  




[jira] [Commented] (KYLIN-4354) Prune segment not using given filter when using jdbc preparestatement

2020-01-18 Thread nichunen (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018775#comment-17018775
 ] 

nichunen commented on KYLIN-4354:
-

[~itzhangqiang] Hi, thanks for reporting this. Will you submit a PR to fix it?

> Prune segment not using given filter when using jdbc preparestatement
> -
>
> Key: KYLIN-4354
> URL: https://issues.apache.org/jira/browse/KYLIN-4354
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: v2.6.2
>Reporter: QiangZhang
>Priority: Major
> Attachments: image-2020-01-19-10-34-39-637.png, 
> image-2020-01-19-10-36-27-864.png
>
>
> When querying Kylin through a JDBC PreparedStatement, segment pruning does not 
> use the given filter, which leads to scanning all segments.
> !image-2020-01-19-10-36-27-864.png!
> !image-2020-01-19-10-34-39-637.png!
>  




[jira] [Assigned] (KYLIN-4294) Add http api for metrics

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen reassigned KYLIN-4294:
---

Assignee: xiang zhang

> Add http api for metrics 
> -
>
> Key: KYLIN-4294
> URL: https://issues.apache.org/jira/browse/KYLIN-4294
> Project: Kylin
>  Issue Type: Improvement
>  Components: REST Service
>Affects Versions: v2.6.0
>Reporter: xiang zhang
>Assignee: xiang zhang
>Priority: Minor
> Attachments: kylin-4294-instruction.pdf
>
>
> # Expose metrics through http api to facilitate the integration of some 
> external monitoring components, such as tsdb
>  # add a python script for tcollector




[jira] [Updated] (KYLIN-4294) Add http api for metrics

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4294:

Fix Version/s: v3.1.0

> Add http api for metrics 
> -
>
> Key: KYLIN-4294
> URL: https://issues.apache.org/jira/browse/KYLIN-4294
> Project: Kylin
>  Issue Type: Improvement
>  Components: REST Service
>Affects Versions: v2.6.0
>Reporter: xiang zhang
>Assignee: xiang zhang
>Priority: Minor
> Fix For: v3.1.0
>
> Attachments: kylin-4294-instruction.pdf
>
>
> # Expose metrics through an HTTP API to facilitate integration with 
> external monitoring components, such as TSDB
>  # Add a Python script for tcollector




[jira] [Updated] (KYLIN-4305) Streaming Receiver cannot limit income query request or cancel long-running query

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4305:

Fix Version/s: v3.1.0

> Streaming Receiver cannot limit income query request or cancel long-running 
> query
> -
>
> Key: KYLIN-4305
> URL: https://issues.apache.org/jira/browse/KYLIN-4305
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: Jietu20191217-221025.png, after_repair_receiver.jstack, 
> image-2019-12-17-22-12-01-098.png, streaming_receiver_jstack.log
>
>
> Under heavy load (a high rate of query requests), the receiver cannot keep up 
> and most queries may time out, but the query processing thread cannot be 
> cancelled on the receiver side, which causes the receiver to crash. You have 
> to restart it.
> kylin.log
> {code:java}
> Caused by: java.lang.RuntimeException: timeout when call stream rpc
>   at 
> org.apache.kylin.storage.stream.rpc.HttpStreamDataSearchClient$QueuedStreamingTupleIterator.hasNext(HttpStreamDataSearchClient.java:298)
>   at com.google.common.collect.Iterators$5.hasNext(Iterators.java:596)
>   at 
> org.apache.kylin.metadata.tuple.CompoundTupleIterator.hasNext(CompoundTupleIterator.java:52)
>   at 
> org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:63)
>   at Baz$1$1.moveNext(Unknown Source)
>   at 
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy_(EnumerableDefaults.java:825)
>   at 
> org.apache.calcite.linq4j.EnumerableDefaults.groupBy(EnumerableDefaults.java:761)
>   at 
> org.apache.calcite.linq4j.DefaultEnumerable.groupBy(DefaultEnumerable.java:302)
>   at Baz.bind(Unknown Source)
>   at 
> org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:365)
>   at 
> org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:301)
>   at 
> org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:559)
>   at 
> org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:550)
>   at 
> org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
>   at 
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:67)
>   at 
> org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:44)
>   at 
> org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
>   at 
> org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:619)
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
>   at 
> org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
>   ... 83 more
> {code}
> jstack
> {code:java}
> "qtp1901663135-48" #48 prio=5 os_prio=0 tid=0x7f839995f800 nid=0x3cdd 
> runnable [0x7f83674fe000]
>java.lang.Thread.State: RUNNABLE
>   at java.lang.Thread.yield(Native Method)
>   at 
> org.apache.kylin.stream.core.query.MultiThreadsResultCollector$1.hasNext(MultiThreadsResultCollector.java:75)
>   at 
> org.apache.kylin.stream.core.query.RecordsAggregator.aggregate(RecordsAggregator.java:100)
>   at 
> org.apache.kylin.stream.core.query.StreamingCubeDataSearcher$StreamAggregateSearchResult.iterator(StreamingCubeDataSearcher.java:191)
>   at 
> org.apache.kylin.stream.server.rest.controller.DataController.query(DataController.java:119)
>   at sun.reflect.GeneratedMethodAccessor87.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
>   at 
> org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
>   at 
> org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java
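
The two problems described in this issue (no limit on incoming queries, no way to cancel a long-running one) can be illustrated with a minimal Java sketch. This is not the receiver's actual code: the class name QueryGuard, its parameters and the exception choices are hypothetical. It assumes queries are run as Callable tasks; a Semaphore rejects work beyond a fixed concurrency, and Future.cancel(true) interrupts a query that exceeds its timeout.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical guard for illustration only; not taken from Kylin. */
final class QueryGuard {
    private final Semaphore permits;
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final long timeoutMillis;

    QueryGuard(int maxConcurrent, long timeoutMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.timeoutMillis = timeoutMillis;
    }

    /** Rejects the query when saturated; interrupts it when it runs too long. */
    <T> T run(Callable<T> query) {
        if (!permits.tryAcquire()) {
            throw new RejectedExecutionException("too many concurrent queries");
        }
        try {
            Future<T> future = pool.submit(query);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException | InterruptedException e) {
                future.cancel(true); // delivers an interrupt to the worker thread
                if (e instanceof InterruptedException) {
                    Thread.currentThread().interrupt(); // keep the caller's interrupt flag
                }
                throw new IllegalStateException("query timed out or was interrupted", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("query failed", e.getCause());
            }
        } finally {
            permits.release();
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Interruption is cooperative: a worker that spins without checking its interrupt flag (the jstack above shows a thread in Thread.yield() inside hasNext) keeps running after cancel(true), so the query loop itself would also need to check Thread.currentThread().isInterrupted().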

[jira] [Resolved] (KYLIN-4305) Streaming Receiver cannot limit income query request or cancel long-running query

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen resolved KYLIN-4305.
-
Resolution: Fixed

> Streaming Receiver cannot limit income query request or cancel long-running 
> query
> -
>
> Key: KYLIN-4305
> URL: https://issues.apache.org/jira/browse/KYLIN-4305
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: Jietu20191217-221025.png, after_repair_receiver.jstack, 
> image-2019-12-17-22-12-01-098.png, streaming_receiver_jstack.log
>
>
> Under heavy load (a high rate of query requests), the receiver cannot keep up 
> and most queries may time out, but the query processing thread cannot be 
> cancelled on the receiver side, which causes the receiver to crash. You have 
> to restart it.

[jira] [Updated] (KYLIN-4330) use nrt streaming build for kafka data, can we use filter function when i set model desinger

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4330:

Fix Version/s: (was: v2.6.4)
   v2.6.5

> use nrt streaming build for kafka data, can we use filter function when i set 
> model desinger 
> -
>
> Key: KYLIN-4330
> URL: https://issues.apache.org/jira/browse/KYLIN-4330
> Project: Kylin
>  Issue Type: New Feature
>  Components: NRT Streaming
>Affects Versions: v2.6.4
> Environment: Alibaba Cloud (Aliyun), CentOS 7, Hadoop 2.8.5
>Reporter: kangkang
>Priority: Major
>  Labels: FIlter, model
> Fix For: v2.6.5
>
> Attachments: 4561578452903_.pic_hd.jpg
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> When I use NRT streaming build for Kafka data, can I use a filter function 
> when setting up the model in the model designer?




[jira] [Updated] (KYLIN-4329) use nrt streaming build kafka data,but the segment cannot merge

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen updated KYLIN-4329:

Fix Version/s: (was: v2.6.4)
   v2.6.5

> use  nrt streaming build kafka data,but the segment  cannot merge
> -
>
> Key: KYLIN-4329
> URL: https://issues.apache.org/jira/browse/KYLIN-4329
> Project: Kylin
>  Issue Type: Bug
>  Components: NRT Streaming
>Affects Versions: v2.6.4
> Environment: kylin  2.6.4
>Reporter: kangkang
>Priority: Critical
>  Labels: build
> Fix For: v2.6.5
>
> Attachments: 1981578450672_.pic.jpg, 2001578450672_.pic_hd.jpg
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> I use near-real-time streaming build for Kafka data, building once an hour.
> I also set the time for automatic merge,
> but the segments cannot be merged. I want to know why.




[jira] [Closed] (KYLIN-3608) Move dependency versions to top level pom properties

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3608.
---

> Move dependency versions to top level pom properties
> 
>
> Key: KYLIN-3608
> URL: https://issues.apache.org/jira/browse/KYLIN-3608
> Project: Kylin
>  Issue Type: Task
>  Components: Others
>Affects Versions: v2.6.1
>Reporter: Ted Yu
>Assignee: zhoujie
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> There are some non-top level pom.xml files where dependency version is 
> referenced directly.
> core-common/pom.xml is an example.
> We should move all dependency versions to top level pom properties




[jira] [Closed] (KYLIN-4068) Automatically add limit has bug

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4068.
---

> Automatically add limit has bug
> ---
>
> Key: KYLIN-4068
> URL: https://issues.apache.org/jira/browse/KYLIN-4068
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-alpha2
>
> Attachments: image-2019-09-19-16-47-40-292.png
>
>
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> {code}
> will convert to 
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> LIMIT 5
> {code}
> This LIMIT is applied not to the result of the UNION, but to SELECT E_Name FROM 
> Employees_USA.
> We should use a safer way to achieve the limit effect.
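
One way to make an automatically added LIMIT apply to the whole statement is to wrap the statement in a subquery first. The Java sketch below shows only that idea; it is not necessarily how the issue was fixed in Kylin, and the class name LimitWrapper and the alias t are hypothetical.

```java
/** Hypothetical helper for illustration only; not taken from Kylin. */
final class LimitWrapper {
    /** Wraps the whole statement in a subquery so LIMIT applies to the final result set. */
    static String withLimit(String sql, int limit) {
        String trimmed = sql.trim();
        // Drop a trailing semicolon so the statement can be nested.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return "SELECT * FROM (" + trimmed + ") AS t LIMIT " + limit;
    }
}
```

With the query from the description, the helper yields a statement in which the UNION is evaluated inside the parentheses and the LIMIT sits outside them.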




[jira] [Closed] (KYLIN-4035) Calculate column cardinality by using spark engine

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4035.
---

> Calculate column cardinality by using spark engine
> --
>
> Key: KYLIN-4035
> URL: https://issues.apache.org/jira/browse/KYLIN-4035
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
> Environment: kylin: master/3.0.0-alpha
> spark: 2.4.3
> hadoop: 2.6.5
>Reporter: Jack
>Assignee: Jack
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> Kylin calculates column cardinality when loading a Hive table. This stage 
> is currently supported only by the MR engine, not by Spark. I think the Spark 
> engine should also be usable in this stage, for the following reasons:
> 1) Kylin users can choose which engine to apply when calculating column 
> cardinality;
> 2) Some good Spark features (e.g. dynamic resource allocation) can be used; 
> 3) The code written in Spark is simple.
> I have finished this work and tested it. Note that 
> "kylin.engine.spark-cardinality=true" must be added to kylin.properties (the 
> default is false). I look forward to suggestions.
> Best regards. 




[jira] [Closed] (KYLIN-4012) optimize cache in TrieDictionary/TrieDictionaryForest

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4012.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> optimize cache in TrieDictionary/TrieDictionaryForest
> -
>
> Key: KYLIN-4012
> URL: https://issues.apache.org/jira/browse/KYLIN-4012
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v2.5.0
>Reporter: jiezouSH
>Assignee: jiezouSH
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Currently, 
> CacheDictionary, the parent class of TrieDictionary and TrieDictionaryForest, 
> has 3 caches: 
> valueToIdCache, 
> idToValueCache, 
> idToValueByteCache.
> However, the functions of idToValueCache and idToValueByteCache largely 
> duplicate each other.
> It would be better to merge idToValueCache and idToValueByteCache, saving 
> memory of about half the dictionary's physical size.
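
The merge proposed above can be sketched as a single id-to-bytes cache from which the String form is derived on demand. This is only an illustration under assumptions: the class MergedIdCache, its methods and the loader function (standing in for the trie lookup) are hypothetical and not taken from Kylin, and UTF-8 is assumed as the value encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntFunction;

/** Hypothetical illustration; names are not taken from Kylin. */
final class MergedIdCache {
    private final Map<Integer, byte[]> idToBytes = new ConcurrentHashMap<>();
    private final IntFunction<byte[]> loader; // stands in for the trie lookup

    MergedIdCache(IntFunction<byte[]> loader) {
        this.loader = loader;
    }

    /** Returns the cached bytes, loading them once on first access. */
    byte[] getValueBytes(int id) {
        return idToBytes.computeIfAbsent(id, k -> loader.apply(k));
    }

    /** Decodes from the byte cache instead of keeping a second String cache. */
    String getValue(int id) {
        return new String(getValueBytes(id), StandardCharsets.UTF_8);
    }

    int cachedEntries() {
        return idToBytes.size();
    }
}
```

The trade-off is that every getValue call decodes the bytes again, so memory is saved at the cost of some CPU on the String path.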




[jira] [Closed] (KYLIN-3843) List kylin instances with their server mode on web

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3843.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> List kylin instances with their server mode on web
> --
>
> Key: KYLIN-3843
> URL: https://issues.apache.org/jira/browse/KYLIN-3843
> Project: Kylin
>  Issue Type: New Feature
>  Components: REST Service, Web 
>Reporter: nichunen
>Assignee: Jiatao Tao
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Now that the Curator-based scheduler is available, Kylin can list all nodes 
> with the same metadata URL.
> This task should include some REST APIs to fetch node information from ZK, and 
> a front-end page under the System page to display the node information.




[jira] [Closed] (KYLIN-4031) RestClient will throw exception with message contains clear-text password

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4031.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> RestClient will throw exception with message contains clear-text password
> -
>
> Key: KYLIN-4031
> URL: https://issues.apache.org/jira/browse/KYLIN-4031
> Project: Kylin
>  Issue Type: Improvement
>  Components: REST Service
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Hi dear Kylin team:
>   I found that RestClient:97 throws an IllegalArgumentException whose 
> message contains the clear-text password when an invalid URI containing 
> user:pwd is given. I think this may cause a security problem.
>   What do you think about this?
>   
>  Best Regards
>   
>yuzhang
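
A common mitigation for this kind of leak is to mask the password before the URI is placed in an exception message or log line. The Java sketch below illustrates that under assumptions: the class UriRedactor is hypothetical, the input is assumed to look like user:pwd@host:port (optionally with a scheme), and this is not presented as the change that was actually made to RestClient.

```java
/** Hypothetical helper for illustration only; not taken from Kylin. */
final class UriRedactor {
    /** Replaces the password in "user:pwd@host" with "***"; returns the input unchanged if none is found. */
    static String redact(String uri) {
        int at = uri.lastIndexOf('@');
        if (at < 0) {
            return uri; // no credentials part
        }
        int schemeEnd = uri.indexOf("://");
        int userStart = schemeEnd < 0 ? 0 : schemeEnd + 3;
        int colon = uri.indexOf(':', userStart);
        if (colon < 0 || colon > at) {
            return uri; // user without password
        }
        return uri.substring(0, colon + 1) + "***" + uri.substring(at);
    }
}
```

An error message would then be built from UriRedactor.redact(uri) instead of the raw string.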




[jira] [Closed] (KYLIN-3959) Realtime OLAP query result should not be cached

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3959.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Realtime OLAP query result should not be cached
> ---
>
> Key: KYLIN-3959
> URL: https://issues.apache.org/jira/browse/KYLIN-3959
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>





[jira] [Closed] (KYLIN-3960) Only update user when login in LDAP environment

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3960.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Only update user when login in LDAP environment
> ---
>
> Key: KYLIN-3960
> URL: https://issues.apache.org/jira/browse/KYLIN-3960
> Project: Kylin
>  Issue Type: Improvement
>  Components: Security
>Reporter: Jiatao Tao
>Assignee: Jiatao Tao
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>





[jira] [Closed] (KYLIN-3932) KafkaConfigOverride to take effect

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3932.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> KafkaConfigOverride to take effect
> --
>
> Key: KYLIN-3932
> URL: https://issues.apache.org/jira/browse/KYLIN-3932
> Project: Kylin
>  Issue Type: Improvement
>Reporter: jinguowei
>Assignee: jinguowei
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>





[jira] [Closed] (KYLIN-3813) don't do push down when both of the children of CompareTupleFilter are CompareTupleFilter with column included

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3813.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> don't do push down when both of the children of CompareTupleFilter are 
> CompareTupleFilter with column included
> --
>
> Key: KYLIN-3813
> URL: https://issues.apache.org/jira/browse/KYLIN-3813
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> When dynamic column is enabled, kylin will try to push down group by case 
> when to hbase. However, in the following case, the push down should not be 
> enabled, since currently it's not well supported for CompareTupleFilter to 
> have a child of CompareTupleFilter.
> Sample SQL:
> {code}
> select colA,
>    case
>        when (colB = (1 = 1)) = (colC = (1 = 1)) then 'B&C'
>        when (colC = (1 = 1)) = (colD = (1 = 1)) then 'C&D'
>        else 'n/a'
>    end as phase,
>    count(*)
> from T
> where session_date between '2018-08-01' and '2018-08-31'
> group by colA,
>    case
>        when (colB = (1 = 1)) = (colC = (1 = 1)) then 'B&C'
>        when (colC = (1 = 1)) = (colD = (1 = 1)) then 'C&D'
>        else 'n/a'
>    end;
> {code}




[jira] [Closed] (KYLIN-3841) Build Global Dict by MR/Hive

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3841.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Build Global Dict by MR/Hive
> 
>
> Key: KYLIN-3841
> URL: https://issues.apache.org/jira/browse/KYLIN-3841
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v2.6.1
>Reporter: jinguowei
>Assignee: jinguowei
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>





[jira] [Closed] (KYLIN-3912) Support cube level mapreduce queue config for BeelineHiveClient

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3912.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Support cube level mapreduce queue config for BeelineHiveClient
> ---
>
> Key: KYLIN-3912
> URL: https://issues.apache.org/jira/browse/KYLIN-3912
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v2.6.1
>Reporter: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> To support multi-tenancy, we set different mapreduce queue configs for 
> different projects and cubes, but BeelineHiveClient doesn't use those configs. 
> So the getHiveTableRows API always runs on the same queue set in 
> kylin_hive_conf or the JDBC URL, which causes competition for computing 
> resources.
>  
> {code:java}
> 2018-11-28 15:37:27,261 ERROR [Scheduler 1950398337 Job 
> 08b3ee43-c84d-4039-84c5-a36ecb2cff18-228] execution.AbstractExecutable:383 : 
> job:08b3ee43-c84d-4039-84c5-a36ecb2cff18-01 execute finished with exception
> java.sql.SQLException: Error while processing statement: FAILED: Execution 
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> downstreamError is null.
> Query log: 
> http://zjy-hadoop-prc-ct14.bj:28911/log?qid=a05e1629-2072-46dd-9d71-b5722d04b2aa
> at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:277)
> at 
> org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:376)
> at 
> org.apache.kylin.source.hive.BeelineHiveClient.getHiveTableRows(BeelineHiveClient.java:108)
> at 
> org.apache.kylin.source.hive.HiveMRInput$RedistributeFlatHiveTableStep.computeRowCount(HiveMRInput.java:304)
> at 
> org.apache.kylin.source.hive.HiveMRInput$RedistributeFlatHiveTableStep.doWork(HiveMRInput.java:354)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:165)
> at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:67)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:165)
> at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:300)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}




[jira] [Closed] (KYLIN-3946) No cube for AVG measure after include count column

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3946.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> No cube for AVG measure after include count column
> --
>
> Key: KYLIN-3946
> URL: https://issues.apache.org/jira/browse/KYLIN-3946
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.1
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Previously, avg(col) was calculated as sum(col)/count(1).
> After count column aggregation was included in KYLIN-3883, avg(col) is 
> calculated as sum(col)/count(col). 
> If there is no predefined count(col) measure, a query with avg(col) fails 
> with the exception "NoRealizationFoundException: No realization found for 
> OLAPContext", which affects queries on old cubes. So we should consider 
> compatibility.
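
The two formulas in the description differ only when the column contains NULLs, because count(1) counts every row while count(col) skips NULL values. The small Java sketch below works through that arithmetic on an in-memory list; the class and method names are hypothetical and it does not reproduce Kylin's query engine.

```java
import java.util.List;

/** Hypothetical illustration of the two AVG formulas; not taken from Kylin. */
final class AvgCompat {
    /** sum(col) / count(1): every row counts, including rows where col is NULL. */
    static double sumOverCountStar(List<Integer> col) {
        long sum = 0;
        for (Integer v : col) {
            if (v != null) {
                sum += v;
            }
        }
        return (double) sum / col.size();
    }

    /** sum(col) / count(col): rows where col is NULL are skipped. */
    static double sumOverCountCol(List<Integer> col) {
        long sum = 0;
        long nonNull = 0;
        for (Integer v : col) {
            if (v != null) {
                sum += v;
                nonNull++;
            }
        }
        return (double) sum / nonNull;
    }
}
```

For the column values 10, NULL, 20 the first formula gives 30/3 = 10.0 and the second gives 30/2 = 15.0; on a column without NULLs both agree.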




[jira] [Closed] (KYLIN-3997) Add a health check job of Kylin

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3997.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Add a health check job of Kylin
> ---
>
> Key: KYLIN-3997
> URL: https://issues.apache.org/jira/browse/KYLIN-3997
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Kylin has a lot of internal metadata and many external dependencies. They may 
> become inconsistent because of bugs or failures. It's better to have a health 
> check job to find these inconsistencies in advance.
> The inconsistencies we found in our clusters are as follows:
>  * {color:#808080}the cubeid data does not exist for cube merging{color}
>  * {color:#808080}the HBase table does not exist or is not online for a segment{color}
>  * {color:#808080}there are holes in cube segments (the build for some days 
> failed, but the user did not notice){color}
>  * {color:#808080}too many segments (HBase tables){color}
>  * {color:#808080}metadata of stale segments is left in the cube{color}
>  * {color:#808080}some cubes have not been updated/built for a long time{color}
>  * {color:#808080}some important parameters are not set in the cube desc{color}
>  * {color:#808080}...{color}
>  Suggestions are welcome, thanks~
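
One of the checks listed above, finding holes in cube segments, reduces to looking for gaps between sorted time ranges. The Java sketch below shows that logic only; the Range type is a hypothetical stand-in for a segment's half-open time range and nothing here is taken from the health check job itself.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of a segment hole check; not taken from Kylin. */
final class SegmentHoles {
    /** Half-open time range [start, end); a stand-in for a cube segment. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the uncovered gaps between the earliest start and the latest end. */
    static List<Range> findHoles(List<Range> segments) {
        List<Range> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong(r -> r.start));
        List<Range> holes = new ArrayList<>();
        boolean seenAny = false;
        long coveredUntil = 0L;
        for (Range r : sorted) {
            if (seenAny && r.start > coveredUntil) {
                holes.add(new Range(coveredUntil, r.start));
            }
            coveredUntil = seenAny ? Math.max(coveredUntil, r.end) : r.end;
            seenAny = true;
        }
        return holes;
    }
}
```

Overlapping or out-of-order segments are tolerated because the ranges are sorted first and the covered end only moves forward.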




[jira] [Closed] (KYLIN-3942) Rea-time OLAP don't support multi-level json event

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3942.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Rea-time OLAP don't support multi-level json event
> --
>
> Key: KYLIN-3942
> URL: https://issues.apache.org/jira/browse/KYLIN-3942
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Critical
> Fix For: v3.0.0-alpha2
>
>
> Currently real-time OLAP doesn't support multi-level JSON events.
> For example, if I have a multi-level JSON event from Kafka like this:
> {quote}{"country":"JAPAN","amount":13.075058425023922,"qty":8,"currency":"USD","order_time":1554801950882,"category":"ELECTRONIC","device":"Andriod","user":{"gender":"Female","id":"7a0cfa5e-bbaa-79ef-1a38-e06f02c85fcb","first_name":"unknown","age":16}}
> {quote}
>  
> The receiver throws an exception like this and discards the event:
>  
> {quote}2019-04-09 09:46:09,878 ERROR [StreamingV2Cube_channel] 
> kafka.TimedJsonStreamParser:107 : error
> com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot 
> deserialize instance of `java.lang.String` out of START_OBJECT token
>  at [Source: 
> (String)"{"country":"US","amount":14.498498222823619,"qty":1,"currency":"USD","order_time":1554803169876,"category":"Other","device":"Other","user":{"gender":"Female","id":"0736b41a-9ae7-9b4a-a124-f74436d3eb41","first_name":"unknown","age":26}}";
>  line: 1, column: 140] (through reference chain: java.util.HashMap["user"])
>  at 
> com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:63)
>  at 
> com.fasterxml.jackson.databind.DeserializationContext.reportInputMismatch(DeserializationContext.java:1342)
>  at 
> com.fasterxml.jackson.databind.DeserializationContext.handleUnexpectedToken(DeserializationContext.java:1138)
>  at 
> com.fasterxml.jackson.databind.DeserializationContext.handleUnexpectedToken(DeserializationContext.java:1092)
>  at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:63)
>  at 
> com.fasterxml.jackson.databind.deser.std.StringDeserializer.deserialize(StringDeserializer.java:10)
>  at 
> com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:527)
>  at 
> com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:364)
>  at 
> com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:29)
>  at 
> com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4001)
>  at 
> com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3030)
>  at 
> org.apache.kylin.stream.source.kafka.TimedJsonStreamParser.parse(TimedJsonStreamParser.java:79)
>  at 
> org.apache.kylin.stream.source.kafka.TimedJsonStreamParser.parse(TimedJsonStreamParser.java:54)
>  at 
> org.apache.kylin.stream.source.kafka.consumer.KafkaConnector.nextEvent(KafkaConnector.java:110)
>  at 
> org.apache.kylin.stream.core.consumer.StreamingConsumerChannel.run(StreamingConsumerChannel.java:93)
>  at java.lang.Thread.run(Thread.java:748)
> {quote}
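
The exception above arises because a nested object ("user") cannot be read as a plain String value. One way to handle such events is to flatten nested objects into prefixed top-level keys before mapping them to columns. The Java sketch below illustrates this under assumptions: it expects the event to be already parsed into nested Maps, the class JsonFlattener and the "_" separator are hypothetical, and it is not presented as the fix applied in TimedJsonStreamParser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; not taken from Kylin. */
final class JsonFlattener {
    /** Flattens nested maps: {"user": {"age": 16}} becomes {"user_age": "16"} for separator "_". */
    static Map<String, String> flatten(Map<String, ?> event, String separator) {
        Map<String, String> out = new LinkedHashMap<>();
        flattenInto("", event, separator, out);
        return out;
    }

    private static void flattenInto(String prefix, Map<?, ?> node, String sep,
                                    Map<String, String> out) {
        for (Map.Entry<?, ?> e : node.entrySet()) {
            String key = prefix.isEmpty()
                    ? String.valueOf(e.getKey())
                    : prefix + sep + e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                flattenInto(key, (Map<?, ?>) value, sep, out); // descend one level
            } else {
                out.put(key, String.valueOf(value)); // a null value becomes the text "null"
            }
        }
    }
}
```

For the sample event, the nested fields would surface as user_gender, user_id, user_first_name and user_age next to the top-level fields.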




[jira] [Closed] (KYLIN-3958) MrHive-Dict support build by livy

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3958.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> MrHive-Dict support build by livy
> -
>
> Key: KYLIN-3958
> URL: https://issues.apache.org/jira/browse/KYLIN-3958
> Project: Kylin
>  Issue Type: Improvement
>Reporter: jinguowei
>Assignee: jinguowei
>Priority: Major
> Fix For: v3.0.0-alpha2
>
> Attachments: build_steps.png
>
>





[jira] [Closed] (KYLIN-4001) Allow user-specified time format using real-time

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4001.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Allow user-specified time format using real-time
> 
>
> Key: KYLIN-4001
> URL: https://issues.apache.org/jira/browse/KYLIN-4001
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: ning guo
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> * I found that real-time streaming only supports millisecond timestamps. It 
> does not support second timestamps or date strings like '2019-01-01 11:11:11'.
>  * I added a LongTimeParser and a DateTimeParser, plus page configuration for 
> them.
>  * You can configure tsParser and tsPattern on the page that creates the 
> streaming table.
>  * for date :
> {code:java}
> { "timestamp":"2019-04-29 11:11:11","gmv":1.1 }
> You can specify
> tsParser=org.apache.kylin.stream.source.kafka.DateTimeParser
> tsPattern=yyyy-MM-dd HH:mm:ss{code}
>  
>  * for second :
> {code:java}
> { "timestamp":"1556618887","gmv":1.1 }
> You can specify
> tsParser=org.apache.kylin.stream.source.kafka.LongTimeParser
> tsPattern=S{code}
>  
>  * for millisecond :
> {code:java}
> { "timestamp":"1556618887000","gmv":1.1 }
> You can specify
> tsParser=org.apache.kylin.stream.source.kafka.LongTimeParser
> tsPattern=MS
> {code}
>  
>   
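
The three input forms above can be normalized to epoch milliseconds along the following lines. This is only a sketch: TimestampSketch, fromLong and fromDate are hypothetical names, not the LongTimeParser or DateTimeParser classes from the patch, and the UTC time zone is an assumption because the issue does not say which zone applies.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

class TimestampSketch {
    // Hypothetical helper: interprets a numeric timestamp using the
    // "S" (seconds) or "MS" (milliseconds) pattern described in the issue.
    static long fromLong(String raw, String tsPattern) {
        long value = Long.parseLong(raw.trim());
        return "S".equals(tsPattern) ? value * 1000L : value;
    }

    // Hypothetical helper: parses a date string with a SimpleDateFormat pattern.
    // UTC is an assumption; the issue does not name a time zone.
    static long fromDate(String raw, String tsPattern) {
        SimpleDateFormat format = new SimpleDateFormat(tsPattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(raw).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse timestamp: " + raw, e);
        }
    }
}
```

With these helpers, a second-resolution value and its millisecond counterpart map to the same instant, which is the behavior the issue asks for.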




[jira] [Closed] (KYLIN-4033) Can not access Kerberized Cluster with DebugTomcat

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4033.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Can not access Kerberized Cluster with DebugTomcat
> --
>
> Key: KYLIN-4033
> URL: https://issues.apache.org/jira/browse/KYLIN-4033
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Affects Versions: all
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> When I start the Kylin Server using DebugTomcat, the cubing job will fail 
> because of "GSSException: No valid credentials provided (Mechanism level: 
> Failed to find any Kerberos tgt)]"




[jira] [Closed] (KYLIN-3935) ZKUtil acquire the wrong Zookeeper Path on windows

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3935.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> ZKUtil acquire the wrong Zookeeper Path on windows
> --
>
> Key: KYLIN-3935
> URL: https://issues.apache.org/jira/browse/KYLIN-3935
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.6.1
>Reporter: Na Zhai
>Assignee: Na Zhai
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> In my Windows environment, the Kylin service can't start. In class ZKUtil, I 
> found that Kylin uses File(path).getCanonicalPath() to normalize the Windows 
> path, but this returns a path like 'C:\kylin\kylin_metadata'. For ZooKeeper, 
> the path must start with the / character.
> {color:#FF}return new File(path).toURI().getPath(){color} might be 
> better.
> Below is the stack trace:
> : java.lang.RuntimeException: 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: Fail to check or create znode for chRoot 
> F:\kylin\kylin_metadata_idea due to 
>  at org.apache.kylin.common.util.ZKUtil.getZookeeperClient(ZKUtil.java:137)
>  at org.apache.kylin.common.util.ZKUtil.getZookeeperClient(ZKUtil.java:115)
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock$Factory.<init>(ZookeeperDistributedLock.java:57)
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock$Factory.<init>(ZookeeperDistributedLock.java:53)
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.<init>(ZookeeperJobLock.java:32)
>  at 
> org.apache.kylin.rest.service.JobService.afterPropertiesSet(JobService.java:132)
>  at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1687)
>  at 
> org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1624)
>  ... 61 more
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: Fail to check or create znode for chRoot 
> F:\kylin\kylin_metadata_idea due to 
>  at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2263)
>  at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
>  at 
> com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4789)
>  at org.apache.kylin.common.util.ZKUtil.getZookeeperClient(ZKUtil.java:123)
>  ... 68 more
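
The suggestion in the description can be tried in isolation with a small sketch like the one below. ZnodePathSketch and toSlashPath are hypothetical names used only for illustration. Note that File.toURI() resolves relative paths against the working directory and appends a trailing slash when the path is an existing directory, so a real fix may need extra handling.

```java
import java.io.File;

class ZnodePathSketch {
    // Converts a local file path into its URI path form, which starts with '/'.
    // On Windows, "C:\kylin\kylin_metadata" becomes "/C:/kylin/kylin_metadata".
    static String toSlashPath(String path) {
        return new File(path).toURI().getPath();
    }
}
```

Unlike getCanonicalPath(), the result uses forward slashes on every platform, which matches the leading-slash requirement mentioned above.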




[jira] [Closed] (KYLIN-3918) Add project name in cube and job pages

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3918.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Add project name in cube and job pages
> --
>
> Key: KYLIN-3918
> URL: https://issues.apache.org/jira/browse/KYLIN-3918
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> In a production cluster, there will be many projects, and each project has 
> many cubes. It's useful to show the project name in the cube and job pages, 
> so that the admin can quickly see which project an abnormal cube or failed 
> job belongs to and contact its users.
>  




[jira] [Closed] (KYLIN-4005) Saving Cube of a aggregation Groups(40 Dimensions, Max Dimension Combination:5) may cause kylin server OOM

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4005.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Saving Cube of a aggregation Groups(40 Dimensions, Max Dimension 
> Combination:5) may cause kylin server OOM
> --
>
> Key: KYLIN-4005
> URL: https://issues.apache.org/jira/browse/KYLIN-4005
> Project: Kylin
>  Issue Type: Bug
>  Components: REST Service
>Affects Versions: v2.5.2
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Critical
> Fix For: v3.0.0-alpha2
>
>
> A user tried to save a cube with an aggregation group (40 dimensions, Max 
> Dimension Combination: 5), which caused the Kylin server to run out of memory. 
> The reason is that the DefaultCuboidScheduler uses a lot of memory when 
> calculating all cuboid ids. The stack is as follows:
> {code}
> http-bio-7070-exec-35
>   at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
>   at java.util.HashMap.resize()[Ljava/util/HashMap$Node; (HashMap.java:704)
>   at 
> java.util.HashMap.putVal(ILjava/lang/Object;Ljava/lang/Object;ZZ)Ljava/lang/Object;
>  (HashMap.java:663)
>   at 
> java.util.HashMap.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; 
> (HashMap.java:612)
>   at java.util.HashSet.add(Ljava/lang/Object;)Z (HashSet.java:220)
>   at java.util.AbstractCollection.addAll(Ljava/util/Collection;)Z 
> (AbstractCollection.java:344)
>   at 
> org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.getOnTreeParentsByLayer(Ljava/util/Collection;)Ljava/util/Set;
>  (DefaultCuboidScheduler.java:240)
>   at 
> org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.buildTreeBottomUp()Lorg/apache/kylin/common/util/Pair;
>  (DefaultCuboidScheduler.java:183)
>   at 
> org.apache.kylin.cube.cuboid.DefaultCuboidScheduler.<init>(Lorg/apache/kylin/cube/model/CubeDesc;)V
>  (DefaultCuboidScheduler.java:58)
>   at 
> sun.reflect.GeneratedConstructorAccessor140.newInstance([Ljava/lang/Object;)Ljava/lang/Object;
>  (Unknown Source)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance([Ljava/lang/Object;)Ljava/lang/Object;
>  (DelegatingConstructorAccessorImpl.java:45)
>   at 
> java.lang.reflect.Constructor.newInstance([Ljava/lang/Object;)Ljava/lang/Object;
>  (Constructor.java:423)
>   at 
> org.apache.kylin.cube.cuboid.CuboidScheduler.getInstance(Lorg/apache/kylin/cube/model/CubeDesc;)Lorg/apache/kylin/cube/cuboid/CuboidScheduler;
>  (CuboidScheduler.java:41)
>   at 
> org.apache.kylin.cube.model.CubeDesc.getInitialCuboidScheduler()Lorg/apache/kylin/cube/cuboid/CuboidScheduler;
>  (CubeDesc.java:750)
>   at 
> org.apache.kylin.cube.cuboid.CuboidCLI.simulateCuboidGeneration(Lorg/apache/kylin/cube/model/CubeDesc;Z)I
>  (CuboidCLI.java:47)
>   at 
> org.apache.kylin.rest.service.CubeService.updateCubeAndDesc(Lorg/apache/kylin/cube/CubeInstance;Lorg/apache/kylin/cube/model/CubeDesc;Ljava/lang/String;Z)Lorg/apache/kylin/cube/model/CubeDesc;
>  (CubeService.java:287)
>   at 
> org.apache.kylin.rest.service.CubeService$$FastClassBySpringCGLIB$$17a07c0e.invoke(ILjava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (Unknown Source)
>   at 
> org.springframework.cglib.proxy.MethodProxy.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (MethodProxy.java:204)
>   at 
> org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(Ljava/lang/Object;Ljava/lang/reflect/Method;[Ljava/lang/Object;Lorg/springframework/cglib/proxy/MethodProxy;)Ljava/lang/Object;
>  (CglibAopProxy.java:669)
>   at 
> org.apache.kylin.rest.service.CubeService$$EnhancerBySpringCGLIB$$34de75c4.updateCubeAndDesc(Lorg/apache/kylin/cube/CubeInstance;Lorg/apache/kylin/cube/model/CubeDesc;Ljava/lang/String;Z)Lorg/apache/kylin/cube/model/CubeDesc;
>  (Unknown Source)
>   at 
> org.apache.kylin.rest.controller.CubeController.updateCubeDesc(Lorg/apache/kylin/rest/request/CubeRequest;)Lorg/apache/kylin/rest/request/CubeReq
> {code}
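
To get a feel for the scale involved, the sketch below counts how many dimension combinations a cap of 5 allows among 40 dimensions. It is a rough estimate under stated assumptions: it ignores mandatory, hierarchy and joint dimensions, and it does not model what DefaultCuboidScheduler actually enumerates while building the tree, which may be considerably more. CuboidCountSketch and its methods are hypothetical names.

```java
class CuboidCountSketch {
    // Number of ways to choose k items out of n (exact for the small inputs used here).
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 0; i < k; i++) {
            // After this step, result equals C(n, i + 1); the division is always exact.
            result = result * (n - i) / (i + 1);
        }
        return result;
    }

    // Number of combinations that contain between 1 and maxDims of the dimensions.
    static long combinationsUpTo(int dims, int maxDims) {
        long total = 0;
        for (int k = 1; k <= maxDims; k++) {
            total += choose(dims, k);
        }
        return total;
    }
}
```

For 40 dimensions and a cap of 5, combinationsUpTo(40, 5) gives 760,098 combinations, each of which would need an entry in any set that holds them all.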




[jira] [Closed] (KYLIN-4017) Build engine get zk(zookeeper) lock failed when building job, it causes the whole build engine doesn't work.

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4017.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Build engine get zk(zookeeper) lock failed when building job, it causes the 
> whole build engine doesn't work.
> 
>
> Key: KYLIN-4017
> URL: https://issues.apache.org/jira/browse/KYLIN-4017
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine, Tools, Build and Test
>Affects Versions: v3.0.0, v3.0.0-alpha
>Reporter: wangxiaojing
>Priority: Critical
>  Labels: build
> Fix For: v3.0.0-alpha2
>
> Attachments: zkinstancestart.png
>
>
> Kylin hits a ZooKeeper lock acquisition exception when it is building a job. 
> Only a restart resolves it; otherwise no job can be built and the whole build 
> engine stops working. The problem recurs about one day after the restart. The 
> log looks like this:
> {code:java}
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] 
> threadpool.FetcherRunner:59 : 
> CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - 
> es_report_respnse_rate_cube - 2019051300_2019051400 - GMT+08:00 
> 2019-05-15 11:03:15, state=READY} prepare to schedule and its priority is 20
> 2019-05-15 11:09:43,209 INFO [FetcherRunner 1910115020-57] 
> threadpool.FetcherRunner:63 : 
> CubingJob{id=878974c4-4c65-88a4-a912-b238fcc33bdc, name=BUILD CUBE - 
> es_report_respnse_rate_cube - 2019051300_2019051400 - GMT+08:00 
> 2019-05-15 11:03:15, state=READY} scheduled
> 2019-05-15 11:09:43,209 DEBUG [Scheduler 719764581 Job 
> 878974c4-4c65-88a4-a912-b238fcc33bdc-132] 
> zookeeper.ZookeeperDistributedLock:92 : 
> 18...@bigdata-kylin-build01.gz01.diditaxi.com trying to lock 
> /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
> 2019-05-15 11:09:43,212 ERROR [pool-12-thread-10] 
> threadpool.DistributedScheduler:115 : unknown error execute 
> job:878974c4-4c65-88a4-a912-b238fcc33bdc in server: 
> 18...@bigdata-kylin-build01.gz01.diditaxi.com
> java.lang.IllegalStateException: Error while 
> 18...@bigdata-kylin-build01.gz01.diditaxi.com trying to lock 
> /job_engine/lock/878974c4-4c65-88a4-a912-b238fcc33bdc
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:99)
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperJobLock.lock(ZookeeperJobLock.java:41)
>  at 
> org.apache.kylin.job.impl.threadpool.DistributedScheduler$JobRunner.run(DistributedScheduler.java:105)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: instance must be started before 
> calling this method
>  at 
> org.apache.curator.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:176)
>  at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.create(CuratorFrameworkImpl.java:351)
>  at 
> org.apache.kylin.job.lock.zookeeper.ZookeeperDistributedLock.lock(ZookeeperDistributedLock.java:95)
>  ... 5 more{code}
>  




[jira] [Closed] (KYLIN-4026) Avoid too many file append operations in HiveProducer of hive metrics reporter

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4026.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Avoid too many file append operations in HiveProducer of hive metrics reporter
> --
>
> Key: KYLIN-4026
> URL: https://issues.apache.org/jira/browse/KYLIN-4026
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Currently, each write in HiveProducer performs an HDFS append operation, 
> which is heavy for HDFS. 
> An improvement is to keep an FSDataOutputStream open in HiveProducer and 
> write data to it continuously.
>  
>  
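
The idea of keeping one stream open across writes can be sketched with plain java.io types. This is not the HiveProducer code: ReusedStreamWriter is a hypothetical class, and the Supplier stands in for whatever call opens the HDFS file, so that the sketch runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

class ReusedStreamWriter implements AutoCloseable {
    private final Supplier<OutputStream> opener; // hypothetical: opens the target file
    private OutputStream out;                    // kept open between writes
    int opens = 0;                               // how many times the target was opened

    ReusedStreamWriter(Supplier<OutputStream> opener) {
        this.opener = opener;
    }

    void write(String record) {
        try {
            if (out == null) {   // open lazily, on the first write only
                out = opener.get();
                opens++;
            }
            out.write((record + "\n").getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            if (out != null) {
                out.close();
                out = null;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

However many records are written, the target is opened once. A real implementation would also need to decide when to flush and when to roll over to a new file, which this sketch leaves out.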




[jira] [Closed] (KYLIN-4027) Kylin-jdbc module has tcp resource leak

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4027.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Kylin-jdbc module has tcp resource leak
> ---
>
> Key: KYLIN-4027
> URL: https://issues.apache.org/jira/browse/KYLIN-4027
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - JDBC
>Affects Versions: all
>Reporter: Hongsen Liu
>Priority: Major
>  Labels: easyfix
> Fix For: v3.0.0-alpha2
>
>
> In the Kylin-jdbc module, the class KylinClient leaks TCP resources when it 
> sends an HTTP request. For example, see the following code snippet:
>  
> {quote}HttpResponse response = httpClient.execute(post);
> try {
>     if (response.getStatusLine().getStatusCode() != 200
>             && response.getStatusLine().getStatusCode() != 201) {
>         throw asIOException(post, response);
>     }
>     SQLResponseStub stub = jsonMapper.readValue(
>             response.getEntity().getContent(), SQLResponseStub.class);
>     return stub;
> } finally {
>     post.releaseConnection();
> }
> {quote}
> The call to httpClient.execute(post) is outside the try block, so if it 
> throws an exception internally, the finally block won't run.
>  
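
One way to close this gap is to move the call inside the try block so that the release always runs. The sketch below shows only that control flow. Request and Client are hypothetical stand-ins for the HTTP client types, so that the example runs without the HTTP library; it is not the patch that was merged.

```java
class ReleaseSketch {
    // Hypothetical stand-in for the HTTP request object.
    static class Request {
        boolean released = false;

        void releaseConnection() {
            released = true;
        }
    }

    // Hypothetical stand-in for the HTTP client.
    interface Client {
        String execute(Request request);
    }

    static String send(Client client, Request post) {
        try {
            // The call that may throw now sits inside the try block...
            return client.execute(post);
        } finally {
            // ...so the connection is released on success and on failure.
            post.releaseConnection();
        }
    }
}
```

With this shape, an exception thrown by execute still propagates to the caller, but the connection is released first.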




[jira] [Closed] (KYLIN-3812) optimize the child CompareTupleFilter in a CompareTupleFilter

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3812.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> optimize the child CompareTupleFilter in a CompareTupleFilter
> -
>
> Key: KYLIN-3812
> URL: https://issues.apache.org/jira/browse/KYLIN-3812
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Currently, a CompareTupleFilter with a child CompareTupleFilter is not well 
> supported. However, in some cases, it's better to support it.
> {code}
> where (colA = (1=1))
> {code}
> The *(1=1)* can be transformed to "true", and then this filter can be pushed 
> down to HBase. Otherwise, the filter *(colA = (1=1))* does not work in HBase, 
> and it may return incorrect results for the following SQL:
> {code}
> select colA,
>    case
>    when colB = (1 = 1) then 'B'
>    when colC = (1 = 1) then 'C'
>    when colD = (1 = 1) then 'D'
>    else 'n/a'
>    end as phase,
>    count(*)
> from T
> where session_date between '2018-08-01' and '2018-08-31'
> group by colA,
>    case
>    when colB = (1 = 1) then 'B'
>    when colC = (1 = 1) then 'C'
>    when colD = (1 = 1) then 'D'
>    else 'n/a'
>    end;
> {code}
> In the final result, all of the keys will become 'B'.
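
The transformation of (1=1) into "true" is a form of constant folding. A minimal sketch follows, using a tiny hypothetical expression tree (Expr, Const, Column, Eq) rather than Kylin's TupleFilter classes, whose API is not shown in this issue.

```java
class FoldSketch {
    interface Expr {}

    static final class Const implements Expr {
        final Object value;
        Const(Object value) { this.value = value; }
    }

    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
    }

    static final class Eq implements Expr {
        final Expr left;
        final Expr right;
        Eq(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Replaces every comparison between two constants with a boolean constant,
    // working from the innermost comparison outwards.
    static Expr fold(Expr expr) {
        if (!(expr instanceof Eq)) {
            return expr;
        }
        Eq eq = (Eq) expr;
        Expr left = fold(eq.left);
        Expr right = fold(eq.right);
        if (left instanceof Const && right instanceof Const) {
            return new Const(((Const) left).value.equals(((Const) right).value));
        }
        return new Eq(left, right);
    }
}
```

Folding colA = (1 = 1) this way yields colA = true, a plain column-to-constant comparison of the kind that can be pushed down.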




[jira] [Closed] (KYLIN-4028) Speed up startup progress using cached dependency

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4028.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Speed up startup progress using cached dependency
> -
>
> Key: KYLIN-4028
> URL: https://issues.apache.org/jira/browse/KYLIN-4028
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Affects Versions: all
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
> Fix For: v3.0.0-alpha2
>
>
> The hive/hadoop/hbase dependencies are not volatile, and finding them every 
> time I start the Kylin server slows down startup.
> So, if dependencies generated by a previous run exist, we can use them to 
> start the server without finding the dependencies again.




[jira] [Closed] (KYLIN-3925) Add reduce step for FilterRecommendCuboidDataJob & UpdateOldCuboidShardJob to avoid generating small hdfs files

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3925.
---

Resolved in release v3.0.0-alpha2(2019-07-31)

> Add reduce step for FilterRecommendCuboidDataJob & UpdateOldCuboidShardJob to 
> avoid generating small hdfs files
> ---
>
> Key: KYLIN-3925
> URL: https://issues.apache.org/jira/browse/KYLIN-3925
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> Previously, when doing cube optimization, there were two map-only MR jobs: 
> *FilterRecommendCuboidDataJob* & *UpdateOldCuboidShardJob*. The benefit of a 
> map-only job is that it avoids shuffling. However, this benefit brings a more 
> severe issue: too many small HDFS files.
> Suppose there are 10 HDFS files for the current cuboid data, each 500M in 
> size. If the block size is 100M, there will be 10*(500/100) mappers for the 
> map-only job *FilterRecommendCuboidDataJob*. Each mapper generates one HDFS 
> file, so there will be 50 HDFS files in the end. Since the job 
> *FilterRecommendCuboidDataJob* filters the cuboid data down to what will be 
> used in future, each file will be smaller than 100M. In some cases, it will 
> be even smaller than 50M.
> To avoid this kind of small HDFS file issue, it's better to add a reduce step 
> to control the number of final output HDFS files.
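
The arithmetic in the example above can be written out as a small sketch. It assumes one mapper per block-sized split of each input file, which is the simplification the description itself uses; SmallFilesSketch and mapperCount are hypothetical names.

```java
class SmallFilesSketch {
    // Mappers (and therefore output files) for a map-only job,
    // assuming one mapper per block-sized split of each input file.
    static long mapperCount(int files, long fileSizeMb, long blockSizeMb) {
        long splitsPerFile = (fileSizeMb + blockSizeMb - 1) / blockSizeMb; // ceiling division
        return files * splitsPerFile;
    }
}
```

With 10 files of 500M and a 100M block size, mapperCount(10, 500, 100) gives the 50 output files mentioned above. With a reduce step, the number of output files would follow the reducer count instead.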




[jira] [Closed] (KYLIN-4133) support override configuration in kafka job

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4133.
---

Resolved in release v3.0.0-beta(2019-10-25)

> support override configuration in kafka job
> ---
>
> Key: KYLIN-4133
> URL: https://issues.apache.org/jira/browse/KYLIN-4133
> Project: Kylin
>  Issue Type: Improvement
>  Components: NRT Streaming
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> We can't override the 'kafka.split.rows' config in a Kafka job.




[jira] [Closed] (KYLIN-4128) Remove never called methods

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4128.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Remove never called methods
> ---
>
> Key: KYLIN-4128
> URL: https://issues.apache.org/jira/browse/KYLIN-4128
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v3.0.0-alpha2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> I found some never-called methods with the *FindBugs* plugin. We should 
> remove them to make the code cleaner.




[jira] [Closed] (KYLIN-4137) Accelerate metadata reloading

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4137.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Accelerate metadata reloading
> -
>
> Key: KYLIN-4137
> URL: https://issues.apache.org/jira/browse/KYLIN-4137
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v2.6.2
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> Currently, org.apache.kylin.metadata.cachesync.CachedCrudAssist#reloadAll 
> uses an inappropriate method to deal with MySQLJdbcMetadata.
> Every call of reloadAt(path) accesses the database, which is almost fine for 
> HBase but really unfriendly to MySQL (RDBMS).
> I think we should get all of the resources with a single request instead of 
> issuing a separate request for every resource.
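
The difference between per-resource and batched loading can be illustrated with an in-memory stand-in that counts round trips. BatchLoadSketch and CountingStore are hypothetical names, and getAll is an assumed batch operation, not a method taken from Kylin's ResourceStore.

```java
import java.util.HashMap;
import java.util.Map;

class BatchLoadSketch {
    // Hypothetical in-memory stand-in for a metadata store that counts round trips.
    static final class CountingStore {
        private final Map<String, String> rows = new HashMap<>();
        int roundTrips = 0;

        void put(String path, String content) {
            rows.put(path, content);
        }

        // One request per resource.
        String get(String path) {
            roundTrips++;
            return rows.get(path);
        }

        // One request for every resource under a path prefix.
        Map<String, String> getAll(String prefix) {
            roundTrips++;
            Map<String, String> result = new HashMap<>();
            for (Map.Entry<String, String> entry : rows.entrySet()) {
                if (entry.getKey().startsWith(prefix)) {
                    result.put(entry.getKey(), entry.getValue());
                }
            }
            return result;
        }
    }
}
```

Loading N resources with get costs N round trips, while getAll costs one, which is the saving the issue is after for an RDBMS-backed store.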




[jira] [Closed] (KYLIN-4108) Show slow query hit cube in slow query page

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4108.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Show slow query hit cube in slow query page
> ---
>
> Key: KYLIN-4108
> URL: https://issues.apache.org/jira/browse/KYLIN-4108
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metrics
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-beta
>
>





[jira] [Closed] (KYLIN-4150) Improve docker for kylin instructions

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4150.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Improve docker for kylin instructions
> -
>
> Key: KYLIN-4150
> URL: https://issues.apache.org/jira/browse/KYLIN-4150
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>





[jira] [Closed] (KYLIN-4089) Integration test failed with JDBCMetastore

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4089.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Integration test failed with JDBCMetastore
> --
>
> Key: KYLIN-4089
> URL: https://issues.apache.org/jira/browse/KYLIN-4089
> Project: Kylin
>  Issue Type: Bug
>  Components: Integration, Tools, Build and Test
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.822 s <<< FAILURE! - in org.apache.kylin.storage.hbase.ITAclTableMigrationToolTest
> [ERROR] testBasic(org.apache.kylin.storage.hbase.ITAclTableMigrationToolTest)  Time elapsed: 0.812 s  <<< ERROR!
> java.lang.NullPointerException
>   at org.apache.kylin.storage.hbase.ITAclTableMigrationToolTest.testBasic(ITAclTableMigrationToolTest.java:95)




[jira] [Closed] (KYLIN-4067) Speed up response of kylin cube page

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4067.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Speed up response of kylin cube page
> 
>
> Key: KYLIN-4067
> URL: https://issues.apache.org/jira/browse/KYLIN-4067
> Project: Kylin
>  Issue Type: Improvement
>  Components: Web 
>Affects Versions: v3.0.0-alpha, v2.6.3
>Reporter: zhao jintao
>Assignee: zhao jintao
>Priority: Minor
> Fix For: v3.0.0-beta
>
> Attachments: duplicated_cube_name_request.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi Team:
> My Kylin project has more than 100 cubes, and the Kylin web page opens very 
> slowly.
> I found that Kylin loads all information for all cubes when the cube page is 
> opened, and it does so once for every cube in the project. The request URL is 
> "http://ip:port/kylin/api/cubes?limit=65535&offset=0". For example, if one 
> project has 10 cubes, this request will be called 10 times. But this 
> information is only used to determine whether the name is duplicated when 
> adding a new cube. 
> This page loading mechanism can be optimized. Getting all the information of 
> all cubes only needs to happen when adding a new cube.
>  




[jira] [Closed] (KYLIN-4095) Add RESOURCE_PATH_PREFIX option in ResourceTool

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4095.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Add RESOURCE_PATH_PREFIX option in ResourceTool
> ---
>
> Key: KYLIN-4095
> URL: https://issues.apache.org/jira/browse/KYLIN-4095
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: image-2019-09-20-10-16-37-603.png, 
> image-2019-09-20-10-17-31-459.png
>
>
> ResourceTool is very useful for fixing metadata with overlapping segments.
> But downloading and uploading the entire metadata is too heavy.
> It's better to have a RESOURCE_PATH_PREFIX option for the download and 
> upload commands.
>  




[jira] [Closed] (KYLIN-4085) Segment parallel building may cause segment not found

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4085.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Segment parallel building may cause segment not found
> -
>
> Key: KYLIN-4085
> URL: https://issues.apache.org/jira/browse/KYLIN-4085
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> When multiple nodes build the same cube in parallel, the JDBCResourceStore 
> splits the metadata update step into two SQL statements, which can't 
> guarantee atomicity.
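
Why a check-then-write split is fragile can be shown with an in-memory analogy. The sketch below performs the check and the write as one atomic step, using ConcurrentHashMap.replace as a stand-in for a single conditional SQL statement or a transaction. AtomicUpdateSketch and the timestamp scheme are assumptions for illustration, not the JDBCResourceStore implementation or its fix.

```java
import java.util.concurrent.ConcurrentHashMap;

class AtomicUpdateSketch {
    // Maps a resource path to its last-modified timestamp.
    private final ConcurrentHashMap<String, Long> timestamps = new ConcurrentHashMap<>();

    void create(String path, long ts) {
        timestamps.put(path, ts);
    }

    // Single atomic step: succeeds only if the stored timestamp still equals expectedTs.
    // A writer holding a stale timestamp gets false instead of overwriting newer data.
    boolean checkAndUpdate(String path, long expectedTs, long newTs) {
        return timestamps.replace(path, expectedTs, newTs);
    }
}
```

If two nodes read timestamp 1 and both try to write, only the first checkAndUpdate succeeds. The second sees the changed timestamp and has to reload before retrying, rather than silently dropping the other node's segment.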




[jira] [Closed] (KYLIN-4092) Support setting seperate jvm params for kylin backgroud tools

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4092.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Support setting seperate jvm params for kylin backgroud tools
> -
>
> Key: KYLIN-4092
> URL: https://issues.apache.org/jira/browse/KYLIN-4092
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> Usually, the memory set in setenv.sh for the query server is larger than 8G, 
> which is not suitable for Kylin background tools (meta cleanup, storage 
> cleanup, health check). 
> So it's better to have a separate env for Kylin tools.




[jira] [Closed] (KYLIN-4149) Allow user to edit streaming v2 table's kafka cluster address and topic name

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4149.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Allow user to edit streaming v2 table's  kafka cluster address and topic name
> -
>
> Key: KYLIN-4149
> URL: https://issues.apache.org/jira/browse/KYLIN-4149
> Project: Kylin
>  Issue Type: Improvement
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: image-2019-09-16-19-44-24-499.png, 
> image-2019-09-16-19-49-20-648.png, image-2019-09-16-19-51-12-214.png, 
> image-2019-09-16-19-51-36-338.png, image-2019-09-16-19-57-45-117.png, 
> image-2019-09-16-20-02-07-687.png, image-2019-09-16-20-04-51-179.png, 
> image-2019-09-16-20-07-11-847.png, image-2019-09-16-20-09-08-218.png
>
>





[jira] [Closed] (KYLIN-4117) Intersect_count() return wrong result when column type is time

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4117.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Intersect_count() return wrong result when column type is time
> --
>
> Key: KYLIN-4117
> URL: https://issues.apache.org/jira/browse/KYLIN-4117
> Project: Kylin
>  Issue Type: Bug
>Reporter: Yaqian Zhang
>Assignee: Xiaoxiang Yu
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> *Intersect_count()* returns a wrong result when the column type is *time*.




[jira] [Closed] (KYLIN-4122) Add kylin user and group manage modules

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4122.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Add kylin user and group manage modules
> ---
>
> Key: KYLIN-4122
> URL: https://issues.apache.org/jira/browse/KYLIN-4122
> Project: Kylin
>  Issue Type: New Feature
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>





[jira] [Closed] (KYLIN-4127) Remove never called classes

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4127.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Remove never called classes
> ---
>
> Key: KYLIN-4127
> URL: https://issues.apache.org/jira/browse/KYLIN-4127
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v3.0.0-alpha2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> I found some never-called classes with the FindBugs plugin. We should remove 
> them to make the code cleaner.




[jira] [Closed] (KYLIN-4114) Provided a self-contained docker image for Kylin

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4114.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Provided a self-contained docker image for Kylin
> 
>
> Key: KYLIN-4114
> URL: https://issues.apache.org/jira/browse/KYLIN-4114
> Project: Kylin
>  Issue Type: New Feature
>  Components: Integration
>Reporter: Xiaoxiang Yu
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> Providing a self-contained Docker image for Kylin will benefit packaging, 
> integration testing, and demos.  
> https://issues.apache.org/jira/browse/KYLIN-4040



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4173) cube list search can not work

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4173.
---

Resolved in release v3.0.0-beta(2019-10-25)

> cube list search can not work
> -
>
> Key: KYLIN-4173
> URL: https://issues.apache.org/jira/browse/KYLIN-4173
> Project: Kylin
>  Issue Type: Bug
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> In the front-end, service/cube.js added a new path parameter "cubeName" in 
> commit d449335d68f01270aa0b6a8093ca12daff4b74bd, like 
> 'cubes/:cubeId/:propName/:propValue/:action/:cubeName'.
> This caused the list-cube API call "cubes?cubeName=xx" to change to 
> "cubes/xx" and match the wrong API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-2820) Query can't read window function's result from subquery

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-2820.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Query can't read window function's result from subquery
> ---
>
> Key: KYLIN-2820
> URL: https://issues.apache.org/jira/browse/KYLIN-2820
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.1.0
>Reporter: Mu Kong
>Assignee: nichunen
>Priority: Major
>  Labels: scope
> Fix For: v3.0.0-beta
>
> Attachments: image-2019-09-19-16-23-39-972.png
>
>
> I executed a query like the following:
> {code:sql}
> select first_page_name, count(*) as page_name_count from
> (
> select first_value(page_name) over(partition by session_id) as 
> first_page_name from some_db.some_table
> ) group by first_page_name;
> {code}
> This query resulted in one single record with an empty string as the 
> first_page_name, and a big number as the page_name_count.
> However, when I ran the subquery without the outer query, it resulted in 
> multiple records with different first_page_name values, which showed that 
> the result of the query above was wrong.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4107) StorageCleanupJob fails to delete Hive tables with "Argument list too long" error

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4107.
---

Resolved in release v3.0.0-beta(2019-10-25)

> StorageCleanupJob fails to delete Hive tables with "Argument list too long" 
> error
> -
>
> Key: KYLIN-4107
> URL: https://issues.apache.org/jira/browse/KYLIN-4107
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: v2.6.2
> Environment: CentOS 7.6, HDP 2.6.5, Kylin 2.6.3
>Reporter: Vsevolod Ostapenko
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> On a system where multiple Kylin developers experiment with cube design and 
> often (re)build/drop cube segments, intermediate Hive tables and leftover 
> HBase tables accumulate very quickly.
> After a certain point, storage cleanup can no longer be executed using the 
> suggested method:
> {{${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete 
> true}}
> Apparently, the storage cleanup job creates a single shell command to drop 
> all Hive tables, which fails to execute because the command line is too long. 
> For example:
> {quote}
> 2019-07-23 17:47:31,611 ERROR [main] job.StorageCleanupJob:377 : Error during 
> deleting Hive tables
> java.io.IOException: Cannot run program "/bin/bash": error=7, Argument list 
> too long
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.runNativeCommand(CliCommandExecutor.java:133)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:89)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:83)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.deleteHiveTables(StorageCleanupJob.java:409)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTableInternal(StorageCleanupJob.java:375)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:278)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanup(StorageCleanupJob.java:151)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.execute(StorageCleanupJob.java:145)
>  at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>  at org.apache.kylin.tool.StorageCleanupJob.main(StorageCleanupJob.java:27)
> Caused by: java.io.IOException: error=7, Argument list too long
>  at java.lang.UNIXProcess.forkAndExec(Native Method)
>  at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>  ... 10 more 
> {quote}
> Instead of composing one long command, storage cleanup needs to generate a 
> script and feed that into beeline or the Hive CLI.
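The script-file approach suggested in the description can be sketched as follows. This is a minimal illustration, not the patch that resolved the issue: the class and method names (HiveDropScriptSketch, writeDropScript, buildCommand) are hypothetical, and the only assumption about the Hive CLI is that it accepts a script file through `-f`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: names are illustrative and not part of Kylin's API.
final class HiveDropScriptSketch {

    // Writes a USE statement plus one DROP TABLE statement per table
    // into a temporary script file and returns its path.
    static Path writeDropScript(String database, List<String> tables) {
        List<String> lines = new ArrayList<>();
        lines.add("USE " + database + ";");
        for (String table : tables) {
            lines.add("DROP TABLE IF EXISTS " + table + ";");
        }
        try {
            Path script = Files.createTempFile("kylin-cleanup-", ".hql");
            Files.write(script, lines, StandardCharsets.UTF_8);
            return script;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The argument list has a fixed size no matter how many tables are dropped.
    static List<String> buildCommand(Path script) {
        return Arrays.asList("hive", "-f", script.toString());
    }
}
```

Because the table names live in the file rather than on the command line, the length of the argument list passed to the shell no longer grows with the number of leftover tables.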



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4154) Metadata inconsistency between multi Kylin server caused by Broadcaster closing

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4154.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Metadata inconsistency between multi Kylin server caused by Broadcaster 
> closing 
> 
>
> Key: KYLIN-4154
> URL: https://issues.apache.org/jira/browse/KYLIN-4154
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> To avoid a Broadcaster memory leak, KYLIN-4131 closes the metadata sync 
> thread after receiving a "Sync All" event, but there may be some events in 
> the waiting queue that haven't been synced yet. So we should sync all events 
> before closing the sync thread.
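The idea of syncing all queued events before closing can be sketched as below. This is a simplified, single-threaded model and not Kylin's Broadcaster code: the class name, the SYNC_ALL marker string, and the use of a non-blocking poll() are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: a sync loop that drains its queue before it stops.
final class DrainBeforeCloseSketch {
    static final String SYNC_ALL = "SYNC_ALL"; // hypothetical closing marker

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    void announce(String event) {
        queue.add(event);
    }

    // Handles events in order. When the closing marker arrives, everything
    // still waiting in the queue is handled before the loop ends.
    List<String> runSyncLoop() {
        List<String> synced = new ArrayList<>();
        String event;
        while ((event = queue.poll()) != null) {
            if (SYNC_ALL.equals(event)) {
                List<String> remaining = new ArrayList<>();
                queue.drainTo(remaining);
                for (String r : remaining) {
                    if (!SYNC_ALL.equals(r)) {
                        synced.add(r);
                    }
                }
                break;
            }
            synced.add(event);
        }
        return synced;
    }
}
```

With the events a, SYNC_ALL, b, c queued in that order, the loop still handles b and c instead of dropping them when it shuts down.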



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4143) truncate spark executable job output

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4143.
---

Resolved in release v3.0.0-beta(2019-10-25)

> truncate spark executable job output 
> -
>
> Key: KYLIN-4143
> URL: https://issues.apache.org/jira/browse/KYLIN-4143
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.5.2
>Reporter: ZhouKang
>Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: KYLIN-4143.master.001.patch
>
>
>  
> Truncate the Spark job's output when the job's exit code is not 0, because 
> in that case the execution output content is too large.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-3519) Upgrade Jacoco version to 0.8.2

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3519.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Upgrade Jacoco version to 0.8.2
> ---
>
> Key: KYLIN-3519
> URL: https://issues.apache.org/jira/browse/KYLIN-3519
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: weibin0516
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> Jacoco 0.8.2 adds Java 11 support:
>https://github.com/jacoco/jacoco/releases/tag/v0.8.2
> Java 11 RC1 is out.
> We should consider upgrading Jacoco.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-3901) Use multi threads to speed up the storage cleanup job

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3901.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Use multi threads to speed up the storage cleanup job
> -
>
> Key: KYLIN-3901
> URL: https://issues.apache.org/jira/browse/KYLIN-3901
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> Currently, the storage cleanup job uses only one thread to clean up HBase 
> tables, Hive tables, and HDFS dirs.
> It's better to use multiple threads to speed it up.
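A multi-threaded cleanup along these lines could look like the sketch below. It is not the change that was committed for this issue: the class name, the cleanup method, and the idea of passing the delete operation as a Consumer are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch: deletes resources on a fixed-size thread pool.
final class ParallelCleanupSketch {

    // Runs deleteAction for every resource and returns how many deletions
    // completed without throwing.
    static int cleanup(List<String> resources, Consumer<String> deleteAction, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String resource : resources) {
                futures.add(pool.submit(() -> deleteAction.accept(resource)));
            }
            int succeeded = 0;
            for (Future<?> future : futures) {
                try {
                    future.get();
                    succeeded++;
                } catch (ExecutionException e) {
                    // One failed deletion should not abort the remaining ones.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }
}
```

The delete action must be safe to call from several threads at once; HBase tables, Hive tables, and HDFS directories could each be passed through the same helper with a different action.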



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4101) set hive and spark job name when building cube

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4101.
---

Resolved in release v3.0.0-beta(2019-10-25)

> set hive and spark job name when building cube
> --
>
> Key: KYLIN-4101
> URL: https://issues.apache.org/jira/browse/KYLIN-4101
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Minor
> Fix For: v3.0.0-beta
>
>
> Currently the job name of Spark is 
> org.apache.kylin.common.util.SparkEntry, which is the main class name of 
> Spark. The MapReduce job name of a Hive SQL is a substring of the query, 
> which is difficult to read.
> It's better to set a more readable name for the Hive and Spark jobs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4155) Cube status can not change immediately when executed disable or enable button in web

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4155.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Cube status can not change immediately when executed disable or enable button 
> in web
> 
>
> Key: KYLIN-4155
> URL: https://issues.apache.org/jira/browse/KYLIN-4155
> Project: Kylin
>  Issue Type: Bug
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> When the user clicks the disable or enable button in the cube list dropdown 
> menu and the API response comes back successfully, the cube status tag does 
> not change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4148) Execute 'bin/kylin-port-replace-util.sh' to change port will cause the configuration of 'kylin.metadata.url' lost

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4148.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Execute 'bin/kylin-port-replace-util.sh' to change port will cause the 
> configuration of  'kylin.metadata.url' lost
> --
>
> Key: KYLIN-4148
> URL: https://issues.apache.org/jira/browse/KYLIN-4148
> Project: Kylin
>  Issue Type: Bug
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4112) Add hdfs keberos token delegation in Spark to support HBase and MR use different HDFS clusters

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4112.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Add hdfs keberos token delegation in Spark to support HBase and MR use 
> different HDFS clusters
> --
>
> Key: KYLIN-4112
> URL: https://issues.apache.org/jira/browse/KYLIN-4112
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> Currently SparkExecutable only delegates the token for the YARN HDFS cluster, 
> not for the HDFS cluster used by the HBase cluster.
> The Spark job "Convert Cuboid Data to HFile" will fail due to a Kerberos issue.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4130) Coordinator->StreamingBuildJobStatusChecker thread always hold a old CubeManager

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4130.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Coordinator->StreamingBuildJobStatusChecker thread always hold a old 
> CubeManager
> 
>
> Key: KYLIN-4130
> URL: https://issues.apache.org/jira/browse/KYLIN-4130
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> {code}
> private class StreamingBuildJobStatusChecker implements Runnable {
> private int maxJobTryCnt = 5;
> private CubeManager cubeManager = 
> CubeManager.getInstance(KylinConfig.getInstanceFromEnv());
> private ConcurrentMap ConcurrentSkipListSet> segmentBuildJobMap = Maps
> .newConcurrentMap();
> private CopyOnWriteArrayList pendingCubeName = 
> Lists.newCopyOnWriteArrayList();
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-3878) NPE to run sonar analysis

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3878.
---

Resolved in release v3.0.0-beta(2019-10-25)

> NPE to run sonar analysis
> -
>
> Key: KYLIN-3878
> URL: https://issues.apache.org/jira/browse/KYLIN-3878
> Project: Kylin
>  Issue Type: Test
>  Components: Tools, Build and Test
>Reporter: Shao Feng Shi
>Assignee: nichunen
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> mvn sonar:sonar -Dsonar.host.url=https://sonarcloud.io 
> -Dsonar.organization=kylin -e
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 03:13 min
> [INFO] Finished at: 2019-03-15T14:42:16Z
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.sonarsource.scanner.maven:sonar-maven-plugin:3.6.0.1398:sonar 
> (default-cli) on project kylin: null: MojoExecutionException: 
> NullPointerException -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute 
> goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.6.0.1398:sonar 
> (default-cli) on project kylin: null
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:213)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:154)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:146)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
>  at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
>  at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
>  at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
>  at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
>  at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
>  at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
>  at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
>  at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke (Method.java:498)
>  at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
> (Launcher.java:289)
>  at org.codehaus.plexus.classworlds.launcher.Launcher.launch 
> (Launcher.java:229)
>  at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
> (Launcher.java:415)
>  at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException
>  at org.sonarsource.scanner.maven.bootstrap.ScannerBootstrapper.execute 
> (ScannerBootstrapper.java:67)
>  at org.sonarsource.scanner.maven.SonarQubeMojo.execute 
> (SonarQubeMojo.java:104)
>  at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:208)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:154)
>  at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:146)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
>  at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
>  at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
>  at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
>  at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
>  at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
>  at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
>  at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
>  at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
>  at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>  at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
>  at

[jira] [Closed] (KYLIN-4126) cube name validate code cause the wrong judge of streaming type

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4126.
---

Resolved in release v3.0.0-beta(2019-10-25)

> cube name validate code cause the wrong judge of streaming type 
> 
>
> Key: KYLIN-4126
> URL: https://issues.apache.org/jira/browse/KYLIN-4126
> Project: Kylin
>  Issue Type: Bug
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: error.png, normal.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4039) ZookeeperDistributedLock may not release lock when unlock operation was interrupted

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4039.
---

Resolved in release v3.0.0-beta(2019-10-25)

> ZookeeperDistributedLock may not release lock when unlock operation was 
> interrupted
> ---
>
> Key: KYLIN-4039
> URL: https://issues.apache.org/jira/browse/KYLIN-4039
> Project: Kylin
>  Issue Type: Bug
>Reporter: PENG Zhengshuai
>Assignee: PENG Zhengshuai
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> ZookeeperDistributedLock may hold the lock and not release it when the unlock 
> operation is interrupted.
> This is because the unlock operation consists of two steps: 
> 1. peekLock: get the owner of the lock
> 2. purgeLock: purge the lock if the owner of the lock is the current client.
> If the peekLock step is interrupted, the purgeLock step won't be executed. 
> Thus the lock won't be released.
> Meanwhile, the lock operation should also consider the interrupt cases.
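One way to keep an interrupted peekLock from skipping purgeLock is to retry the two steps and restore the interrupt flag afterwards. The sketch below illustrates that pattern only: the LockClient interface and all names are hypothetical stand-ins, not the ZookeeperDistributedLock API, and retrying without a limit is an assumption that real code may want to bound.

```java
// Hypothetical sketch: an unlock that completes even if it is interrupted.
final class UninterruptibleUnlockSketch {

    // Stand-in for the two steps named in the description.
    interface LockClient {
        String peekLock(String path) throws InterruptedException;

        void purgeLock(String path) throws InterruptedException;
    }

    // Returns true if the lock was owned by self and has been purged.
    static boolean unlock(LockClient client, String path, String self) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    String owner = client.peekLock(path);
                    if (!self.equals(owner)) {
                        return false; // not our lock, nothing to release
                    }
                    client.purgeLock(path);
                    return true;
                } catch (InterruptedException e) {
                    // Remember the interrupt and retry both steps.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                // Let the caller still observe the interruption.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The caller sees the interrupt after the lock has been released rather than instead of it.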



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4139) Compatible old user security xml config when user upgrate new kylin version

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4139.
---

Resolved in release v3.0.0-beta(2019-10-25)

>  Compatible old user security xml config when user upgrate new kylin version
> 
>
> Key: KYLIN-4139
> URL: https://issues.apache.org/jira/browse/KYLIN-4139
> Project: Kylin
>  Issue Type: Improvement
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>
>          
>  Forward compatibility
>  # When the user configures 'spring.profiles.active=testing', sync the 
> old-version user metadata with the user config in kylinSecurity.xml.
> Other improvement
>  # Remove the logic that automatically converts new user names to uppercase.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4091) support fast mode and simple mode for running CI

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4091.
---

Resolved in release v3.0.0-beta(2019-10-25)

> support fast mode and simple mode for running CI
> 
>
> Key: KYLIN-4091
> URL: https://issues.apache.org/jira/browse/KYLIN-4091
> Project: Kylin
>  Issue Type: Improvement
>Reporter: luguosheng
>Assignee: luguosheng
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> -DfastBuildMode = true //  build or merge job in concurrent queue
> -DsimpleBuildMode = true // skip merge job and just build only one segment



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4129) Remove useless code

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4129.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Remove useless code
> ---
>
> Key: KYLIN-4129
> URL: https://issues.apache.org/jira/browse/KYLIN-4129
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v3.0.0-alpha2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4100) Add overall job number statistics in monitor page

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4100.
---

Resolved in release v3.0.0-beta(2019-10-25)

> Add overall job number statistics in monitor page
> -
>
> Key: KYLIN-4100
> URL: https://issues.apache.org/jira/browse/KYLIN-4100
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Minor
> Fix For: v3.0.0-beta
>
> Attachments: x.png
>
>
> Currently it's hard to get the number of pending and running jobs in the 
> monitor page; we can only keep clicking "more" until the end.
> It's better to have overall job number statistics in the monitor page.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4098) Add cube auto merge api

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4098.
---

Resolved in release v3.0.0(2019-12-20)

> Add cube auto merge api
> ---
>
> Key: KYLIN-4098
> URL: https://issues.apache.org/jira/browse/KYLIN-4098
> Project: Kylin
>  Issue Type: New Feature
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Minor
> Fix For: v3.0.0
>
> Attachments: image-2019-09-19-17-11-49-733.png
>
>
> Currently the auto merging of a cube is triggered automatically by the event 
> that a new segment is ready. When the cluster restarts, there may be too 
> many merging jobs.
> It's better to have a REST API to trigger the merging and make it more 
> controllable.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4096) Make cube metadata validator rules configuable

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4096.
---

Resolved in release v3.0.0(2019-12-20)

> Make cube metadata validator rules configuable
> --
>
> Key: KYLIN-4096
> URL: https://issues.apache.org/jira/browse/KYLIN-4096
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Shaohui Liu
>Assignee: Shaohui Liu
>Priority: Minor
> Fix For: v3.0.0
>
>
> CubeMetadataValidator is very useful for standardizing cube creation.
> At Xiaomi, we implement multiple rules to reduce the operation cost.
> E.g., ConfOverrideRule, which makes users set the computing queue in the 
> cube configuration and forbids setting some configurations such as 
> kylin.query.max-scan-bytes.
> So it's better to make the rules configurable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4262) pid in GC filename inconsistent with real pid

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4262.
---

Resolved in release v3.0.0(2019-12-20)

> pid in GC filename inconsistent with real pid
> -
>
> Key: KYLIN-4262
> URL: https://issues.apache.org/jira/browse/KYLIN-4262
> Project: Kylin
>  Issue Type: Bug
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0, v2.6.5
>
> Attachments: image-2019-11-18-17-19-49-059.png, 
> image-2019-11-18-17-19-56-990.png, image-2019-11-19-18-55-18-113.png
>
>
> pid in GC filename
> !image-2019-11-18-17-19-49-059.png!
>  
> real pid
> !image-2019-11-18-17-19-56-990.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4203) Disable a real time cube and then enable it ,this cube may can't submit build job anymore

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4203.
---

Resolved in release v3.0.0(2019-12-20)

> Disable a real time cube and then enable it ,this cube may can't submit build 
> job anymore
> -
>
> Key: KYLIN-4203
> URL: https://issues.apache.org/jira/browse/KYLIN-4203
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: Future
>Reporter: wangxiaojing
>Assignee: wangxiaojing
>Priority: Blocker
> Fix For: v3.0.0
>
> Attachments: image-2019-10-25-18-35-51-570.png
>
>
> First, disable a real-time streaming cube when the cube has the maximum 
> number of building jobs (the default max job size is 10), then enable the 
> cube. This cube may not be able to submit new building jobs any more, even if 
> the Kylin user has discarded the building jobs; it logs "No left quota to 
> build segments for cube". This is because the amount of left quota with which 
> a cube can submit building jobs is determined by this algorithm: 
> "allowMaxBuildingSegments - inBuildingSegments". 'allowMaxBuildingSegments' 
> is configured, and 'inBuildingSegments' are the cube's not-ready segments in 
> HBase (or perhaps some other storage).
>  
>  
>  
>  
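The quota formula quoted in the description can be worked through with a small sketch. The class and method names are hypothetical, and clamping the result at zero is an assumption added here, not something the issue states.

```java
// Hypothetical sketch of the quota formula from the description.
final class BuildQuotaSketch {

    // leftQuota = allowMaxBuildingSegments - inBuildingSegments,
    // clamped at zero (the clamp is an assumption of this sketch).
    static int leftQuota(int allowMaxBuildingSegments, int inBuildingSegments) {
        return Math.max(0, allowMaxBuildingSegments - inBuildingSegments);
    }
}
```

With the default maximum of 10 and 10 segments that are still not ready, the left quota is 0. Because the subtraction counts not-ready segments rather than live jobs, discarding the jobs does not by itself free any quota, which matches the behaviour reported above.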



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4193) More user-friendly page for loading streaming tables

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4193.
---

Resolved in release v3.0.0(2019-12-20)

> More user-friendly page for loading streaming tables
> 
>
> Key: KYLIN-4193
> URL: https://issues.apache.org/jira/browse/KYLIN-4193
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: nichunen
>Priority: Major
> Fix For: v3.0.0
>
>
> After clicking "Add Streaming Table V2", the user has to set "TSParser" and 
> "TSPattern". These items may confuse users; they should be made more 
> user-friendly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4163) CreateFlatHiveTableStep has not yarn app url when hive job running

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4163.
---

Resolved in release v3.0.0(2019-12-20)

> CreateFlatHiveTableStep has not yarn app url when hive job running
> --
>
> Key: KYLIN-4163
> URL: https://issues.apache.org/jira/browse/KYLIN-4163
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine, Web 
>Affects Versions: v3.0.0-alpha
>Reporter: chuxiao
>Priority: Minor
> Fix For: v3.0.0
>
> Attachments: KYLIN-4163.master.001.patch, flathivetablerunning图.jpg
>
>
> CreateFlatHiveTableStep has a YARN app URL on the monitor web page only when 
> the job has finished, but SparkExecutable has a YARN app URL while the job is 
> running. This is because SparkExecutable's logger has a log listener:
> {code:java}
> final PatternedLogger patternedLogger = new PatternedLogger(logger, new 
> PatternedLogger.ILogListener() {
>  @Override
>  public void onLogEvent(String infoKey, Map info) {
>  // only care three properties here
>  if (ExecutableConstants.SPARK_JOB_ID.equals(infoKey)
>  || ExecutableConstants.YARN_APP_ID.equals(infoKey)
>  || ExecutableConstants.YARN_APP_URL.equals(infoKey)) {
>  getManager().addJobInfo(getId(), info);
>  }
>  }
>  });{code}
> Sometimes creating the flat Hive table hangs, so users want to have the YARN 
> app URL while the Hive job is running, as in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4178) Job scheduler support safe mode

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4178.
---

Resolved in release v3.0.0(2019-12-20)

> Job scheduler support safe mode 
> 
>
> Key: KYLIN-4178
> URL: https://issues.apache.org/jira/browse/KYLIN-4178
> Project: Kylin
>  Issue Type: Improvement
>Reporter: ZhouKang
>Priority: Major
> Fix For: v3.0.0
>
> Attachments: image-2019-11-26-18-42-33-473.png
>
>
> Job scheduler should support safe mode in case of an HBase cluster change.
> At Xiaomi, we want to upgrade the HBase cluster from HBase 0.98 to HBase 2.0. 
> The history data can be migrated beforehand, but jobs that have already been 
> submitted will keep running and write data to the old cluster. So we need a 
> method to ensure that jobs that created their HTable in the old cluster write 
> data to the old cluster, and jobs that have not yet created an HTable are not 
> scheduled.
> So we need a job scheduler safe mode. Enable safe mode before changing the 
> cluster config: the running jobs can continue to run, and new jobs cannot be 
> scheduled. 
> After all running jobs have finished, we can change the cluster config to the 
> new one, and the rest of the jobs can be scheduled again.
>  
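The safe-mode workflow described above can be sketched as a small state machine. This is an illustrative model, not Kylin's scheduler: all names are hypothetical, and it simplifies "job has already created its HTable" to "job has already been started".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a scheduler that stops starting jobs in safe mode.
final class SafeModeSchedulerSketch {
    private boolean safeMode;
    private final List<String> pending = new ArrayList<>();
    private final List<String> running = new ArrayList<>();

    void setSafeMode(boolean enabled) {
        safeMode = enabled;
    }

    void submit(String jobId) {
        pending.add(jobId);
    }

    // One scheduling pass. In safe mode nothing new is started; jobs that
    // are already running are left alone.
    List<String> scheduleOnce() {
        List<String> started = new ArrayList<>();
        if (safeMode) {
            return started;
        }
        started.addAll(pending);
        running.addAll(pending);
        pending.clear();
        return started;
    }

    void finish(String jobId) {
        running.remove(jobId);
    }

    // The cluster config can be switched once safe mode is on and
    // every running job has finished.
    boolean safeToSwitchCluster() {
        return safeMode && running.isEmpty();
    }
}
```

An operator would enable safe mode, wait until safeToSwitchCluster() returns true, change the cluster config, and then disable safe mode so the pending jobs start against the new cluster.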



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Closed] (KYLIN-4244) ClassNotFoundException while use org.apache.kylin.engine.mr.common.CubeStatsReader in bash

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4244.
---

Resolved in release v3.0.0(2019-12-20)

> ClassNotFoundException while use 
> org.apache.kylin.engine.mr.common.CubeStatsReader in bash
> --
>
> Key: KYLIN-4244
> URL: https://issues.apache.org/jira/browse/KYLIN-4244
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Assignee: ZhouKang
>Priority: Minor
> Fix For: v3.0.0
>
>
> Use org.apache.kylin.engine.mr.common.CubeStatsReader to print the estimated 
> size of a cube:
>  
> {code:java}
> bash ./kylin.sh org.apache.kylin.engine.mr.common.CubeStatsReader {cube_name}
> {code}
> This throws an exception:
> {code:java}
> Exception in thread "main" java.lang.NoClassDefFoundError: com/tdunning/math/stats/TDigest
>   at org.apache.kylin.measure.percentile.PercentileSerializer.current(PercentileSerializer.java:62)
>   at org.apache.kylin.measure.percentile.PercentileSerializer.getStorageBytesEstimate(PercentileSerializer.java:52)
>   at org.apache.kylin.metadata.datatype.DataType.getStorageBytesEstimate(DataType.java:256)
>   at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateCuboidStorageSize(CubeStatsReader.java:251)
>   at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMapFromRowCount(CubeStatsReader.java:211)
>   at org.apache.kylin.engine.mr.common.CubeStatsReader.getCuboidSizeMap(CubeStatsReader.java:170)
>   at org.apache.kylin.engine.mr.common.CubeStatsReader.print(CubeStatsReader.java:273)
>   at org.apache.kylin.engine.mr.common.CubeStatsReader.main(CubeStatsReader.java:435)
> Caused by: java.lang.ClassNotFoundException: com.tdunning.math.stats.TDigest
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 8 more
> {code}
>  
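
The trace above shows that `com.tdunning.math.stats.TDigest` is missing from the classpath of the launched JVM. A small probe like the one below, run with the same classpath, could confirm which classes are visible. It is only a diagnostic sketch: the class name `ClasspathProbe` is hypothetical and it is not part of Kylin.

```java
// Diagnostic sketch; ClasspathProbe is a hypothetical helper, not a Kylin class.
final class ClasspathProbe {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "com.tdunning.math.stats.TDigest";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```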




[jira] [Closed] (KYLIN-4172) Can't rename field when map streaming schema to table

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4172.
---

Resolved in release v3.0.0(2019-12-20)

> Can't rename field when map streaming schema to table
> -
>
> Key: KYLIN-4172
> URL: https://issues.apache.org/jira/browse/KYLIN-4172
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha2
>Reporter: Peng Huang
>Priority: Major
> Fix For: v3.0.0
>
> Attachments: 微信截图_20190919102424.png
>
>
> When mapping a streaming schema to a table, I cannot find a way to define 
> the mapping manually. I have to use auto-mapping, which does not let me 
> rename fields.




[jira] [Closed] (KYLIN-4135) Real time streaming segment build task discard but can't be rebuilt

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4135.
---

Resolved in release v3.0.0(2019-12-20)

> Real time streaming segment build task discard but can't  be rebuilt
> 
>
> Key: KYLIN-4135
> URL: https://issues.apache.org/jira/browse/KYLIN-4135
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming
>Affects Versions: Future
>Reporter: wangxiaojing
>Priority: Major
> Fix For: v3.0.0
>
>
> In some cases a real-time streaming segment build task is discarded but 
> cannot be rebuilt, and no new segment time range (tsrange) can be submitted 
> for building once the cube's maximum building number is reached 
> ([https://issues.apache.org/jira/projects/KYLIN/issues/KYLIN-4134?filter=allopenissues])




[jira] [Closed] (KYLIN-4167) Refactor streaming coordinator

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4167.
---

Resolved in release v3.0.0(2019-12-20)

> Refactor streaming coordinator
> --
>
> Key: KYLIN-4167
> URL: https://issues.apache.org/jira/browse/KYLIN-4167
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: Xiaoxiang Yu
>Assignee: Xiaoxiang Yu
>Priority: Major
> Fix For: v3.0.0
>
>
> h2. Summary
>  # Currently, the *coordinator* has too many responsibilities, which violates 
> the single responsibility principle and makes it hard to extend; a clean 
> separation of responsibilities is recommended.
>  # Some cluster-level operations have no atomicity guarantee; we should 
> implement them in an idempotent way to achieve eventual consistency.
>  # Resubmit a job when it was discarded.
>  # Clarify the overall design for real-time OLAP.
>  
> h4. StreamingCoordinator
> Facade of the coordinator; it controls BuildJobSubmitter and 
> ReceiverClusterManager and delegates operations to them.
> h4. BuildJobSubmitter
> The main responsibilities of BuildJobSubmitter include:
> 1. Find candidate segments that are ready to have a build job submitted
> 2. Track the status of each candidate segment's build job and promote the 
> segment once it meets the requirements
> h4. ReceiverClusterManager
> This class manages operations that involve multiple streaming receivers. 
> These operations are often not atomic, though they may be idempotent.
> h4. ClusterStateChecker
> Basic steps of this class:
> 1. Stop/pause the coordinator to avoid underlying concurrency issues
> 2. Check all receiver clusters for inconsistent state
> 3. Send a summary by mail to the Kylin admin
> 4. If needed, call ClusterDoctor to repair the inconsistencies
> h4. ClusterDoctor
> Repairs inconsistent state according to the result of ClusterStateChecker
>  
> 
> h3. Candidate Segment
> Candidate segments are the segments that the streaming coordinator can 
> see/perceive.
> A candidate segment is in one of the following states/queues:
> 1. segment whose data is *PARTLY* uploaded
> 2. segment whose data is completely uploaded and is *WAITING* to be built
> 3. segment in the *BUILDING* state; the job's state should be one of 
> (NEW/RUNNING/ERROR/DISCARD)
> 4. segment that was built *successfully* and is waiting to be delivered to 
> the historical part (and deleted from the real-time part)
> 5. segment that is *in the historical part* (HBase Ready Segment)
>  
> By design, a segment moves to the next queue sequentially (it must not 
> jump the queue); do not break this.
> h3. Atomicity
> In a multi-step transaction, the following aspects deserve careful thought:
> 1. Should it *fail fast* or continue when an exception is thrown?
> 2. Should the API (remote call) be *synchronous* or asynchronous?
> 3. When the transaction fails, can the *rollback* always succeed?
> 4. The transaction should be *idempotent*, so that a failure can be fixed by 
> a retry.
>  
> To keep whole-cluster operations running smoothly without blocking, I 
> divided all multi-step transactions into three kinds:
> NotAtomicIdempotent
> NotAtomicAndNotIdempotent
> NonSideEffect
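
The sequential queue progression of candidate segments described above can be sketched as a small state machine. This is an illustration under assumptions, not the actual coordinator code: the enum `CandidateSegmentState` and its constant names are hypothetical labels for the five queues in the issue, and treating a repeated transition to the current state as a no-op is an added assumption in the spirit of the idempotency note.

```java
import java.util.Objects;

// Hypothetical names; the constants mirror the five queues listed in the issue.
enum CandidateSegmentState {
    PARTLY_UPLOADED, WAITING, BUILDING, BUILT, HISTORICAL;

    /** The only legal successor, or null for the final state. */
    CandidateSegmentState next() {
        int i = ordinal() + 1;
        return i < values().length ? values()[i] : null;
    }

    /**
     * Moves to target only if it is the direct successor. Re-applying the
     * current state is a no-op, so a retried step is harmless (an assumption).
     */
    CandidateSegmentState transitionTo(CandidateSegmentState target) {
        Objects.requireNonNull(target, "target");
        if (target == this) {
            return this;
        }
        if (target != next()) {
            throw new IllegalStateException("Cannot move from " + this + " to " + target);
        }
        return target;
    }

    public static void main(String[] args) {
        CandidateSegmentState state = PARTLY_UPLOADED;
        // Walk the segment through every queue in order.
        while (state.next() != null) {
            state = state.transitionTo(state.next());
            System.out.println(state);
        }
    }
}
```

Rejecting any jump with an exception keeps the "do not jump the queue" rule in one place instead of relying on every caller to respect it.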




[jira] [Closed] (KYLIN-1716) leave executing query page action stop bug

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-1716.
---

Resolved in release v3.0.0(2019-12-20)

> leave executing query page action stop bug
> --
>
> Key: KYLIN-1716
> URL: https://issues.apache.org/jira/browse/KYLIN-1716
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Reporter: Jason Zhong
>Assignee: Jason Zhong
>Priority: Minor
> Fix For: v3.0.0, v2.6.5
>
>
> On the 'Insight' page, while a query is executing, clicking another page such 
> as 'Model' prompts 'You've executing query in current page, are you sure to 
> leave this page?'. If you click Cancel, you still leave the query page.




[jira] [Closed] (KYLIN-4208) RT OLAP kylin.stream.node configure optimization support all receiver can have the same config

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-4208.
---

Resolved in release v3.0.0(2019-12-20)

> RT OLAP kylin.stream.node configure optimization support all receiver can 
> have the same config
> --
>
> Key: KYLIN-4208
> URL: https://issues.apache.org/jira/browse/KYLIN-4208
> Project: Kylin
>  Issue Type: Improvement
>  Components: Real-time Streaming
>Reporter: wangxiaojing
>Assignee: wangxiaojing
>Priority: Major
> Fix For: Future, v3.0.0
>
>
> At present, kylin.stream.node supports only two configuration formats: not 
> configured (the native hostname and the default port 7070 are used) or 
> configured as ip:port. In production environments, the port number usually 
> needs to be changed.
> To change the port, we have to set the value as ip:port, which makes the 
> configuration file differ on every node of the receiver cluster and 
> complicates online operation and maintenance.
> We would like to add a configuration method, such as a separately 
> configurable port, so that the port can be customized while the 
> configuration of all receivers stays identical. It should also remain 
> compatible with the previous configuration.
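
One way the requested resolution order could look is sketched below. Only `kylin.stream.node` and the default port 7070 come from the issue; the class `StreamNodeResolver`, its method, and the idea of passing a separate port setting as `configuredPort` are hypothetical and do not describe the change that was actually merged.

```java
// Sketch only. "kylin.stream.node" and port 7070 come from the issue text;
// the separate port setting (configuredPort) is a hypothetical addition.
final class StreamNodeResolver {
    static final int DEFAULT_PORT = 7070;

    /**
     * @param configuredNode value of kylin.stream.node, may be null or blank
     * @param configuredPort hypothetical port-only setting, may be null
     * @param localHost      hostname of the current machine
     */
    static String resolve(String configuredNode, Integer configuredPort, String localHost) {
        if (configuredNode != null && !configuredNode.trim().isEmpty()) {
            // An explicit ip:port keeps the previous behaviour.
            return configuredNode.trim();
        }
        int port = configuredPort != null ? configuredPort : DEFAULT_PORT;
        return localHost + ":" + port;
    }

    public static void main(String[] args) {
        // Every receiver can share this config: only the port is set.
        System.out.println(resolve(null, 7071, "receiver-1"));
        System.out.println(resolve(null, 7071, "receiver-2"));
    }
}
```

With this order, an existing ip:port value still wins, so old configuration files keep working, while a cluster that sets only the port can ship one identical file to all receivers.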




[jira] [Closed] (KYLIN-3887) Query with decimal sum measure of double complied failed after KYLIN-3703

2020-01-20 Thread nichunen (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nichunen closed KYLIN-3887.
---

Resolved in release v3.0.0(2019-12-20)

> Query with decimal sum measure of double complied failed after KYLIN-3703
> -
>
> Key: KYLIN-3887
> URL: https://issues.apache.org/jira/browse/KYLIN-3887
> Project: Kylin
>  Issue Type: Bug
>Reporter: Shaohui Liu
>Assignee: nichunen
>Priority: Major
> Fix For: Future, v3.0.0
>
> Attachments: image-2019-05-14-11-19-05-514.png, 
> image-2019-12-02-13-06-21-282.png
>
>
> After KYLIN-3703, a query with a decimal sum measure of double fails to 
> compile.
> {code:java}
> Caused by: org.codehaus.commons.compiler.CompileException: 
> Line 112, Column 42: Cannot cast "java.math.BigDecimal" to 
> "java.lang.Double"{code}
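
The error above comes from generated code that casts a `java.math.BigDecimal` directly to `java.lang.Double`; the two types are unrelated, so the cast is rejected. The plain-Java sketch below reproduces the run-time analogue through an `Object` reference and shows a conversion via `Number.doubleValue()` that works for both types. The class `SumCastDemo` is hypothetical, and this is not necessarily how the issue was fixed in Kylin.

```java
import java.math.BigDecimal;

// Hypothetical demo class; it only illustrates the type relationship.
final class SumCastDemo {

    /** Direct cast: throws ClassCastException when sum is a BigDecimal. */
    static double unsafe(Object sum) {
        return (Double) sum;
    }

    /** Any Number (BigDecimal, Double, Long, ...) converts via doubleValue(). */
    static double safe(Object sum) {
        return ((Number) sum).doubleValue();
    }

    public static void main(String[] args) {
        Object sum = new BigDecimal("12.5");
        System.out.println(safe(sum));
        try {
            unsafe(sum);
        } catch (ClassCastException e) {
            System.out.println("direct cast rejected");
        }
    }
}
```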



