from:"Shaofeng SHI \(JIRA\)"

[jira] [Assigned] (KYLIN-4157) When using PrepareStatement query, functions within WHERE will cause InternalErrorException

2019-09-04 Thread Shaofeng SHI (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4157:
---

Assignee: Marc Wu

Hi Marc, please go ahead; either a GitHub PR or a git patch is acceptable. Thank you!

> When using PrepareStatement query, functions within WHERE will cause 
> InternalErrorException
> ---
>
> Key: KYLIN-4157
> URL: https://issues.apache.org/jira/browse/KYLIN-4157
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v2.6.3
> Reporter: Marc Wu
> Assignee: Marc Wu
> Priority: Major
> Fix For: v2.6.4
>
> Attachments: image-2019-09-04-15-39-52-867.png, 
> image-2019-09-04-15-39-58-276.png
>
>
> Hi Kylin Team:
> I found a bug when using a PreparedStatement query.
> Let me use the table KYLIN_SALES to explain the scenario.
> Consider a SQL statement like:
>  select LSTG_FORMAT_NAME, sum(PRICE) from KYLIN_SALES where 
> lower(LSTG_FORMAT_NAME) = 'fp-gtc' group by LSTG_FORMAT_NAME
> In some cases, users don't know whether LSTG_FORMAT_NAME is upper case or 
> lower case, or they simply want to query the data case-insensitively.
> So assume they use lower(LSTG_FORMAT_NAME) = 'fp-gtc', which is a function 
> within the filter.
> When I execute this SQL on the Kylin web console, it returns the right 
> result, but when I execute it as a PreparedStatement query from Postman, it 
> throws InternalErrorException. !image-2019-09-04-15-39-58-276.png!
>  
> !image-2019-09-04-15-39-52-867.png!  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Updated] (KYLIN-3392) Support NULL value in Sum, Max, Min Aggregation

2019-09-04 Thread Shaofeng SHI (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3392:

Fix Version/s: v2.6.4
   v3.0.0
  Summary: Support NULL value in Sum, Max, Min Aggregation  (was: 
support NULL value in Sum, Max, Min Aggregation)

> Support NULL value in Sum, Max, Min Aggregation
> ---
>
> Key: KYLIN-3392
> URL: https://issues.apache.org/jira/browse/KYLIN-3392
> Project: Kylin
> Issue Type: Bug
> Reporter: Yifei Wu
> Assignee: Yifei Wu
> Priority: Major
> Fix For: v3.0.0, v2.6.4
>
>
> KYLIN's basic aggregate measures (such as sum, max, min) currently treat a 
> NULL value as 0. However, it is necessary to distinguish NULL from 0.
> The expected behavior is:
> *sum(null, null) = null*
> *sum(null, 1) = 1*
> *max(null, null) = null*
> *max(null, -1) = -1*
> *min(null, -1) = -1*
> in accordance with Hive and SparkSQL.
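
The expected results listed in the quoted description can be reproduced with a small null-aware aggregation sketch. This is only an illustration of the semantics requested in the issue, not Kylin's measure implementation; the class name NullAwareAggregates and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Kylin class.
final class NullAwareAggregates {

    // NULL inputs are skipped; the result is null only if every input is NULL.
    static Long sum(List<Long> values) {
        Long acc = null;
        for (Long v : values) {
            if (v == null) {
                continue;
            }
            if (acc == null) {
                acc = v;
            } else {
                acc = acc + v;
            }
        }
        return acc;
    }

    static Long max(List<Long> values) {
        Long acc = null;
        for (Long v : values) {
            if (v != null && (acc == null || v > acc)) {
                acc = v;
            }
        }
        return acc;
    }

    static Long min(List<Long> values) {
        Long acc = null;
        for (Long v : values) {
            if (v != null && (acc == null || v < acc)) {
                acc = v;
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        List<Long> allNull = Arrays.asList((Long) null, (Long) null);
        System.out.println(sum(allNull));                             // null
        System.out.println(sum(Arrays.asList((Long) null, 1L)));      // 1
        System.out.println(max(allNull));                             // null
        System.out.println(max(Arrays.asList((Long) null, -1L)));     // -1
        System.out.println(min(Arrays.asList((Long) null, -1L)));     // -1
    }
}
```

Starting the accumulator at null rather than 0 is what keeps "no non-NULL input" distinguishable from "inputs summed to 0".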




[jira] [Commented] (KYLIN-3392) support NULL value in Sum, Max, Min Aggregation

2019-09-02 Thread Shaofeng SHI (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920660#comment-16920660
 ] 

Shaofeng SHI commented on KYLIN-3392:
-

Related PR: [https://github.com/apache/kylin/pull/819]

> support NULL value in Sum, Max, Min Aggregation
> ---
>
> Key: KYLIN-3392
> URL: https://issues.apache.org/jira/browse/KYLIN-3392
> Project: Kylin
> Issue Type: Bug
> Reporter: Yifei Wu
> Assignee: Yifei Wu
> Priority: Major
>
> KYLIN's basic aggregate measures (such as sum, max, min) currently treat a 
> NULL value as 0. However, it is necessary to distinguish NULL from 0.
> The expected behavior is:
> *sum(null, null) = null*
> *sum(null, 1) = 1*
> *max(null, null) = null*
> *max(null, -1) = -1*
> *min(null, -1) = -1*
> in accordance with Hive and SparkSQL.




[jira] [Commented] (KYLIN-4153) Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step

2019-09-01 Thread Shaofeng SHI (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920410#comment-16920410
 ] 

Shaofeng SHI commented on KYLIN-4153:
-

Hi Xiaoxiang, from your observation, although step 2 throws an exception, 
the data was actually inserted successfully; is that right? 

On rollback, how can we ensure that the entry is deleted as well? 
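
To make the question concrete, here is a minimal, self-contained sketch of the failure mode described in the quoted report: the marker write lands in the store, but the caller only sees a timeout, so a rollback that reverts only the HDFS file leaves a dangling marker. Everything in it (the class BigResourcePutSketch, the in-memory sets, the rollbackMarker flag) is a hypothetical stand-in, not Kylin code, and removing the marker on rollback is only one assumed way to compensate.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory model; not Kylin's PushdownResourceStore.
final class BigResourcePutSketch {
    final Set<String> hdfsFiles = new HashSet<>(); // stands in for HDFS
    final Set<String> markers = new HashSet<>();   // stands in for HBase/RDBMS

    // Step 2 stand-in: the write succeeds, but the client observes a timeout.
    void writeMarkerButTimeOut(String resPath) {
        markers.add(resPath);
        throw new IllegalStateException("simulated call timeout after the write landed");
    }

    // Same shape as the quoted putBigResource; rollbackMarker toggles the extra cleanup.
    void putBigResource(String resPath, boolean rollbackMarker) {
        hdfsFiles.add(resPath);               // Step 1: write the big resource
        try {
            writeMarkerButTimeOut(resPath);   // Step 2: write the marker
        } catch (RuntimeException ex) {
            hdfsFiles.remove(resPath);        // what the existing rollback reverts
            if (rollbackMarker) {
                markers.remove(resPath);      // assumed extra compensation step
            }
            throw ex;
        }
    }

    // A reader trusts the marker and then fails to find the file.
    boolean hasDanglingMarker(String resPath) {
        return markers.contains(resPath) && !hdfsFiles.contains(resPath);
    }

    public static void main(String[] args) {
        for (boolean rollbackMarker : new boolean[] {false, true}) {
            BigResourcePutSketch store = new BigResourcePutSketch();
            try {
                store.putBigResource("/dict/example.dict", rollbackMarker);
            } catch (IllegalStateException expected) {
                // the simulated timeout always surfaces to the caller
            }
            System.out.println("rollbackMarker=" + rollbackMarker
                    + " danglingMarker=" + store.hasDanglingMarker("/dict/example.dict"));
        }
    }
}
```

In the sketch the marker can simply be removed; in a real store the delete itself may also time out, so this only illustrates the question rather than answering it.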

> Failed to read big resource /dict/xxxx at "Build Dimension Dictionary" Step
> 
>
> Key: KYLIN-4153
> URL: https://issues.apache.org/jira/browse/KYLIN-4153
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v2.6.0
> Reporter: Xiaoxiang Yu
> Assignee: Xiaoxiang Yu
> Priority: Major
>
> In *Kylin 2.6.0*, the Kylin team introduced an important refactoring of 
> Kylin's Metadata Store, which added many enhancements such as 
> uploading/downloading metadata concurrently and storing metadata with JDBC. 
> Please refer to https://issues.apache.org/jira/browse/KYLIN-3671 for details.
>  
> When Kylin wants to save a *big resource* (such as a dict or snapshot) into 
> the metadata store, it won't store it in the metadata store (HBase or RDBMS) 
> directly. Instead, Kylin will first {color:red}save it into HDFS (Step 
> 1){color}, and then {color:red}write an empty byte array as a marker into 
> the metadata store (Step 2){color}. If the first action succeeds and the 
> second action fails, a rollback method is called to revert the changes to 
> the HDFS files. We could regard it as a complete and atomic transaction.
>  
> {color:#0747A6}Here is part of the source code added in KYLIN-3671.{color} 
> Check it at 
> https://github.com/apache/kylin/blob/8737bc1f555a2789a67462c8f8420b6ab3be97ce/core-common/src/main/java/org/apache/kylin/common/persistence/PushdownResourceStore.java#L58
>  . 
> {code:java}
> final void putBigResource(String resPath, ContentWriter content, long newTS) throws IOException {
>     // pushdown the big resource to DFS file
>     RollbackablePushdown pushdown = writePushdown(resPath, content); // Step 1: write big resource into HDFS
>     try {
>         // write a marker in resource store, to indicate the resource is now available
>         logger.debug("Writing marker for big resource {}", resPath);
>         putResourceWithRetry(resPath, ContentWriter.create(BytesUtil.EMPTY_BYTE_ARRAY), newTS); // Step 2: write marker into HBase/RDBMS
>     } catch (Throwable ex) {
>         pushdown.rollback();
>         throw ex;
>     } finally {
>         pushdown.close();
>     }
> }
> {code}
>  
>  
>  
> But in some cases, both step 1 and step 2 succeed, yet an exception is still 
> thrown in step 2; {color:red}the rollback won't clear the marker written in 
> Step 2{color}, which breaks the atomicity of this put action and thus causes 
> a FileNotFoundException when Kylin later tries to read that dict.
>  
>  
>  
> {color:#0747A6}Here is part of the reporter's kylin.log showing the 
> incomplete rollback action.{color}
>  
>   
> {noformat}
> 2019-08-29 05:13:51,237 INFO  [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] dict.DictionaryManager:388 : Saving dictionary at /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,238 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:98 : Writing pushdown file /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090
> 2019-08-29 05:13:51,256 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:117 : Move /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict.temp.-1798610090 to /kylin/kylin_metadata/resources/dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:51,258 DEBUG [Scheduler 169045403 Job ca4a4a08-54e2-b922-70bb-2aa2bf58709f-492] persistence.HDFSResourceStore:65 : Writing marker for big resource /dict/KYLIN_VIEW.USER_SECRET_TABLE/COUNTRY/66292068-e8eb-975a-3e44-b56c933c14cc.dict
> 2019-08-29 05:13:56,263 WARN  [hconnection-0x56f3258e-shared--pool10944-t54867] client.AsyncProcess:1263 : #10545, table=kylin_metadata, attempt=1/1 failed=1ops, last exception: java.io.IOException: Call to tx-dn41.data/10.14.243.51:60020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=2662317, waitTime=5001, operationTimeout=5000 expired. on tx-dn41.data,60020,1565943919204, tracking started Thu Aug 29 05:13:51 GMT+08:00 2019; not retrying 1 - final failure
> 2019-08-29 05:13:56,266 ERROR [Scheduler 169045403

[jira] [Commented] (KYLIN-4152) Should Disable Before Deleting HBase Table using HBaseAdmin

2019-08-28 Thread Shaofeng SHI (Jira)



[ 
https://issues.apache.org/jira/browse/KYLIN-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917790#comment-16917790
 ] 

Shaofeng SHI commented on KYLIN-4152:
-

Hi Tianhong, this patch doesn't have author information. Could you please 
regenerate it with "git format-patch", or create a PR against Kylin's git 
repository on github.com? Thank you!

> Should Disable Before Deleting HBase Table using HBaseAdmin 
> 
>
> Key: KYLIN-4152
> URL: https://issues.apache.org/jira/browse/KYLIN-4152
> Project: Kylin
> Issue Type: Bug
> Components: Storage - HBase
> Affects Versions: v2.5.2
> Reporter: Tian Hong Wang
> Assignee: Tian Hong Wang
> Priority: Major
> Fix For: v2.6.4
>
> Attachments: kylin-4152.patch
>
>
> In LookupTableToHFileJob.java, the HBase table should be disabled before it 
> is deleted through the HBaseAdmin API. Otherwise, an exception is thrown.




[jira] [Updated] (KYLIN-4152) Should Disable Before Deleting HBase Table using HBaseAdmin

2019-08-28 Thread Shaofeng SHI (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4152:

Fix Version/s: (was: all)
   v2.6.4


[jira] [Assigned] (KYLIN-4152) Should Disable Before Deleting HBase Table using HBaseAdmin

2019-08-28 Thread Shaofeng SHI (Jira)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4152:
---

Assignee: Tian Hong Wang

Good catch, thank you Tian Hong!

> Should Disable Before Deleting HBase Table using HBaseAdmin 
> 
>
> Key: KYLIN-4152
> URL: https://issues.apache.org/jira/browse/KYLIN-4152
> Project: Kylin
> Issue Type: Bug
> Components: Storage - HBase
> Affects Versions: v2.5.2
> Reporter: Tian Hong Wang
> Assignee: Tian Hong Wang
> Priority: Major
> Fix For: all
>
> Attachments: kylin-4152.patch
>
>
> In LookupTableToHFileJob.java, the HBase table should be disabled before it 
> is deleted through the HBaseAdmin API. Otherwise, an exception is thrown.




[jira] [Commented] (KYLIN-4125) Kylin upgraded from springmvc architecture to spring boot architecture

2019-08-05 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900630#comment-16900630
 ] 

Shaofeng SHI commented on KYLIN-4125:
-

Please also raise a discussion on the d...@kylin.apache.org mailing list. I 
have created a branch to stage the code changes: "KYLIN-4125"

> Kylin upgraded from springmvc architecture to spring boot architecture
> --
>
> Key: KYLIN-4125
> URL: https://issues.apache.org/jira/browse/KYLIN-4125
> Project: Kylin
> Issue Type: Improvement
> Components: REST Service
> Reporter: zhao jintao
> Assignee: zhao jintao
> Priority: Minor
>
> Hi Team:
> Kylin is based on the Spring MVC architecture, but Spring MVC configuration 
> is rather complicated, and integrating new components is cumbersome.
> Nowadays, the mainstream of the industry is based on the Spring Boot 
> architecture. Spring Boot supports auto-configuration, which reduces the 
> complexity of project integration and promotes the adoption and 
> implementation of microservice architecture. More and more projects have 
> been upgraded from Spring MVC to Spring Boot.
> Kylin could also be upgraded from the Spring MVC architecture to the Spring 
> Boot architecture.
> Do you have any suggestions?



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Commented] (KYLIN-4125) Kylin upgraded from springmvc architecture to spring boot architecture

2019-08-05 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900618#comment-16900618
 ] 

Shaofeng SHI commented on KYLIN-4125:
-

Jintao, we can create a separate branch (forked from master), and then you can 
raise PRs against that branch, just like the "flink_engine" branch we have 
today. Does that work for you? Thanks!


[jira] [Created] (KYLIN-4121) Cleanup hive view intermediate tables after job be finished

2019-07-31 Thread Shaofeng SHI (JIRA)

Shaofeng SHI created KYLIN-4121:
---

 Summary: Cleanup hive view intermediate tables after job be 
finished
 Key: KYLIN-4121
 URL: https://issues.apache.org/jira/browse/KYLIN-4121
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI


Reported by a community user:
I have a cube whose fact table joins a lookup table in Hive, and both are Hive 
views. I submit a job once per hour.
 
Kylin drops the intermediate fact table, but doesn't drop the intermediate 
lookup table.
 
I checked the source code and found that, at step 13 ('Hive Cleanup'), the 
relevant source code has been commented out.
 
 
It is a legacy issue. Now that KYLIN-3515 has fixed it, the cleanup can be 
enabled.




[jira] [Commented] (KYLIN-4113) Remove the surplus allCubes field

2019-07-30 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16896615#comment-16896615
 ] 

Shaofeng SHI commented on KYLIN-4113:
-

The PR is [https://github.com/apache/kylin/pull/780]

 

[~bob123] guosheng, can you help to review this? Thanks!

> Remove the surplus allCubes field
> -
>
> Key: KYLIN-4113
> URL: https://issues.apache.org/jira/browse/KYLIN-4113
> Project: Kylin
> Issue Type: Improvement
> Components: Web, Website
> Affects Versions: v2.6.1
> Environment: computer: macOS Mojave 10.14.5
> Reporter: 陈伟双
> Assignee: 陈伟双
> Priority: Major
> Labels: easyfix
> Attachments: image-2019-07-25-15-26-05-703.png, 
> image-2019-07-25-15-42-39-130.png
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> After a project name is selected, the front end asks the back end for the 
> cubes under the current project, but it also sends a request for all cubes. 
> Even if that request does not succeed at the back end because of permission 
> control, it is superfluous. I checked and found that all cubes are fetched 
> only to judge whether a cube with the same name already exists when editing 
> or creating a cube. This check should not be done in the front end; instead, 
> the information should be submitted to the back end when the cube is 
> created, and the back end should make a unified judgment. Otherwise, because 
> the cubes visible to the current logged-in user are limited by that user's 
> privileges, the duplicate-name check at creation time is incomplete. As a 
> result, a cube with a duplicate name can be created, and users who can view 
> all cubes (admin) will see two cubes with the same name.
> The code path that issues the extra request for cubes:
>  
> {code:java}
> webapp/app/js/controllers/cubeSchema.js{code}
>  
> At this position:
> !image-2019-07-25-15-42-39-130.png!
>  
> This code should not be written directly in the CubeSchemaCtrl controller; 
> otherwise it is executed whenever a page references the controller, and the 
> request is sometimes issued repeatedly. I don't know why. Perhaps changes to 
> other properties or methods of this controller trigger the execution of 
> this code.
>  
> I checked the back-end file
>
> {code:java}
> server-base/src/main/java/org/apache/kylin/rest/controller/CubeController.java{code}
> and found that it already has a ready-made check interface; I don't know 
> why it is not used. I call it from the front end for this check and 
> delete the other redundant code.
>  
>  
>  




[jira] [Assigned] (KYLIN-4113) Remove the surplus allCubes field

2019-07-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4113:
---

Assignee: 陈伟双


[jira] [Closed] (KYLIN-4039) ZookeeperDistributedLock may not release lock when unlock operation was interrupted

2019-07-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-4039.
---
Resolution: Fixed

Resolved in release 3.0.0-alpha2 (2019-07-30)

> ZookeeperDistributedLock may not release lock when unlock operation was 
> interrupted
> ---
>
> Key: KYLIN-4039
> URL: https://issues.apache.org/jira/browse/KYLIN-4039
> Project: Kylin
> Issue Type: Bug
> Reporter: PENG Zhengshuai
> Assignee: PENG Zhengshuai
> Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> ZookeeperDistributedLock may keep holding the lock instead of releasing it 
> when the unlock operation is interrupted.
> This is because the unlock operation consists of two steps: 
> 1. peekLock: get the owner of the lock
> 2. purgeLock: purge the lock if the owner of the lock is the current client.
> If the peekLock step is interrupted, the purgeLock step won't be executed, 
> and thus the lock won't be released.
> Meanwhile, the lock operation should also handle interruption.
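
The two-step unlock described in the quoted report can be made robust against interruption in several ways. The sketch below shows one common pattern (retry the steps, then restore the thread's interrupt flag); it is an assumption for illustration and not necessarily what the committed fix does. The LockClient interface and the class name are hypothetical stand-ins for the ZooKeeper-backed calls.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class InterruptSafeUnlockSketch {

    // Hypothetical stand-in for the two steps named in the report.
    interface LockClient {
        String peekLock(String lockPath) throws InterruptedException;
        void purgeLock(String lockPath) throws InterruptedException;
    }

    // Retries peek/purge if interrupted, then hands the interrupt back to the caller.
    static void unlock(LockClient client, String lockPath, String currentClient) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    if (currentClient.equals(client.peekLock(lockPath))) {
                        client.purgeLock(lockPath);
                    }
                    return;
                } catch (InterruptedException e) {
                    interrupted = true; // remember it and retry instead of giving up
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger peeks = new AtomicInteger();
        AtomicInteger purges = new AtomicInteger();
        LockClient flaky = new LockClient() {
            @Override
            public String peekLock(String lockPath) throws InterruptedException {
                if (peeks.incrementAndGet() == 1) {
                    throw new InterruptedException("simulated interrupt during peekLock");
                }
                return "client-1";
            }

            @Override
            public void purgeLock(String lockPath) {
                purges.incrementAndGet();
            }
        };
        unlock(flaky, "/lock/example", "client-1");
        // Thread.interrupted() reads and clears the restored flag.
        System.out.println("purged=" + purges.get() + " interruptFlag=" + Thread.interrupted());
    }
}
```

The retry here is unbounded, which keeps the sketch short; a real implementation would likely cap the attempts or apply a timeout.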




[jira] [Closed] (KYLIN-3981) Auto Merge Job failed to execute on windows

2019-07-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-3981.
---
Resolution: Fixed

Resolved in release 3.0.0-alpha2 (2019-07-30)

> Auto Merge Job failed to execute on windows
> ---
>
> Key: KYLIN-3981
> URL: https://issues.apache.org/jira/browse/KYLIN-3981
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v2.6.1
> Reporter: Na Zhai
> Assignee: Na Zhai
> Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> The Auto Merge job fails to execute on Windows. It throws the following 
> error:
> {code:none}
> java.lang.IllegalStateException: Metadata uri : C:\Users\NAD096~1.ZHA\AppData\Local\Temp\kylin_job_meta1467762575939435363\meta is not recognized
>  at org.apache.kylin.common.KylinConfig.decideUriType(KylinConfig.java:211)
>  at org.apache.kylin.common.KylinConfig.createInstanceFromUri(KylinConfig.java:221)
>  at org.apache.kylin.engine.mr.common.JobRelatedMetaUtil.dumpResources(JobRelatedMetaUtil.java:68)
>  at org.apache.kylin.engine.mr.common.JobRelatedMetaUtil.dumpAndUploadKylinPropsAndMetadata(JobRelatedMetaUtil.java:87)
>  at org.apache.kylin.engine.mr.common.AbstractHadoopJob.attachSegmentsMetadataWithDict(AbstractHadoopJob.java:572)
>  at org.apache.kylin.engine.mr.steps.MergeDictionaryJob.run(MergeDictionaryJob.java:104)
>  at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:131)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>  at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>  at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>  at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Suppressed: java.io.FileNotFoundException: File does not exist: C:\Users\NAD096~1.ZHA\AppData\Local\Temp\kylin_job_meta1467762575939435363\meta
>  at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2275)
>  at org.apache.kylin.common.persistence.AutoDeleteDirectory.close(AutoDeleteDirectory.java:56)
>  at org.apache.kylin.engine.mr.common.JobRelatedMetaUtil.dumpAndUploadKylinPropsAndMetadata(JobRelatedMetaUtil.java:103)
>  ... 10 more
> Caused by: java.lang.IllegalStateException: Metadata uri : C:\Users\NAD096~1.ZHA\AppData\Local\Temp\kylin_job_meta1467762575939435363\meta is not a valid REST URI address
>  at org.apache.kylin.common.KylinConfig.decideUriType(KylinConfig.java:208)
>  ... 13 more 
> {code}




[jira] [Resolved] (KYLIN-3843) List kylin instances with their server mode on web

2019-07-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3843.
-
Resolution: Fixed

> List kylin instances with their server mode on web
> --
>
> Key: KYLIN-3843
> URL: https://issues.apache.org/jira/browse/KYLIN-3843
> Project: Kylin
> Issue Type: New Feature
> Components: REST Service, Web 
> Reporter: nichunen
> Assignee: Jiatao Tao
> Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> As the Curator-based scheduler is now available, Kylin can list all nodes 
> that share the same metadata URL.
> This task should include REST APIs to fetch node information from ZK, and a 
> front-end section on the System page to display the node information.




[jira] [Resolved] (KYLIN-4106) Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct Columns”

2019-07-28 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-4106.
-
Resolution: Fixed

> Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct 
> Columns”
> --
>
> Key: KYLIN-4106
> URL: https://issues.apache.org/jira/browse/KYLIN-4106
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v2.6.1, v2.6.2
> Reporter: langdamao
> Assignee: langdamao
> Priority: Critical
> Labels: easyfix
> Fix For: v2.6.4
>
>
> We got this error at the "Extract Fact Table Distinct Columns" step on Kylin 2.6.1:
>  
> {code:java}
> Error: java.io.IOException: Illegal partition for org.apache.kylin.engine.mr.steps.SelfDefineSortableKey@6b69761b (254)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1096)
> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
> at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.writeFieldValue(FactDistinctColumnsMapper.java:281)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:186)
> at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
> I found the problem in the following code in 
> *FactDistinctColumnsReducerMapping.java – engine-mr*
> {code:java}
> public int getReducerIdForCol(int colId, Object fieldValue) {
>     int begin = colIdToReducerBeginId[colId];
>     int span = colIdToReducerBeginId[colId + 1] - begin;
>
>     if (span == 1)
>         return begin;
>
>     int hash = fieldValue == null ? 0 : fieldValue.hashCode();
>     return begin + Math.abs(hash) % span;
> }
> {code}
> For the failing rowkey, begin=1 and span=5, and we got hash=-2147483648. 
> Math.abs(-2147483648) returns -2147483648, so the code above returns -2 
> (which is 254 when read as an unsigned byte).
> This also causes the problem below when getReducerIdForCol returns -1 
> (when begin=1, span=3, hash=-2147483648), because the value written to the 
> rowkey reducer is empty_text, but reducer No. -1 needs a text value.
> {code:java}
> Error: java.nio.BufferUnderflowException
> at java.nio.Buffer.nextGetIndex(Buffer.java:500)
> at java.nio.HeapByteBuffer.get(Heap.ByteBuffer.java:135)
> at org.apache.kylin.measure.hllc.HLLCounter.readRegisters(HLLCounter.java:327)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:145)
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:60)
> ...{code}
>  
>  
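
The arithmetic in the quoted report can be checked with a short standalone sketch. It repeats the expression from the quoted getReducerIdForCol with begin and span passed in directly, and compares it with Math.floorMod, which is one possible guard and only an assumption here, not necessarily what the submitted patch does. The class and method names are hypothetical.

```java
final class ReducerIdSketch {

    // Same expression as the quoted method: Math.abs(Integer.MIN_VALUE) stays negative.
    static int reducerIdWithAbs(int begin, int span, int hash) {
        return begin + Math.abs(hash) % span;
    }

    // Assumed alternative: floorMod is never negative when span > 0.
    static int reducerIdWithFloorMod(int begin, int span, int hash) {
        return begin + Math.floorMod(hash, span);
    }

    public static void main(String[] args) {
        int hash = Integer.MIN_VALUE; // -2147483648

        System.out.println(Math.abs(hash));                       // -2147483648
        System.out.println(reducerIdWithAbs(1, 5, hash));         // -2
        System.out.println(reducerIdWithAbs(1, 5, hash) & 0xFF);  // 254
        System.out.println(reducerIdWithAbs(1, 3, hash));         // -1

        System.out.println(reducerIdWithFloorMod(1, 5, hash));    // 3
        System.out.println(reducerIdWithFloorMod(1, 3, hash));    // 2
    }
}
```

Note that floorMod maps other negative hashes to different reducers than the abs-based expression does, so it changes the distribution of values across reducers while keeping every id inside [begin, begin + span).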




[jira] [Assigned] (KYLIN-4106) Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct Columns”

2019-07-27 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4106:
---

Assignee: langdamao

> Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct 
> Columns”
> --
>
> Key: KYLIN-4106
> URL: https://issues.apache.org/jira/browse/KYLIN-4106
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.6.1, v2.6.2
>Reporter: langdamao
>Assignee: langdamao
>Priority: Critical
>  Labels: easyfix
> Fix For: v2.6.4
>
>
> We got this error when Extract Fact Table Distinct Columns  @kylin 2.6.1
>  
> {code:java}
> Error: java.io.IOException: Illegal partition for 
> org.apache.kylin.engine.mr.steps.SelfDefineSortableKey@6b69761b (254)
> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1096)
> at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> at 
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.writeFieldValue(FactDistinctColumnsMapper.java:
> 281) at 
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:186)
> at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
> {code}
> I've found the problem in the follow code in 
> *FactDistinctColumnsReducerMapping.java – engine-mr*
> {code:java}
> public int getReducerIdForCol(int colId, Object fieldValue) {
>     int begin = colIdToReducerBeginId[colId];
>     int span = colIdToReducerBeginId[colId + 1] - begin;
>
>     if (span == 1)
>         return begin;
>
>     int hash = fieldValue == null ? 0 : fieldValue.hashCode();
>     return begin + Math.abs(hash) % span;
> }
> {code}
> For the failing rowkey, begin=1 and span=5, and we got hash=-2147483648.
> Math.abs(-2147483648) returns -2147483648, so the code above returns -2
> (which is 254 when read as an unsigned byte).
> The same bug also causes the error below when getReducerIdForCol returns -1
> (begin=1, span=3, hash=-2147483648), because the value written to the rowkey
> reducer is empty_text, but reducer No. -1 needs value text.
> {code:java}
> Error: java.nio.BufferUnderflowException
>     at java.nio.Buffer.nextGetIndex(Buffer.java:500)
>     at java.nio.HeapByteBuffer.get(Heap.ByteBuffer.java:135)
>     at org.apache.kylin.measure.hllc.HLLCounter.readRegisters(HLLCounter.java:327)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:145)
>     at org.apache.kylin.engine.mr.steps.FactDistinctColumnsReducer.doReduce(FactDistinctColumnsReducer.java:60)
> ...{code}
>  
>  
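
The arithmetic in the report can be reproduced, and avoided, with a small sketch. The version below keeps the reducer id inside [begin, begin + span) by using Math.floorMod instead of Math.abs(hash) % span. It is only an illustration under assumptions: the class name ReducerIdSketch and the explicit array parameter are hypothetical, and this is not necessarily the patch attached to the issue.

```java
final class ReducerIdSketch {

    // colIdToReducerBeginId[i] is the first reducer id assigned to column i;
    // the entry at i + 1 marks the end of that column's reducer range.
    static int reducerIdForCol(int[] colIdToReducerBeginId, int colId, Object fieldValue) {
        int begin = colIdToReducerBeginId[colId];
        int span = colIdToReducerBeginId[colId + 1] - begin;

        if (span == 1)
            return begin;

        int hash = fieldValue == null ? 0 : fieldValue.hashCode();
        // Math.floorMod returns a value in [0, span) for any int hash when span > 0,
        // including Integer.MIN_VALUE, where Math.abs(hash) stays negative.
        return begin + Math.floorMod(hash, span);
    }
}
```

With begin=1 and span=5 as in the report, a hash of Integer.MIN_VALUE maps to reducer 3 here, where the original expression yields -2. Note that floorMod assigns other negative hashes to different reducers than Math.abs would, which should not matter as long as the mapping is applied consistently within one job.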



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Commented] (KYLIN-4106) Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct Columns”

2019-07-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894584#comment-16894584
 ] 

Shaofeng SHI commented on KYLIN-4106:
-

Oh, "-2147483648" is Integer#MIN_VALUE. The Javadoc for Math.abs() says:
{code:java}
Note that if the argument is equal to the value of
* {@link Integer#MIN_VALUE}, the most negative representable
* {@code int} value, the result is that same value, which is
* negative.
{code}
So, this implementation didn't consider this case. Your patch is smart. Thank 
you!

> Illegal partition for SelfDefineSortableKey when “Extract Fact Table Distinct 
> Columns”
> --
>
> Key: KYLIN-4106
> URL: https://issues.apache.org/jira/browse/KYLIN-4106
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.6.1, v2.6.2
>Reporter: langdamao
>Priority: Critical
>  Labels: easyfix
> Fix For: v2.6.4
>
>




[jira] [Updated] (KYLIN-4111) drop table failed with no valid privileges after KYLIN-3857

2019-07-27 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4111:

Fix Version/s: v2.6.4

> drop table failed with no valid privileges after KYLIN-3857
> ---
>
> Key: KYLIN-4111
> URL: https://issues.apache.org/jira/browse/KYLIN-4111
> Project: Kylin
>  Issue Type: Bug
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v2.6.4
>
>
> After KYLIN-3857, the database and table names are wrapped together in a
> single pair of backquotes (`).
> The DROP TABLE SQL becomes:
> {code:java}
> DROP TABLE IF EXISTS 
> `kylin_onebox.kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`;{code}
> Hive (1.2) with Sentry then throws an exception:
> {code:java}
> Error: Error while compiling statement: FAILED: HiveAccessControlException No valid privileges
>  Required privileges for this query: Server=server1->Db=`kylin_onebox->Table=kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`->action=drop;
> Query log: http://zjy-hadoop-prc-ct14.bj:18201/log?qid=898c7878-a961-443d-b120-cca0e2667d15_f486bd16-4bbd-4014-a0a7-c2ebfdbe6668
>  (state=42000,code=4)
> {code}
> The reason is that Hive identifies the database as `kylin_onebox and the
> table as
> kylin_intermediate_kylin_sales_cube_7be84be1_a153_07c4_3ce6_270e8d99ff85`
> Maybe we can fix it in Hive and Sentry. I created this JIRA to record the
> problem.
>  
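
One possible workaround on the Kylin side is to quote the database and the table separately, so that each name is its own quoted identifier. This is a minimal sketch under the assumption that Hive with Sentry then resolves the two parts correctly, which the report does not confirm. The class and method names are hypothetical.

```java
final class HiveIdentifierQuoting {

    // Quote database and table as two separate identifiers: `db`.`table`
    static String quoteTable(String database, String table) {
        return "`" + database + "`.`" + table + "`";
    }

    // Build the DROP statement with the separately quoted name.
    static String dropTableSql(String database, String table) {
        return "DROP TABLE IF EXISTS " + quoteTable(database, table) + ";";
    }
}
```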




[jira] [Updated] (KYLIN-4115) Always load KafkaConsumerProperties

2019-07-27 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4115:

Fix Version/s: v2.6.4

Good catch, thank you Zhixiong!

> Always load KafkaConsumerProperties
> ---
>
> Key: KYLIN-4115
> URL: https://issues.apache.org/jira/browse/KYLIN-4115
> Project: Kylin
>  Issue Type: Bug
>  Components: NRT Streaming
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
> Fix For: v2.6.4
>
>
> KafkaConsumerProperties can't be overridden by conf/kylin-kafka-consumer.xml
> or by “kylin.source.kafka.config-override.” properties




[jira] [Updated] (KYLIN-4099) Using no blocking RDD unpersist in spark cubing job

2019-07-22 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4099:

Fix Version/s: v3.0.0

> Using no blocking RDD unpersist in spark cubing job 
> 
>
> Key: KYLIN-4099
> URL: https://issues.apache.org/jira/browse/KYLIN-4099
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v3.0.0
>
>
> By default, the RDD unpersist operation in Spark is blocking, which may take
> a long time, and sometimes it fails because some Spark executors are lost.
> We can set blocking to false to improve this.
> {code:java}
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
> scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> scala.concurrent.Await$.result(package.scala:190)
> org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
> org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:127)
> org.apache.spark.SparkContext.unpersistRDD(SparkContext.scala:1709)
> org.apache.spark.rdd.RDD.unpersist(RDD.scala:216)
> org.apache.spark.api.java.JavaPairRDD.unpersist(JavaPairRDD.scala:73)
> org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:204)
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> java.lang.reflect.Method.invoke(Method.java:498)
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:653){code}




[jira] [Commented] (KYLIN-4099) Using no blocking RDD unpersist in spark cubing job

2019-07-22 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890208#comment-16890208
 ] 

Shaofeng SHI commented on KYLIN-4099:
-

+1, good finding. We weren't aware there is such a method. Thank you!

> Using no blocking RDD unpersist in spark cubing job 
> 
>
> Key: KYLIN-4099
> URL: https://issues.apache.org/jira/browse/KYLIN-4099
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
>




[jira] [Created] (KYLIN-4063) Avoid repeatedly calling "string.toLowerCase" in TimedJsonStreamParser#parse

2019-07-01 Thread Shaofeng SHI (JIRA)

Shaofeng SHI created KYLIN-4063:
---

 Summary: Avoid repeatedly calling "string.toLowerCase" in 
TimedJsonStreamParser#parse
 Key: KYLIN-4063
 URL: https://issues.apache.org/jira/browse/KYLIN-4063
 Project: Kylin
  Issue Type: Improvement
  Components: NRT Streaming
Reporter: Shaofeng SHI


In TimedJsonStreamParser#parse, it has this:

 
{code:java}
for (TblColRef column : allColumns) {
    final String columnName = column.getName().toLowerCase(Locale.ROOT);
    if (populateDerivedTimeColumns(columnName, result, t) == false) {
        result.add(getValueByKey(column, root));
    }
}
{code}
 

This method is invoked for each message, and for each column it calls
"toLowerCase(Locale.ROOT)". That is unnecessary, because "allColumns" won't
change.

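A minimal sketch of the idea, assuming the column list is fixed when the parser is created: lower-case every column name once up front and reuse the cached value for each message. The class LowerCaseOnceSketch and its methods are hypothetical and use plain strings instead of TblColRef so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class LowerCaseOnceSketch {

    // Lower-cased column names, computed once at construction time.
    private final List<String> lowerCaseNames;

    LowerCaseOnceSketch(List<String> columnNames) {
        List<String> names = new ArrayList<>(columnNames.size());
        for (String name : columnNames) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
        this.lowerCaseNames = names;
    }

    // Per-message path: reads the cached name, no toLowerCase call.
    String lowerCaseNameAt(int columnIndex) {
        return lowerCaseNames.get(columnIndex);
    }

    int columnCount() {
        return lowerCaseNames.size();
    }
}
```

The per-message loop would then iterate over column indexes and look up the cached name instead of calling toLowerCase again.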
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (KYLIN-4062) Too many "if else" clause in PushDownRunnerJdbcImpl#toSqlType

2019-07-01 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4062:
---

Assignee: 王汝鹏

Rupeng, please go ahead. Thank you!

> Too many "if else" clause in PushDownRunnerJdbcImpl#toSqlType
> -
>
> Key: KYLIN-4062
> URL: https://issues.apache.org/jira/browse/KYLIN-4062
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Shaofeng SHI
>Assignee: 王汝鹏
>Priority: Minor
>
> This method has 30 "if else" branches, which is inefficient. It should use a
> static HashMap, so that only one lookup is needed.
>  
> {code:java}
> if ("string".equalsIgnoreCase(type)) {
>     return Types.VARCHAR;
> } else if ("varchar".equalsIgnoreCase(type)) {
>     return Types.VARCHAR;
> } else if ("char".equalsIgnoreCase(type)) {
>     return Types.CHAR;
> } else if
> ...{code}




[jira] [Assigned] (KYLIN-3999) Enable dynamic column by default

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3999:
---

Assignee: haifeng wang

> Enable dynamic column by default
> 
>
> Key: KYLIN-3999
> URL: https://issues.apache.org/jira/browse/KYLIN-3999
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Reporter: Shaofeng SHI
>Assignee: haifeng wang
>Priority: Minor
> Fix For: v2.6.3
>
>
> More and more users expect to use the "SUM(Case when)" feature and get an
> error. The reason is that the dynamic column is disabled by default. We
> should consider enabling it by default:
>  
> kylin.query.enable-dynamic-column=true




[jira] [Resolved] (KYLIN-4004) Multi user groups table permission error

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-4004.
-
Resolution: Fixed

> Multi user groups table permission error
> 
>
> Key: KYLIN-4004
> URL: https://issues.apache.org/jira/browse/KYLIN-4004
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: haifeng wang
>Assignee: haifeng wang
>Priority: Major
> Fix For: v2.6.3
>
>
> When a user has multiple user groups, the blacklist of the table is
> collected using the union of the blacklists of each group




[jira] [Assigned] (KYLIN-4004) Multi user groups table permission error

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4004:
---

Assignee: haifeng wang

> Multi user groups table permission error
> 
>
> Key: KYLIN-4004
> URL: https://issues.apache.org/jira/browse/KYLIN-4004
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: all
>Reporter: haifeng wang
>Assignee: haifeng wang
>Priority: Major
> Fix For: v2.6.3
>
>
> When a user has multiple user groups, the blacklist of the table is
> collected using the union of the blacklists of each group




[jira] [Assigned] (KYLIN-4020) Fix_length rowkey encode without sepecified length can be saved but cause CreateHTable step failed

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4020:
---

Assignee: Yuzhang QIU
Priority: Minor  (was: Major)
 Summary: Fix_length rowkey encode without sepecified length can be saved 
but cause CreateHTable step failed  (was: fix_length rowkey encode without 
sepecified length can be saved but cause CreateHTable step failed)

> Fix_length rowkey encode without sepecified length can be saved but cause 
> CreateHTable step failed
> --
>
> Key: KYLIN-4020
> URL: https://issues.apache.org/jira/browse/KYLIN-4020
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Minor
> Fix For: v2.6.3
>
>
> Hi dear team,
> Just as the title says.
> I think there should be a stricter check for advanced settings.
> What do you think about this?
> If the same JIRA already exists, please let me know and close this one.
>   
>  Best regards
>   
>  yuzhang




[jira] [Resolved] (KYLIN-4030) ResourceStore deleteResource by comparing timestamp may be failed to delete caused by time precision loose

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-4030.
-
Resolution: Fixed

> ResourceStore deleteResource by comparing timestamp may be failed to delete 
> caused by time precision loose
> --
>
> Key: KYLIN-4030
> URL: https://issues.apache.org/jira/browse/KYLIN-4030
> Project: Kylin
>  Issue Type: Bug
>Reporter: PENG Zhengshuai
>Assignee: PENG Zhengshuai
>Priority: Minor
> Fix For: v2.6.3
>
>
> In ResourceStore, the interface *deleteResourceImpl(String resPath, long
> timestamp)* may fail to delete the resource because the resource timestamp
> may lose precision.
> For example, if we use *$KYLIN_HOME/bin/metastore.sh* to back up the
> metastore to the local OS filesystem and then restore it from the local
> filesystem, the resource timestamp loses its millisecond precision.
> An example here:
> original timestamp: 1559564381
> after precision lost: 1559564000
> The fix is to tolerate a difference of [0-999] ms when comparing the
> timestamps




[jira] [Updated] (KYLIN-4023) Timestamp and date with timezone in jdbc

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4023:

Summary: Timestamp and date with timezone in jdbc  (was: timestamp and date 
with timezone in jdbc)

> Timestamp and date with timezone in jdbc
> 
>
> Key: KYLIN-4023
> URL: https://issues.apache.org/jira/browse/KYLIN-4023
> Project: Kylin
>  Issue Type: Improvement
>  Components: Driver - JDBC
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
> Fix For: v2.6.3
>
>
> query with timestamp or date type column
> the result with timezone in jdbc




[jira] [Updated] (KYLIN-4022) Pushdown error "unrecognized column type: DECIMAL(xx,xx)"

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4022:

Summary: Pushdown error "unrecognized column type: DECIMAL(xx,xx)"  (was: 
when Adhoc Push Down then Unrecognized column type: DECIMAL(xx,xx))

> Pushdown error "unrecognized column type: DECIMAL(xx,xx)"
> -
>
> Key: KYLIN-4022
> URL: https://issues.apache.org/jira/browse/KYLIN-4022
> Project: Kylin
>  Issue Type: Improvement
>Reporter: jinguowei
>Assignee: jinguowei
>Priority: Major
> Fix For: v2.6.3
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>





[jira] [Created] (KYLIN-4062) Too many "if else" clause in PushDownRunnerJdbcImpl#toSqlType

2019-06-30 Thread Shaofeng SHI (JIRA)

Shaofeng SHI created KYLIN-4062:
---

 Summary: Too many "if else" clause in 
PushDownRunnerJdbcImpl#toSqlType
 Key: KYLIN-4062
 URL: https://issues.apache.org/jira/browse/KYLIN-4062
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Reporter: Shaofeng SHI


This method has 30 "if else" branches, which is inefficient. It should use a
static HashMap, so that only one lookup is needed.

 
{code:java}
if ("string".equalsIgnoreCase(type)) {
    return Types.VARCHAR;
} else if ("varchar".equalsIgnoreCase(type)) {
    return Types.VARCHAR;
} else if ("char".equalsIgnoreCase(type)) {
    return Types.CHAR;
} else if
...{code}
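
A minimal sketch of the static map approach, covering only the three types visible in the snippet above; the remaining entries would be added the same way. The class name SqlTypeLookupSketch is hypothetical, and the error message copies the "Unrecognized column type" wording seen in related pushdown reports.

```java
import java.sql.SQLException;
import java.sql.Types;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SqlTypeLookupSketch {

    // Lower-case type name -> java.sql.Types constant, built once.
    private static final Map<String, Integer> TYPES = new HashMap<>();

    static {
        TYPES.put("string", Types.VARCHAR);
        TYPES.put("varchar", Types.VARCHAR);
        TYPES.put("char", Types.CHAR);
        // The other type names from the original if/else chain go here.
    }

    // One map lookup replaces the chain of equalsIgnoreCase checks.
    static int toSqlType(String type) throws SQLException {
        Integer sqlType = type == null ? null : TYPES.get(type.toLowerCase(Locale.ROOT));
        if (sqlType == null) {
            throw new SQLException("Unrecognized column type: " + type);
        }
        return sqlType;
    }
}
```

Lower-casing the key with Locale.ROOT keeps the lookup case-insensitive, matching the equalsIgnoreCase behavior of the original chain for ASCII type names.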




[jira] [Commented] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-30 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875735#comment-16875735
 ] 

Shaofeng SHI commented on KYLIN-4061:
-

I think this is a known issue with low priority, because in OLAP scenarios
most queries start from the fact table. Of course, enhancement patches are
always welcome.

> Swap inner join's left side, right side table will get different result when 
> query
> --
>
> Key: KYLIN-4061
> URL: https://issues.apache.org/jira/browse/KYLIN-4061
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: weibin0516
>Priority: Major
> Attachments: failed.png, succeed.png
>
>
> When the left-side table of the inner join is a fact table and the
> right-side table is a lookup table, the query hits the cube and returns the
> correct result. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_SALES
>  INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
>  
> However, when the left and right tables of the inner join are swapped, the
> query fails because no realization is found. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_ACCOUNT
>  INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
> The two SQL statements above are semantically equivalent and should return
> the same result.
> I looked at the source code: Kylin uses context.firstTableScan (assigned in
> OLAPTableScan.implementOLAP) as the fact table, whether or not it actually
> is one. The fact table is later the key input for choosing a realization.
> So in the second SQL, a lookup table is treated as the fact table and no
> corresponding realization can be found.
> Is this a bug? Do we need to fix it?




[jira] [Updated] (KYLIN-4024) Pushdown to presto: Unrecognized column type: INTEGER,TIME,VARBINARY

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4024:

Summary: Pushdown to presto: Unrecognized column type: 
INTEGER,TIME,VARBINARY  (was: when ad-hoc Push Down by presto engine  
Unrecognized column type: INTEGER,TIME,VARBINARY)

> Pushdown to presto: Unrecognized column type: INTEGER,TIME,VARBINARY
> 
>
> Key: KYLIN-4024
> URL: https://issues.apache.org/jira/browse/KYLIN-4024
> Project: Kylin
>  Issue Type: Improvement
>  Components: Query Engine
>Affects Versions: all, v3.0.0, v2.6.2
>Reporter: wangxiaojing
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.3
>
>
> Hello,
> When running an ad-hoc pushdown query through the Presto engine, Kylin
> throws "Unrecognized column type" for types such as INTEGER, TIME and
> VARBINARY.
> The table field is defined as int in Hive, but a Presto query returns the
> type as INTEGER, which Kylin's ad-hoc pushdown does not recognize.
>  




[jira] [Assigned] (KYLIN-3832) Kylin Pushdown query not support postgresql

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3832:
---

Assignee: weibin0516

Weibin, please go ahead. Thank you!

> Kylin Pushdown query not support postgresql
> ---
>
> Key: KYLIN-3832
> URL: https://issues.apache.org/jira/browse/KYLIN-3832
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: hailin.huang
>Assignee: weibin0516
>Priority: Major
> Fix For: Future
>
>
> When I run pushdown to PostgreSQL in my environment, I encounter the
> exception below.
> It seems that Kylin needs to support more JDBC drivers;
> PushDownRunnerJdbcImpl.class should be more general.
> 2019-02-26 16:12:53,168 ERROR [Query 207dcf77-7c14-8078-ea8b-79644a0c576d-48] service.QueryService:989 : pushdown engine failed current query too
> java.sql.SQLException: Unrecognized column type: int8
>   at org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.toSqlType(PushDownRunnerJdbcImpl.java:260)
>   at org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.extractColumnMeta(PushDownRunnerJdbcImpl.java:192)
>   at org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.executeQuery(PushDownRunnerJdbcImpl.java:68)
>   at org.apache.kylin.query.util.PushDownUtil.tryPushDownQuery(PushDownUtil.java:122)
>   at org.apache.kylin.query.util.PushDownUtil.tryPushDownSelectQuery(PushDownUtil.java:69)




[jira] [Updated] (KYLIN-3832) Kylin pushdown to support postgresql

2019-06-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3832:

Fix Version/s: (was: Future)
   Issue Type: New Feature  (was: Bug)
  Summary: Kylin pushdown to support postgresql  (was: Kylin Pushdown 
query not support postgresql)

> Kylin pushdown to support postgresql
> 
>
> Key: KYLIN-3832
> URL: https://issues.apache.org/jira/browse/KYLIN-3832
> Project: Kylin
>  Issue Type: New Feature
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: hailin.huang
>Assignee: weibin0516
>Priority: Major
>




[jira] [Assigned] (KYLIN-3679) Fetch Kafka topic with Spark streaming

2019-06-29 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3679:
---

Assignee: weibin0516

Awesome! [~codingforfun] please go ahead; a pull request to the Kylin GitHub
repository is welcome.

> Fetch Kafka topic with Spark streaming
> --
>
> Key: KYLIN-3679
> URL: https://issues.apache.org/jira/browse/KYLIN-3679
> Project: Kylin
>  Issue Type: New Feature
>  Components: Spark Engine
>Reporter: Shaofeng SHI
>Assignee: weibin0516
>Priority: Major
>
> Kylin currently uses an MR job to fetch Kafka messages in parallel and then
> persists them to HDFS for subsequent processing. If the user selects the
> Spark engine, we can use the Spark Streaming API to do this. Spark Streaming
> can read the Kafka messages in a given offset range as an RDD, which is then
> easy to process:
> https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html 
> With Spark Streaming, Kylin can also easily connect to other data sources
> such as Kinesis, Flume, etc.




[jira] [Commented] (KYLIN-4057) autoMerge job can not stop

2019-06-25 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16872896#comment-16872896
 ] 

Shaofeng SHI commented on KYLIN-4057:
-

+1, we need to fix this problem.

Can you try this: disable the cube, discard the job, and then enable the cube.

> autoMerge job can not stop
> --
>
> Key: KYLIN-4057
> URL: https://issues.apache.org/jira/browse/KYLIN-4057
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.6.2
>Reporter: chenchen
>Priority: Major
> Attachments: 5031561517158_.pic_hd.jpg
>
>
> In this version, I use DistributedScheduler.
> In my case, a cube is configured with autoMerge, but the job that
> automatically merges segments fails. I can't fix this error; I can only
> discard and drop the job, but Kylin's monitoring mechanism then creates the
> same job again.
> Is there any way to fix this problem without disabling the cube?
> The error is reported as follows.
>  




[jira] [Resolved] (KYLIN-4015) Kylin build cube error at the "Build UHC Dictionary" step

2019-06-12 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-4015.
-
Resolution: Fixed

> Kylin build cube error at the "Build UHC Dictionary" step
> -
>
> Key: KYLIN-4015
> URL: https://issues.apache.org/jira/browse/KYLIN-4015
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.2
> Environment: Fusion Insight
>Reporter: zhao jintao
>Assignee: zhao jintao
>Priority: Major
>  Labels: easyfix
> Fix For: v2.6.3
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi All:
> As we know, Kylin builds dimension dictionaries in the Kylin job client. But if a 
> cube has UHC dimensions, this costs much more CPU and memory. Kylin 
> provides the ability to build the UHC dictionary using the MR engine to reduce 
> the resource consumption of the build engine.
> But I find that the "Build UHC Dictionary" step fails. This step runs 
> on the MR engine. This is the error info from YARN:
> org.apache.hadoop.mapred.YarnChild: Exception running child : 
> java.io.IOException: 
> hdfs://hacluster/xxx.../xxx/fact_distinct_columns/xxx/FIELD_NAME.dic-r-1 
> not a SequenceFile.
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:)
> The cause of this problem is that the "Extract Fact Table Distinct Columns" step 
> outputs two types of files, ".dci" and ".rldict", but the ".dci" file is not a 
> sequence file, so the "Build UHC Dictionary" step should filter out ".dci" files 
> when run with the MR engine.
> I have resolved this problem and will submit my code.
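
The fix described in the report comes down to skipping the ".dci" outputs before the "Build UHC Dictionary" step reads its inputs. A minimal sketch in plain Java follows. It is not the actual Kylin patch: the class name, the method names and the sample file names are all hypothetical, and it assumes that the file names are available as strings and that ".dci" appears in the name of every file to skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Kylin code.
final class UhcInputFilterSketch {

    // Assumption taken from the report: ".dci" outputs are not sequence files,
    // so the dictionary-building step must not try to read them.
    static boolean isSequenceFileInput(String fileName) {
        return !fileName.contains(".dci");
    }

    // Returns the file names that remain after dropping the ".dci" outputs.
    static List<String> filterInputs(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (isSequenceFileInput(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only.
        List<String> outputs = Arrays.asList(
                "FIELD_NAME.rldict-r-00000", "FIELD_NAME.dci-r-00000");
        System.out.println(filterInputs(outputs));
    }
}
```

In a real MR job the same predicate would more likely be applied where the input paths are listed, but that wiring depends on Kylin internals not shown in this thread.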




[jira] [Assigned] (KYLIN-4039) ZookeeperDistributedLock may not release lock when unlock operation was interrupted

2019-06-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4039:
---

Assignee: PENG Zhengshuai

> ZookeeperDistributedLock may not release lock when unlock operation was 
> interrupted
> ---
>
> Key: KYLIN-4039
> URL: https://issues.apache.org/jira/browse/KYLIN-4039
> Project: Kylin
>  Issue Type: Bug
>Reporter: PENG Zhengshuai
>Assignee: PENG Zhengshuai
>Priority: Major
>
> ZookeeperDistributedLock may hold the lock and not release it when the unlock 
> operation is interrupted.
> This is because the unlock operation consists of two steps: 
> 1. peekLock: get the owner of the lock
> 2. purgeLock: purge the lock if the owner of the lock is the current client.
> If the peekLock step is interrupted, the purgeLock step won't be executed, 
> and thus the lock won't be released.
> Meanwhile, the lock operation should also handle interruption.
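
One common way to make a two-step unlock survive interruption is to remember the interrupt, retry the steps, and restore the interrupt flag at the end. The sketch below shows that idiom only. The LockStore interface, the FlakyStore test double and all method signatures are hypothetical stand-ins, not the ZookeeperDistributedLock API, and whether the real fix takes this approach is not stated in the thread.

```java
// Hypothetical sketch of an interruption-tolerant unlock; not Kylin code.
final class InterruptSafeUnlockSketch {

    // Stand-in for the two steps named in the report.
    interface LockStore {
        String peekLock(String path) throws InterruptedException;
        void purgeLock(String path) throws InterruptedException;
    }

    // Retries peek + purge until they complete, then restores the interrupt flag.
    static void unlock(LockStore store, String path, String client) {
        // Clear any pending interrupt so the blocking calls can proceed.
        boolean interrupted = Thread.interrupted();
        try {
            while (true) {
                try {
                    String owner = store.peekLock(path);
                    if (client.equals(owner)) {
                        store.purgeLock(path);
                    }
                    return;
                } catch (InterruptedException e) {
                    interrupted = true; // remember it and try again
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // In-memory store whose first peekLock call is "interrupted".
    static final class FlakyStore implements LockStore {
        String owner;
        private boolean failNext = true;

        FlakyStore(String owner) {
            this.owner = owner;
        }

        public String peekLock(String path) throws InterruptedException {
            if (failNext) {
                failNext = false;
                throw new InterruptedException("simulated interrupt");
            }
            return owner;
        }

        public void purgeLock(String path) {
            owner = null;
        }
    }

    public static void main(String[] args) {
        FlakyStore store = new FlakyStore("client-1");
        unlock(store, "/lock/demo", "client-1");
        System.out.println("released: " + (store.owner == null));
        Thread.interrupted(); // clear the restored flag before exiting the demo
    }
}
```

An unbounded retry loop is the simplest form of the idiom; a production version would probably cap the number of attempts or add a timeout.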




[jira] [Commented] (KYLIN-2363) Prune cuboids by capping number of dimensions

2019-06-10 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16860574#comment-16860574
 ] 

Shaofeng SHI commented on KYLIN-2363:
-

No GUI for it, I think. This is an advanced feature open only to experts :)

> Prune cuboids by capping number of dimensions
> -
>
> Key: KYLIN-2363
> URL: https://issues.apache.org/jira/browse/KYLIN-2363
> Project: Kylin
>  Issue Type: Improvement
>Reporter: fengYu
>Assignee: Roger Shi
>Priority: Major
> Fix For: v2.3.0
>
> Attachments: Dimension Capping.md
>
>
> The scenario is like this:
> I have 20+ dimensions. However, queries use at most 5 of those dimensions, 
> so any cuboid that contains more than 5 dimensions (except the base cuboid) 
> is useless.
> I think we can add a configuration to the cube that limits the maximum number 
> of dimensions a cuboid may include.
> What's more, we can configure which levels (number of dimensions) need to be 
> calculated. In the above scenario, we only calculate levels 1, 2, 3, 4 and 5, and 
> skip the levels above 5.
> =
> The dimension capping is turned on by adding the dim_cap property to the 
> aggregation_groups definition.
> For example, the following aggregation group sets the dimension cap to 3. All 
> cuboids containing more than 3 dimensions are skipped in this aggregation 
> group.
> {code:none}
> "aggregation_groups" : [ {
>   "includes" : [ "PART_DT", "META_CATEG_NAME", "CATEG_LVL2_NAME", 
>     "CATEG_LVL3_NAME", "LEAF_CATEG_ID", "LSTG_FORMAT_NAME", "LSTG_SITE_ID", 
>     "OPS_USER_ID", "OPS_REGION", 
>     "BUYER_ACCOUNT.ACCOUNT_BUYER_LEVEL", 
>     "SELLER_ACCOUNT.ACCOUNT_SELLER_LEVEL", "BUYER_ACCOUNT.ACCOUNT_COUNTRY", 
>     "SELLER_ACCOUNT.ACCOUNT_COUNTRY", "BUYER_COUNTRY.NAME", 
>     "SELLER_COUNTRY.NAME" ],
>   "select_rule" : {
>     "hierarchy_dims" : [ [ "META_CATEG_NAME", "CATEG_LVL2_NAME", 
>       "CATEG_LVL3_NAME", "LEAF_CATEG_ID" ] ],
>     "mandatory_dims" : [ "PART_DT" ],
>     "joint_dims" : [
>       [ "BUYER_ACCOUNT.ACCOUNT_COUNTRY", "BUYER_COUNTRY.NAME" ],
>       [ "SELLER_ACCOUNT.ACCOUNT_COUNTRY", "SELLER_COUNTRY.NAME" ],
>       [ "BUYER_ACCOUNT.ACCOUNT_BUYER_LEVEL", 
>         "SELLER_ACCOUNT.ACCOUNT_SELLER_LEVEL" ],
>       [ "LSTG_FORMAT_NAME", "LSTG_SITE_ID" ],
>       [ "OPS_USER_ID", "OPS_REGION" ] ],
>     "dim_cap" : 3
>   }
> } ]
> {code}
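
To get a feel for how much a dimension cap can prune, the sketch below counts dimension combinations with plain combinatorics for the reporter's scenario (20 dimensions, at most 5 per cuboid). This is a simplified assumption, not Kylin's cuboid scheduler: it ignores mandatory, hierarchy and joint rules, it does not count the base cuboid, and the class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope counter; not Kylin's cuboid enumeration.
final class DimCapSketch {

    // Binomial coefficient C(n, k), computed incrementally so each step stays integral.
    static long choose(int n, int k) {
        long result = 1;
        for (int i = 1; i <= k; i++) {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Number of non-empty dimension combinations that use at most `cap` dimensions.
    static long cuboidsUpTo(int dims, int cap) {
        long total = 0;
        for (int k = 1; k <= Math.min(cap, dims); k++) {
            total += choose(dims, k);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("capped at 5: " + cuboidsUpTo(20, 5));
        System.out.println("no cap:      " + cuboidsUpTo(20, 20));
    }
}
```

Under these assumptions the cap keeps 21699 of the 1048575 possible combinations, which is why the setting matters for cubes with many dimensions.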




[jira] [Assigned] (KYLIN-4035) Calculate column cardinality by using spark engine

2019-06-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-4035:
---

Assignee: Jack

Hi Jack, thanks for the input and the PR. How much does performance improve 
when switching from MR to Spark?

> Calculate column cardinality by using spark engine
> --
>
> Key: KYLIN-4035
> URL: https://issues.apache.org/jira/browse/KYLIN-4035
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Affects Versions: Future
> Environment: kylin: master/3.0.0-alpha
> spark: 2.4.3
> hadoop: 2.6.5
>Reporter: Jack
>Assignee: Jack
>Priority: Minor
> Fix For: Future
>
>
> Kylin calculates column cardinality when loading a Hive table. This stage 
> is currently supported only by the MR engine, not by Spark. I think the Spark 
> engine should be supported in this stage for the following reasons:
> 1) Kylin users can choose which engine to use when calculating column 
> cardinality;
> 2) Some good Spark features (e.g. dynamic resource allocation) can be used; 
> 3) The code written in Spark is simple.
> I have finished this work and tested it successfully. Note that 
> "kylin.engin.spark-cardinality=true" must be added to kylin.properties (the 
> default is false). I look forward to suggestions.
> Best regards. 




[jira] [Closed] (KYLIN-3968) Customized precision doesn't work in web

2019-06-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-3968.
---
Resolution: Fixed

> Customized precision doesn't work in web
> 
>
> Key: KYLIN-3968
> URL: https://issues.apache.org/jira/browse/KYLIN-3968
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Reporter: Jack
>Assignee: Jack
>Priority: Minor
> Fix For: v2.6.2
>
>
> In cubeMeasures.js, precision and scale are extracted using a regular 
> expression. The scale parameter is fine, but precision uses the magic number 19.
> So we fixed it in cubeMeasures.js, around line 469:
> "var precision = 19;"  --> "var precision = returnValue[2] || 0;"
> We tested it successfully, including building a cube and querying when the 
> column is decimal(38,18).
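
The idea behind the fix, reading the precision from the regular-expression match instead of hard-coding 19, can be illustrated outside the web code. The sketch below is a Java analogue written for this purpose, not the cubeMeasures.js code: the pattern, the class name and the fallback of 0 for both values are assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical analogue of the JavaScript fix; not Kylin code.
final class DecimalTypeSketch {

    // Matches type strings such as "decimal(38,18)" and captures precision and scale.
    private static final Pattern DECIMAL = Pattern.compile(
            "decimal\\s*\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)", Pattern.CASE_INSENSITIVE);

    // Returns {precision, scale}, or {0, 0} when the string is not a decimal type.
    static int[] parse(String returnType) {
        Matcher m = DECIMAL.matcher(returnType.trim());
        if (!m.matches()) {
            return new int[] {0, 0};
        }
        return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
    }

    public static void main(String[] args) {
        int[] parsed = parse("decimal(38,18)");
        System.out.println("precision=" + parsed[0] + ", scale=" + parsed[1]);
    }
}
```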




[jira] [Reopened] (KYLIN-3968) Customized precision doesn't work in web

2019-06-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reopened KYLIN-3968:
-
  Assignee: Jack

> Customized precision doesn't work in web
> 
>
> Key: KYLIN-3968
> URL: https://issues.apache.org/jira/browse/KYLIN-3968
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Reporter: Jack
>Assignee: Jack
>Priority: Minor
> Fix For: v2.6.2
>
>
> In cubeMeasures.js, precision and scale are extracted using a regular 
> expression. The scale parameter is fine, but precision uses the magic number 19.
> So we fixed it in cubeMeasures.js, around line 469:
> "var precision = 19;"  --> "var precision = returnValue[2] || 0;"
> We tested it successfully, including building a cube and querying when the 
> column is decimal(38,18).




[jira] [Commented] (KYLIN-4028) Speed up startup progress using cached dependency

2019-06-05 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16856713#comment-16856713
 ] 

Shaofeng SHI commented on KYLIN-4028:
-

Awesome! Thank you, Temple!

> Speed up startup progress using cached dependency
> -
>
> Key: KYLIN-4028
> URL: https://issues.apache.org/jira/browse/KYLIN-4028
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Affects Versions: all
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
>
> The Hive/Hadoop/HBase dependencies rarely change, and resolving the 
> dependencies every time I start the Kylin server slows down the 
> startup.
> So, if there are dependencies generated by a previous run, we can use them to 
> start the server without resolving the dependencies again.




[jira] [Commented] (KYLIN-4028) Speed up startup progress using cached dependency

2019-06-03 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16854322#comment-16854322
 ] 

Shaofeng SHI commented on KYLIN-4028:
-

This is a good idea! It will accelerate the startup and also minimize the 
impact when some services like Hive/HBase are inactive.

Temple, is there a way to clear the cache easily, in case the 
classpath or path has changed on the client side?

> Speed up startup progress using cached dependency
> -
>
> Key: KYLIN-4028
> URL: https://issues.apache.org/jira/browse/KYLIN-4028
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Affects Versions: all
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
>
> The Hive/Hadoop/HBase dependencies rarely change, and resolving the 
> dependencies every time I start the Kylin server slows down the 
> startup.
> So, if there are dependencies generated by a previous run, we can use them to 
> start the server without resolving the dependencies again.




[jira] [Commented] (KYLIN-3997) Add a health check job of Kylin

2019-06-01 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853855#comment-16853855
 ] 

Shaofeng SHI commented on KYLIN-3997:
-

+1 good feature for Kylin administrators!

> Add a health check job of Kylin
> ---
>
> Key: KYLIN-3997
> URL: https://issues.apache.org/jira/browse/KYLIN-3997
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
>
> Kylin has a lot of internal metadata and many external dependencies. They may 
> become inconsistent because of bugs or failures. It's better to have a health 
> check job to find these inconsistencies in advance.
> The inconsistencies we found in our clusters are as follows:
>  * the cubeid data does not exist for cube merging
>  * the HBase table does not exist or is not online for a segment
>  * there are holes in cube segments (the build for some days 
> failed, but the user did not notice)
>  * too many segments (HBase tables)
>  * metadata of stale segments left in the cube
>  * some cubes have not been updated/built for a long time
>  * some important parameters are not set in the cube desc
>  * ...
>  Suggestions are welcome, thanks~




[jira] [Commented] (KYLIN-3994) StorageCleanupJob may delete data of newly built segment because of cube cache in CubeManager

2019-05-30 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851656#comment-16851656
 ] 

Shaofeng SHI commented on KYLIN-3994:
-

I agree with Zhengshuai; the current PR only reduces the chance of 
reading dirty data, it does not solve the problem completely.

> StorageCleanupJob may delete data of newly built segment because of cube 
> cache in CubeManager
> -
>
> Key: KYLIN-3994
> URL: https://issues.apache.org/jira/browse/KYLIN-3994
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.5.2
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v2.6.3
>
>
> In our production cluster, we found that the cube id data of a newly built 
> segment was deleted by the StorageCleanupJob.
> After checking the code of cleanUnusedHdfsFiles in StorageCleanupJob, we 
> found a bug: CubeManager reads all cube metadata at 
> initialization and caches it for later 
> listAllCubes operations, so the metadata will be out of date after listing the 
> HDFS working dir.
> So the working directory of a finished job may be deleted unexpectedly.
>  




[jira] [Updated] (KYLIN-3994) StorageCleanupJob may delete data of newly built segment because of cube cache in CubeManager

2019-05-30 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3994:

Fix Version/s: v2.6.3
  Summary: StorageCleanupJob may delete data of newly built segment 
because of cube cache in CubeManager  (was: StorageCleanupJob may delete cube 
id data of new built segment because of cube cache in CubeManager)

> StorageCleanupJob may delete data of newly built segment because of cube 
> cache in CubeManager
> -
>
> Key: KYLIN-3994
> URL: https://issues.apache.org/jira/browse/KYLIN-3994
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.5.2
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v2.6.3
>
>
> In our production cluster, we found that the cube id data of a newly built 
> segment was deleted by the StorageCleanupJob.
> After checking the code of cleanUnusedHdfsFiles in StorageCleanupJob, we 
> found a bug: CubeManager reads all cube metadata at 
> initialization and caches it for later 
> listAllCubes operations, so the metadata will be out of date after listing the 
> HDFS working dir.
> So the working directory of a finished job may be deleted unexpectedly.
>  




[jira] [Commented] (KYLIN-4020) fix_length rowkey encode without sepecified length can be saved but cause CreateHTable step failed

2019-05-28 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850301#comment-16850301
 ] 

Shaofeng SHI commented on KYLIN-4020:
-

Hi Yuzhang, there is no previous report of this; please go ahead. Thank you!

> fix_length rowkey encode without sepecified length can be saved but cause 
> CreateHTable step failed
> --
>
> Key: KYLIN-4020
> URL: https://issues.apache.org/jira/browse/KYLIN-4020
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Priority: Major
>
> Hi dear team:
> Just as the title says.  
> I think there should be stricter checks for advanced settings.
> What do you think about this?
> If there is already a JIRA for this, please inform me and close this one.
>   
>  Best regards
>   
>  yuzhang




[jira] [Updated] (KYLIN-4015) Kylin build cube error at the "Build UHC Dictionary" step

2019-05-28 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-4015:

Fix Version/s: v2.6.3

> Kylin build cube error at the "Build UHC Dictionary" step
> -
>
> Key: KYLIN-4015
> URL: https://issues.apache.org/jira/browse/KYLIN-4015
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.2
> Environment: Fusion Insight
>Reporter: zhao jintao
>Assignee: zhao jintao
>Priority: Major
>  Labels: easyfix
> Fix For: v2.6.3
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi All:
> As we know, Kylin builds dimension dictionaries in the Kylin job client. But if a 
> cube has UHC dimensions, this costs much more CPU and memory. Kylin 
> provides the ability to build the UHC dictionary using the MR engine to reduce 
> the resource consumption of the build engine.
> But I find that the "Build UHC Dictionary" step fails. This step runs 
> on the MR engine. This is the error info from YARN:
> org.apache.hadoop.mapred.YarnChild: Exception running child : 
> java.io.IOException: 
> hdfs://hacluster/xxx.../xxx/fact_distinct_columns/xxx/FIELD_NAME.dic-r-1 
> not a SequenceFile.
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:)
> The cause of this problem is that the "Extract Fact Table Distinct Columns" step 
> outputs two types of files, ".dci" and ".rldict", but the ".dci" file is not a 
> sequence file, so the "Build UHC Dictionary" step should filter out ".dci" files 
> when run with the MR engine.
> I have resolved this problem and will submit my code.




[jira] [Commented] (KYLIN-4015) Kylin build cube error at the "Build UHC Dictionary" step

2019-05-28 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850291#comment-16850291
 ] 

Shaofeng SHI commented on KYLIN-4015:
-

I see. The ".dci" file, introduced in KYLIN-3370, is generated to keep each 
dimension's min/max info. These files should be excluded.

> Kylin build cube error at the "Build UHC Dictionary" step
> -
>
> Key: KYLIN-4015
> URL: https://issues.apache.org/jira/browse/KYLIN-4015
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.5.2
> Environment: Fusion Insight
>Reporter: zhao jintao
>Assignee: zhao jintao
>Priority: Major
>  Labels: easyfix
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Hi All:
> As we know, Kylin builds dimension dictionaries in the Kylin job client. But if a 
> cube has UHC dimensions, this costs much more CPU and memory. Kylin 
> provides the ability to build the UHC dictionary using the MR engine to reduce 
> the resource consumption of the build engine.
> But I find that the "Build UHC Dictionary" step fails. This step runs 
> on the MR engine. This is the error info from YARN:
> org.apache.hadoop.mapred.YarnChild: Exception running child : 
> java.io.IOException: 
> hdfs://hacluster/xxx.../xxx/fact_distinct_columns/xxx/FIELD_NAME.dic-r-1 
> not a SequenceFile.
>  at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:)
> The cause of this problem is that the "Extract Fact Table Distinct Columns" step 
> outputs two types of files, ".dci" and ".rldict", but the ".dci" file is not a 
> sequence file, so the "Build UHC Dictionary" step should filter out ".dci" files 
> when run with the MR engine.
> I have resolved this problem and will submit my code.




[jira] [Assigned] (KYLIN-3845) Kylin build error If the Kafka data source lacks selected dimensions or metrics in the kylin stream build.

2019-05-26 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3845:
---

 Assignee: zhao jintao
Fix Version/s: (was: Future)
   v3.0.0
  Component/s: NRT Streaming

> Kylin build error If the Kafka data source lacks selected dimensions or 
> metrics in the kylin stream build.
> --
>
> Key: KYLIN-3845
> URL: https://issues.apache.org/jira/browse/KYLIN-3845
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine, NRT Streaming
>Affects Versions: v2.5.2
> Environment: Fusion Insight
>Reporter: zhao jintao
>Assignee: zhao jintao
>Priority: Major
>  Labels: easyfix
> Fix For: v3.0.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> Hi dear team:
> I'm developing an OLAP platform based on Kylin 2.5.2. During my work, I built a 
> streaming cube from a Kafka source using the Kafka demo.
> In my streaming project, I set country and currency as dimensions and userId as 
> a metric. But the cube build failed in the 3rd step ("Extract Fact Table Distinct 
> Columns"). The exception is java.lang.ArrayIndexOutOfBoundsException.
> These are the logs:
> 2019-03-02 14:21:01,492 INFO [main] org.apache.kylin.engine.mr.KylinReducer: 
> Do cleanup, available memory: 1334m
> 2019-03-02 14:21:01,492 INFO [main] org.apache.kylin.engine.mr.KylinReducer: 
> Total rows: 127
> 2019-03-02 14:21:01,492 INFO [main] org.apache.hadoop.mapred.MapTask: 
> Finished spill 0
> 2019-03-02 14:21:01,492 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child: java.lang.ArrayIndexOutOfBoundsException:2
> 2019-03-02 14:21:01,492 INFO [main] org.apache.kylin.engine.mr.KylinReducer: 
> Do cleanup, available memory: 1334m
>  at 
> org.apache.kylin.engine.mr.steps.FactDistinctColumnsMapper.doMap(FactDistinctColumnsMapper.java:177)
>  at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
>  at org.apache.hadoop.mapreduce.Mapper.run(MapperTask.java:146)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:187)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1781)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:180)
>  
> Then I found that in the Kafka data source, some streaming data lacks the userId 
> column. Most of the streaming data (country, currency, userId) looks like 
> ("China","CNY","843c4d"), but a small amount of data lacks userId, e.g. 
> ("China","CNY"). So when running the 3rd step ("Extract Fact Table Distinct 
> Columns"), the MR engine throws an exception if the streaming data lacks userId.
> Then I checked the Kylin source, FactDistinctColumnsMapper.java:
> public void doMap(KEYIN key, Object record, Context context) throws 
> IOException, InterruptedException {
>     Collection<String[]> rowCollection = 
>         flatTableInputFormat.parseMapperInput(record);
>     for (String[] row : rowCollection) {
>         context.getCounter(RawDataCounter.BYTES).increment(countSizeInBytes(row));
>         for (int i = 0; i < allCols.size(); i++) {
>             String fieldValue = row[columnIndex[i]];
>             if (fieldValue == null)
>                 continue;
>             final DataType type = allCols.get(i).getType();
>             ...
> I found that columnIndex[i] is equal to the size of row if the streaming 
> data lacks one column, so row[columnIndex[i]] throws the 
> ArrayIndexOutOfBoundsException. So I changed this code to compare 
> columnIndex[i] with the size of row. If columnIndex[i] is equal to or larger 
> than the size of row, I set fieldValue to an empty value. After I changed the 
> code, the 3rd step ("Extract Fact Table Distinct Columns") runs successfully.
> This is what I found; it may cause problems for other developers.
> What do you think?
> Best regards
> jintao
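
The guard described in the report can be isolated into a small helper. The sketch below is a stand-alone illustration of that bounds check, not the actual change to FactDistinctColumnsMapper: the class and method names are hypothetical, and returning an empty string for a missing column follows the reporter's description.

```java
// Hypothetical helper illustrating the bounds check; not Kylin code.
final class RowFieldGuardSketch {

    // Returns the field at `index`, or an empty string when the row is too short.
    static String fieldAt(String[] row, int index) {
        if (index < 0 || index >= row.length) {
            return "";
        }
        return row[index];
    }

    public static void main(String[] args) {
        String[] complete = {"China", "CNY", "843c4d"};
        String[] missingUserId = {"China", "CNY"};
        System.out.println("[" + fieldAt(complete, 2) + "]");
        System.out.println("[" + fieldAt(missingUserId, 2) + "]");
    }
}
```

Returning null instead of an empty string would make the quoted loop skip the column through its existing null check; which behaviour is preferable depends on how empty values should appear in the dictionary.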




[jira] [Closed] (KYLIN-1210) Allowing segment overlap to solve streaming data completeness problem

2019-05-22 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-1210.
---
   Resolution: Fixed
Fix Version/s: v1.6.0

The problem was reported against Kylin 1.5; it has already been solved in Kylin 
v1.6.0 with the new NRT streaming.

> Allowing segment overlap to solve streaming data completeness problem
> -
>
> Key: KYLIN-1210
> URL: https://issues.apache.org/jira/browse/KYLIN-1210
> Project: Kylin
>  Issue Type: Improvement
>Reporter: hongbin ma
>Assignee: hongbin ma
>Priority: Major
> Fix For: v1.6.0
>
>
> Previously, cube segments in one cube were not allowed to overlap with each 
> other. This constraint is more intuitive and simpler to maintain. It also 
> makes cube segments immutable, which is acceptable in batch cubing 
> scenarios.
> In streaming cubing scenarios, however, data may arrive late due 
> to upstream latencies. It's best if we can still consume that late data rather 
> than simply ignore it. To accomplish this we have to relax the above 
> constraint.




[jira] [Commented] (KYLIN-4011) Kyling grouping function

2019-05-22 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845946#comment-16845946
 ] 

Shaofeng SHI commented on KYLIN-4011:
-

Does this help? [https://kylin.apache.org/blog/2016/11/16/window-function/]

> Kyling grouping function
> 
>
> Key: KYLIN-4011
> URL: https://issues.apache.org/jira/browse/KYLIN-4011
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.2
>Reporter: tag
>Priority: Major
>
> select dim1, case grouping(dim2) when 1 then 'ALL' else dim2 end, sum(col) 
> as metric1 from table group by grouping sets((dim1, dim2), (dim1));
> `case grouping(dim2) when 2 then 'All' else dim2 end` works in version 
> v2.5.2, but is invalid in version v2.6.2.
> How can I query aggregates by grouping sets?




[jira] [Created] (KYLIN-3999) Enable dynamic column by default

2019-05-10 Thread Shaofeng SHI (JIRA)

Shaofeng SHI created KYLIN-3999:
---

 Summary: Enable dynamic column by default
 Key: KYLIN-3999
 URL: https://issues.apache.org/jira/browse/KYLIN-3999
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Reporter: Shaofeng SHI


More and more users expect to use the "SUM(Case when)" feature and get an error. 
The reason is that the dynamic column is disabled by default. We should consider 
enabling it by default:

 

kylin.query.enable-dynamic-column=true




[jira] [Resolved] (KYLIN-3988) Weighted Average does not work on cube

2019-04-29 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3988.
-
Resolution: Invalid

> Weighted Average does not work on cube
> --
>
> Key: KYLIN-3988
> URL: https://issues.apache.org/jira/browse/KYLIN-3988
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - ODBC
>Affects Versions: v2.6.1
>Reporter: Anoop Krishnaswamy
>Priority: Critical
>
> When we try to aggregate over the product of two metrics, it throws an 
> error:
> That both of the two sides of the BinaryTupleExpression own columns is not 
> supported for * while executing SQL: "select 
> sum(riskscoreinitial*cyc_xxx_balanceAdb), cyc_xxx_cyclesdelinquent from 
> fct_profit_table_monthly group by cyc_xxx_cyclesdelinquent LIMIT 5"
>  
> My query looks something like this :
>  
> select sum(*),  from  group by 
> Any guidance will help




[jira] [Commented] (KYLIN-3967) sum along with case expression does not work in query

2019-04-29 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829174#comment-16829174
 ] 

Shaofeng SHI commented on KYLIN-3967:
-

Hi hejian, "derived_column" is a "virtual" dimension which depends on the 
hosting dimension (the foreign key). Filters on a derived dimension cannot be 
pushed down to the storage engine, so there are some limitations on its use. In 
this case, please make "derived_column" a normal dimension.

> sum along with case expression does not work in query
> -
>
> Key: KYLIN-3967
> URL: https://issues.apache.org/jira/browse/KYLIN-3967
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Gladson Vas
>Priority: Blocker
> Attachments: notworkingcase.jpg, workingcase.jpg
>
>
> When I try to run a query that combines sum with a case expression,
> e.g. select sum(case when col1<0 then 0 else col1 end ) from table 
> I get the following error:
> No realization found for OLAPContext, CUBE_UNMATCHED_AGGREGATION[FunctionDesc 
> [expression=SUM, parameter=CASE(<($8, 0), 0, $8), returnType=null]], 
> rel#36838:OLAPTableScan.OLAP.[](table=[DEFAULT, table],ctx=,fields=[0, 1, 2, 
> 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
> 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 
> 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
> 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
> 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
> 100, 101, 102, 103, 104, 105, 106]) while executing SQL: "select sum (case 
> when col1 <0 then 0 else col1 end ) from table LIMIT 5"
>  
> Is there any way to support this sum case expression in the query engine?
> I also get the same error when the sum operation is done on a column derived 
> from a case expression in a subquery.
> eg: select sum(a.col1) from (select case when col1<0 then 0 else col1 end as 
> col1 from table) a
> Thanks,
> Gladson
>  




[jira] [Commented] (KYLIN-3967) sum along with case expression does not work in query

2019-04-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827812#comment-16827812
 ] 

Shaofeng SHI commented on KYLIN-3967:
-

I tested it in Kylin 2.6. This feature is not enabled by default; maybe you 
didn't change the setting as described above?

> sum along with case expression does not work in query
> -
>
> Key: KYLIN-3967
> URL: https://issues.apache.org/jira/browse/KYLIN-3967
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Gladson Vas
>Priority: Blocker
> Attachments: notworkingcase.jpg, workingcase.jpg
>
>
> When I try to run a query that combines sum with a case expression,
> e.g. select sum(case when col1<0 then 0 else col1 end ) from table 
> I get the following error:
> No realization found for OLAPContext, CUBE_UNMATCHED_AGGREGATION[FunctionDesc 
> [expression=SUM, parameter=CASE(<($8, 0), 0, $8), returnType=null]], 
> rel#36838:OLAPTableScan.OLAP.[](table=[DEFAULT, table],ctx=,fields=[0, 1, 2, 
> 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
> 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 
> 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
> 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
> 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
> 100, 101, 102, 103, 104, 105, 106]) while executing SQL: "select sum (case 
> when col1 <0 then 0 else col1 end ) from table LIMIT 5"
>  
> is there any way to support this sum case expression in the query engine?
> Also I get the same error when the sum operation is done on a column derived 
> from a case expression in a subquery.
> eg: select sum(a.col1) from (select case when col1<0 then 0 else col1 end as 
> col1 from table) a
> Thanks,
> Gladson
>  




[jira] [Updated] (KYLIN-3965) When using DriverManager - No suitable driver found for jdbc:kylin://

2019-04-27 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3965:

Fix Version/s: v2.6.2

> When using DriverManager - No suitable driver found for jdbc:kylin://
> -
>
> Key: KYLIN-3965
> URL: https://issues.apache.org/jira/browse/KYLIN-3965
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - JDBC
>Affects Versions: v2.6.1
>Reporter: Alexander
>Assignee: Alexander
>Priority: Minor
> Fix For: v2.6.2
>
> Attachments: KYLIN-3965.master.001.patch
>
>
> Caused by: java.sql.SQLException: No suitable driver found for jdbc:kylin://
>  
> This is because META-INF/services/java.sql.Driver got incorrect name
> org.apache.calcite.avatica.remote.Driver
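
For context, DriverManager discovers JDBC drivers through the service-provider 
file META-INF/services/java.sql.Driver, so a wrong class name in that file means 
the driver is never registered. Below is a minimal sketch, using only the JDK, 
that lists the driver classes the service loader can see on the classpath. The 
class and method names (DriverDiscoveryCheck, listRegisteredDrivers) are 
illustrative and are not part of Kylin or of the attached patch.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class DriverDiscoveryCheck {
    // Returns the class names of all JDBC drivers advertised through
    // META-INF/services/java.sql.Driver on the current classpath.
    // Note: iterating may throw ServiceConfigurationError if a provider
    // file names a class that cannot be loaded.
    static List<String> listRegisteredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver d : ServiceLoader.load(Driver.class)) {
            names.add(d.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print one discovered driver class per line.
        listRegisteredDrivers().forEach(System.out::println);
    }
}
```

If org.apache.kylin.jdbc.Driver is missing from the output, loading the class 
explicitly with Class.forName("org.apache.kylin.jdbc.Driver") before calling 
DriverManager.getConnection is a common workaround until the jar is fixed.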




[jira] [Commented] (KYLIN-3989) Invalid temporary table path for kylin_metadata

2019-04-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827630#comment-16827630
 ] 

Shaofeng SHI commented on KYLIN-3989:
-

Hello Frederic, Apache Kylin doesn't support MapR out of the box. You can 
contact MapR or Kyligence for support, as they offer a joint solution.

> Invalid temporary table path for kylin_metadata
> ---
>
> Key: KYLIN-3989
> URL: https://issues.apache.org/jira/browse/KYLIN-3989
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.5.2
> Environment: MapR 6.1
> Hive 2.3
>Reporter: Frederic Souchu
>Priority: Major
>
> How to reproduce:
>  * create a model
>  * define a cube
>  * build the cube
> The cube building will fail with the following Hive logs:
> {code:java}
> USE default;
> No rows affected (0.067 seconds)
> 0: jdbc:hive2://x.com>
> 0: jdbc:hive2://x.com> DROP TABLE IF EXISTS kylin_intermediate_txn_cube_99023bdd_79e8_1186_2480_28b1d352d09e;
> No rows affected (0.023 seconds)
> 0: jdbc:hive2://x.com> CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_txn_cube_99023bdd_79e8_1186_2480_28b1d352d09e
> . . . . . . . . . . . . . . . . . . . . . . .> (
> . . . . . . . . . . . . . . . . . . . . . . .> PAYMENTS_GLOBALMERCHANTUID int
> . . . . . . . . . . . . . . . . . . . . . . .> )
> . . . . . . . . . . . . . . . . . . . . . . .> STORED AS SEQUENCEFILE
> . . . . . . . . . . . . . . . . . . . . . . .> LOCATION 'maprfs:///apps/kylin_metadata/kylin-15b76d79-29ce-5782-d84e-bd33c305fc6f/kylin_intermediate_txn_cube_99023bdd_79e8_1186_2480_28b1d352d09e';
> Error: org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Got exception: 
> java.io.IOException Error: Not a directory(20), file: 
> kylin_intermediate_txn_cube_99023bdd_79e8_1186_2480_28b1d352d09e, user name: 
> mapr, ID: 5000)
> {code}
> using metastore.sh to list content gives:
> {code}
> 2019-04-26 15:25:54,231 INFO  [main] common.KylinConfig:100 : Loading 
> kylin-defaults.properties from 
> file:/opt/apache-kylin-2.5.2-bin-hbase1x/tool/kylin-tool-2.5.2.jar!/kylin-defaults.properties
> 2019-04-26 15:25:54,257 DEBUG [main] common.KylinConfig:327 : KYLIN_CONF 
> property was not set, will seek KYLIN_HOME env variable
> 2019-04-26 15:25:54,263 INFO  [main] common.KylinConfig:135 : Initialized a 
> new KylinConfig from getInstanceFromEnv : 453523494
> 2019-04-26 15:25:54,376 INFO  [main] persistence.ResourceStore:88 : Using 
> metadata url /apps/kylin_metadata@hbase for resource store
> 2019-04-26 15:25:55,684 DEBUG [main] hbase.HBaseConnection:180 : Using the 
> working dir FS for HBase: maprfs:///
> 2019-04-26 15:25:55,684 INFO  [main] hbase.HBaseConnection:257 : connection 
> is null or closed, creating a new one
> 2019-04-26 15:25:55,719 INFO  [main] client.ConnectionFactory:272 : 
> ConnectionFactory receives mapr.hbase.default.db(maprdb), set 
> clusterType(MAPR_ONLY), user(mapr), hbase_admin_connect_at_construction(false)
> 2019-04-26 15:25:55,899 DEBUG [main] hbase.HBaseConnection:306 : HTable 
> '/apps/kylin_metadata' already exists
> null
> 2019-04-26 15:25:56,400 INFO  [close-hbase-conn] hbase.HBaseConnection:136 : 
> Closing HBase connections...
> {code}




[jira] [Comment Edited] (KYLIN-3988) Weighted Average does not work on cube

2019-04-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827626#comment-16827626
 ] 

Shaofeng SHI edited comment on KYLIN-3988 at 4/27/19 2:40 PM:
--

Hello Anoop, sum(x*y) couldn't be translated to sum(X) and sum(Y) with any 
operator, so it is not supported by Kylin, unless you define a column "z" = 
"x*y" in the source table, and then define "sum(z)" as a measure. If you're not 
willing to add this column in your hive table, you can define it in a hive 
view, and then use the view as the cube fact table.


was (Author: shaofengshi):
Hello Anoop, sum(x*y) couldn't be translated to sum(x) and sum(y) with any 
operator, so it is not supported by Kylin, unless you define a column "z" = 
"x*y" in the source table, and then define "sum(z)" as a measure. If you're not 
willing to add this column in your hive table, you can define it in a hive 
view, and then use the view as the cube fact table.

> Weighted Average does not work on cube
> --
>
> Key: KYLIN-3988
> URL: https://issues.apache.org/jira/browse/KYLIN-3988
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - ODBC
>Affects Versions: v2.6.1
>Reporter: Anoop Krishnaswamy
>Priority: Critical
>
> When we try to get aggregate over multiplication of 2 metrics it throws an 
> error:
> That both of the two sides of the BinaryTupleExpression own columns is not 
> supported for * while executing SQL: "select 
> sum(riskscoreinitial*cyc_xxx_balanceAdb), cyc_xxx_cyclesdelinquent from 
> fct_profit_table_monthly group by cyc_xxx_cyclesdelinquent LIMIT 5"
>  
> My query looks something like this :
>  
> select sum(*),  from  group by 
> Any guidance will help




[jira] [Commented] (KYLIN-3988) Weighted Average does not work on cube

2019-04-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827626#comment-16827626
 ] 

Shaofeng SHI commented on KYLIN-3988:
-

Hello Anoop, sum(x*y) couldn't be translated to sum(x) and sum(y) with any 
operator, so it is not supported by Kylin, unless you define a column "z" = 
"x*y" in the source table, and then define "sum(z)" as a measure. If you're not 
willing to add this column in your hive table, you can define it in a hive 
view, and then use the view as the cube fact table.
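
To illustrate why sum(x*y) cannot be derived from sum(x) and sum(y), here is a 
small sketch with made-up numbers (the class name and the data are purely 
illustrative): two tables have identical sum(x) and sum(y), yet different 
sum(x*y), so no operator over the two pre-aggregated sums can recover it.

```java
class SumOfProductsDemo {
    // Sum of all values in the array.
    static long sum(long[] v) {
        long s = 0;
        for (long e : v) {
            s += e;
        }
        return s;
    }

    // Sum of the row-wise products x[i] * y[i].
    static long sumOfProducts(long[] x, long[] y) {
        long s = 0;
        for (int i = 0; i < x.length; i++) {
            s += x[i] * y[i];
        }
        return s;
    }

    public static void main(String[] args) {
        long[] y = {3, 4};
        long[] xA = {1, 2}; // table A
        long[] xB = {2, 1}; // table B: same values, paired with different rows
        // Both tables have sum(x) = 3 and sum(y) = 7 ...
        System.out.println(sum(xA) + " " + sum(xB) + " " + sum(y));
        // ... but sum(x*y) is 11 for A and 10 for B.
        System.out.println(sumOfProducts(xA, y) + " vs " + sumOfProducts(xB, y));
    }
}
```

This is why the product has to exist as its own column "z" before aggregation, 
as suggested above.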

> Weighted Average does not work on cube
> --
>
> Key: KYLIN-3988
> URL: https://issues.apache.org/jira/browse/KYLIN-3988
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - ODBC
>Affects Versions: v2.6.1
>Reporter: Anoop Krishnaswamy
>Priority: Critical
>
> When we try to get aggregate over multiplication of 2 metrics it throws an 
> error:
> That both of the two sides of the BinaryTupleExpression own columns is not 
> supported for * while executing SQL: "select 
> sum(riskscoreinitial*cyc_xxx_balanceAdb), cyc_xxx_cyclesdelinquent from 
> fct_profit_table_monthly group by cyc_xxx_cyclesdelinquent LIMIT 5"
>  
> My query looks something like this :
>  
> select sum(*),  from  group by 
> Any guidance will help




[jira] [Commented] (KYLIN-3964) segment overlapped, the status is "NEW", last_build_job_id is null, can not refresh/delete/build this segment

2019-04-27 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827624#comment-16827624
 ] 

Shaofeng SHI commented on KYLIN-3964:
-

[~hejian999] you can take a look at 

https://issues.apache.org/jira/browse/KYLIN-3449

It allows deleting an orphan segment (a segment with no job) and is available 
in Kylin 2.5.

> segment overlapped, the status is "NEW", last_build_job_id is null, can not  
> refresh/delete/build this segment
> --
>
> Key: KYLIN-3964
> URL: https://issues.apache.org/jira/browse/KYLIN-3964
> Project: Kylin
>  Issue Type: Bug
> Environment: kylin-2.4.0
>Reporter: hejian
>Priority: Major
> Attachments: image-2019-04-22-10-07-06-737.png, 
> image-2019-04-23-16-51-34-805.png, image-2019-04-23-16-52-55-756.png
>
>
> *Any action involving this segment cannot be executed through the API 
> because no job_id is provided.*
> !image-2019-04-22-10-07-06-737.png!
> merge/refresh/rebuild/delete does not work 
> !image-2019-04-23-16-51-34-805.png!!image-2019-04-23-16-52-55-756.png!




[jira] [Commented] (KYLIN-3967) sum along with case expression does not work in query

2019-04-26 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826833#comment-16826833
 ] 

Shaofeng SHI commented on KYLIN-3967:
-

Interesting: I made a small change to the "not working" query (moving the 
"WHERE trans_id > 10" into the inner subquery), and it now returns a result:

 

 
{code:java}
SELECT Sum(CASE
             WHEN trans_id_alias IN ( 11, 12, 13 ) THEN price_alias
             ELSE 0
           END) AS price_case_sum,
       part_dt_alias
FROM   (SELECT price_alias,
               trans_id_alias,
               part_dt_alias
        FROM   (SELECT price    AS price_alias,
                       trans_id AS trans_id_alias,
                       part_dt  AS part_dt_alias
                FROM   kylin_sales
                WHERE  trans_id > 10) AS yy
       ) AS xx
GROUP  BY part_dt_alias
ORDER  BY price_case_sum DESC
{code}
 

> sum along with case expression does not work in query
> -
>
> Key: KYLIN-3967
> URL: https://issues.apache.org/jira/browse/KYLIN-3967
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Gladson Vas
>Priority: Blocker
> Attachments: notworkingcase.jpg, workingcase.jpg
>
>
> When i try to run a query with a sum case expression combination,
> eg: select sum(case when col1<0 then 0 else col1 end ) from table 
> i get the following error:
> No realization found for OLAPContext, CUBE_UNMATCHED_AGGREGATION[FunctionDesc 
> [expression=SUM, parameter=CASE(<($8, 0), 0, $8), returnType=null]], 
> rel#36838:OLAPTableScan.OLAP.[](table=[DEFAULT, table],ctx=,fields=[0, 1, 2, 
> 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
> 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 
> 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
> 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
> 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
> 100, 101, 102, 103, 104, 105, 106]) while executing SQL: "select sum (case 
> when col1 <0 then 0 else col1 end ) from table LIMIT 5"
>  
> is there any way to support this sum case expression in the query engine?
> Also I get the same error when the sum operation is done on a column derived 
> from a case expression in a subquery.
> eg: select sum(a.col1) from (select case when col1<0 then 0 else col1 end as 
> col1 from table) a
> Thanks,
> Gladson
>  




[jira] [Commented] (KYLIN-3967) sum along with case expression does not work in query

2019-04-24 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824851#comment-16824851
 ] 

Shaofeng SHI commented on KYLIN-3967:
-

Is "sum(col2)" a pre-defined measure in the cube?

> sum along with case expression does not work in query
> -
>
> Key: KYLIN-3967
> URL: https://issues.apache.org/jira/browse/KYLIN-3967
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Gladson Vas
>Priority: Blocker
>
> When i try to run a query with a sum case expression combination,
> eg: select sum(case when col1<0 then 0 else col1 end ) from table 
> i get the following error:
> No realization found for OLAPContext, CUBE_UNMATCHED_AGGREGATION[FunctionDesc 
> [expression=SUM, parameter=CASE(<($8, 0), 0, $8), returnType=null]], 
> rel#36838:OLAPTableScan.OLAP.[](table=[DEFAULT, table],ctx=,fields=[0, 1, 2, 
> 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
> 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 
> 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
> 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
> 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
> 100, 101, 102, 103, 104, 105, 106]) while executing SQL: "select sum (case 
> when col1 <0 then 0 else col1 end ) from table LIMIT 5"
>  
> is there any way to support this sum case expression in the query engine?
> Also I get the same error when the sum operation is done on a column derived 
> from a case expression in a subquery.
> eg: select sum(a.col1) from (select case when col1<0 then 0 else col1 end as 
> col1 from table) a
> Thanks,
> Gladson
>  




[jira] [Commented] (KYLIN-3975) Can kylin accelerate query speed for natural week or natural month report?

2019-04-23 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824714#comment-16824714
 ] 

Shaofeng SHI commented on KYLIN-3975:
-

Hi Jintao,

 

Let me try to understand: say a cube has three segments:

seg1: [2019-01-20 to 2019-02-01)

seg2:  [2019-02-01 to 2019-03-01)

seg3: [2019-03-01 to 2019-03-10)

 

Assume there is a query:

select dim1, dim2, sum(x), count(distinct y) from fact_table where dt > 
'2019-01-25' and dt < '2019-03-10' group by dim1, dim2

For this query, Kylin will scan all three segments after checking each 
segment's min/max partition date, and the selected cuboid will contain "dt". If 
'month'-'week'-'dt' is defined as a hierarchy, the cuboid will contain all of 
them. In this case, the cuboid will be "dim1+dim2+month+week+dt".

 

As we can see, although the data in these segments has been merged to the month 
level, the month-level aggregates won't be used because the query condition is 
on "dt". So the performance is not as good as we expected.

A potential optimization: if a segment falls entirely within the queried 
partition date range (in this case, seg2 and seg3), and the partition date is 
used only as a filter condition (not in group by), Kylin can change the 
execution plan to use a cuboid that has no "dt". In this case it would be 
optimized to "dim1+dim2", which is much smaller than the previous cuboid, so 
the query can be much faster because the aggregation has already been done in 
the cube.
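
A rough sketch of the coverage check behind this idea, assuming day-granularity 
partition dates and half-open segment ranges [start, end). This is not Kylin 
code; the class and method names (SegmentCoverageSketch, fullyCovered) are 
hypothetical. The filter "dt > '2019-01-25' and dt < '2019-03-10'" is expressed 
here as the half-open range [2019-01-26, 2019-03-10).

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SegmentCoverageSketch {
    // Returns the names of segments [start, end) that lie entirely inside
    // the query's date range [from, to). Only these segments could be served
    // from a cuboid without the "dt" dimension.
    static List<String> fullyCovered(Map<String, LocalDate[]> segments,
                                     LocalDate from, LocalDate to) {
        List<String> covered = new ArrayList<>();
        for (Map.Entry<String, LocalDate[]> e : segments.entrySet()) {
            LocalDate start = e.getValue()[0];
            LocalDate end = e.getValue()[1];
            if (!start.isBefore(from) && !end.isAfter(to)) {
                covered.add(e.getKey());
            }
        }
        return covered;
    }

    // The three segments from the example above, in insertion order.
    static Map<String, LocalDate[]> exampleSegments() {
        Map<String, LocalDate[]> segs = new LinkedHashMap<>();
        segs.put("seg1", new LocalDate[] {LocalDate.of(2019, 1, 20), LocalDate.of(2019, 2, 1)});
        segs.put("seg2", new LocalDate[] {LocalDate.of(2019, 2, 1), LocalDate.of(2019, 3, 1)});
        segs.put("seg3", new LocalDate[] {LocalDate.of(2019, 3, 1), LocalDate.of(2019, 3, 10)});
        return segs;
    }

    public static void main(String[] args) {
        // seg1 starts before the filter range, so only seg2 and seg3 qualify.
        System.out.println(fullyCovered(exampleSegments(),
                LocalDate.of(2019, 1, 26), LocalDate.of(2019, 3, 10)));
    }
}
```

Partially covered segments such as seg1 would still need the cuboid that 
contains "dt", since the filter has to be applied row by row there.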

 

Is this what you want to discuss, or do you have a better idea? Thanks.

 

> Can kylin accelerate  query speed for natural week or natural month report?
> ---
>
> Key: KYLIN-3975
> URL: https://issues.apache.org/jira/browse/KYLIN-3975
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine, Query Engine
>Reporter: zhao jintao
>Priority: Major
>
> Hi team:
> In a big data analytics platform, we often query data for a natural week or 
> natural month.
>  For example, in bank or accounting reports, the query period is often a 
> natural week or natural month.
>  In Kylin, we can build a cube to speed up queries. However, queries are still 
> slow if the amount of data is large and the query period is long, 
> especially when using a count distinct measure.
> For example, we can add a month dimension to the cube and then merge the cube 
> by calendar month; but if the query SQL filters on the date partition, it will 
> still match the cuboid that has both the week and date dimensions, so Kylin 
> needs to fetch data from HBase and aggregate it in memory. That is also slow 
> if the amount of data is large.
> Has anyone faced the same problem? Does anyone have a better way to handle 
> natural week or natural month queries?
>  
> Best regards
> Thank you.




[jira] [Commented] (KYLIN-3967) sum along with case expression does not work in query

2019-04-23 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824158#comment-16824158
 ] 

Shaofeng SHI commented on KYLIN-3967:
-

Please check:

1) col1 is a dimension;

2) in kylin.properties, add "kylin.query.enable-dynamic-column=true" (default 
is false)

> sum along with case expression does not work in query
> -
>
> Key: KYLIN-3967
> URL: https://issues.apache.org/jira/browse/KYLIN-3967
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Gladson Vas
>Priority: Blocker
>
> When i try to run a query with a sum case expression combination,
> eg: select sum(case when col1<0 then 0 else col1 end ) from table 
> i get the following error:
> No realization found for OLAPContext, CUBE_UNMATCHED_AGGREGATION[FunctionDesc 
> [expression=SUM, parameter=CASE(<($8, 0), 0, $8), returnType=null]], 
> rel#36838:OLAPTableScan.OLAP.[](table=[DEFAULT, table],ctx=,fields=[0, 1, 2, 
> 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 
> 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 
> 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 
> 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 
> 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 
> 100, 101, 102, 103, 104, 105, 106]) while executing SQL: "select sum (case 
> when col1 <0 then 0 else col1 end ) from table LIMIT 5"
>  
> is there any way to support this sum case expression in the query engine?
> Also I get the same error when the sum operation is done on a column derived 
> from a case expression in a subquery.
> eg: select sum(a.col1) from (select case when col1<0 then 0 else col1 end as 
> col1 from table) a
> Thanks,
> Gladson
>  




[jira] [Commented] (KYLIN-3964) segment overlapped, the status is "NEW", last_build_job_id is null, can not refresh/delete/build this segment

2019-04-21 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822837#comment-16822837
 ] 

Shaofeng SHI commented on KYLIN-3964:
-

What kind of error did you get? Did you try a newer version?

> segment overlapped, the status is "NEW", last_build_job_id is null, can not  
> refresh/delete/build this segment
> --
>
> Key: KYLIN-3964
> URL: https://issues.apache.org/jira/browse/KYLIN-3964
> Project: Kylin
>  Issue Type: Bug
> Environment: kylin-2.4.0
>Reporter: hejian
>Priority: Major
> Attachments: image-2019-04-22-10-07-06-737.png
>
>
> !image-2019-04-22-10-07-06-737.png!




[jira] [Commented] (KYLIN-3962) Support streaming cubing using Spark Streaming or Flink

2019-04-21 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822820#comment-16822820
 ] 

Shaofeng SHI commented on KYLIN-3962:
-

Shaohui, we're also looking for a better solution, if there is one. 
Suggestions are welcome.

> Support streaming cubing using Spark Streaming or Flink
> ---
>
> Key: KYLIN-3962
> URL: https://issues.apache.org/jira/browse/KYLIN-3962
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Priority: Major
>
> KYLIN-3654 introduced Real-time Streaming, but in my opinion, the 
> architecture is a little too complicated to handle.
> Streaming frameworks like Spark Streaming and Flink are widely used in many 
> companies. Can we use such a streaming framework to support real-time cubing 
> in Kylin?
> This is just a proposal. More discussion and suggestions are welcome.
> More details of this proposal will be added later.
>  




[jira] [Commented] (KYLIN-2849) duplicate segment, cannot be deleted and data cannot be refreshed and merged

2019-04-21 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822816#comment-16822816
 ] 

Shaofeng SHI commented on KYLIN-2849:
-

[~hejian999] hejian, please open a new Jira with the necessary information, 
like your REST request, the error trace in kylin.log, etc. Thanks!

> duplicate segment, cannot be deleted and data cannot be refreshed and merged
> ---
>
> Key: KYLIN-2849
> URL: https://issues.apache.org/jira/browse/KYLIN-2849
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine, Metadata, REST Service
>Affects Versions: v2.0.0
> Environment: hadoop: hadoop-2.6.0-cdh5.8.2
> hive: 2.1.0
> hbase: 0.98
>Reporter: scott.zhai
>Assignee: Dong Li
>Priority: Major
>  Labels: scope
> Fix For: v2.3.0
>
> Attachments: kylin-1.png, kylin-2.png
>
>
> The cube has duplicate segments.
> They cannot be deleted, and the data cannot be refreshed or merged.
> {code}
> try
> curl -X DELETE \
>   "http://127.0.0.1:7070/kylin/api/cubes/Remain_Cube_2/segs/2017082200_2017082300" \
>   -H "Authorization: Basic QURNSU46S1lMSU4=" \
>   -H "Content-Type: application/json;charset=UTF-8"
> Cannot delete segment '2017082200_2017082300' as it is neither the 
> first nor the last 
> segment.","stacktrace":"org.apache.kylin.rest.exception.InternalErrorException:
>  Cannot delete segment '2017082200_2017082300' as it is neither the 
> first nor the last segment
> {code}
> Temporary workaround:
> {code}
> public CubeInstance deleteSegment(CubeInstance cube, String segmentName) throws IOException {
>     if (!segmentName.equals(cube.getSegments().get(0).getName())
>             && !segmentName.equals(cube.getSegments().get(cube.getSegments().size() - 1).getName())) {
>         // Workaround: the first/last-segment check is disabled.
>         //throw new IllegalArgumentException("Cannot delete segment '" + segmentName + "' as it is neither the first nor the last segment.");
>     }
>     CubeSegment toDelete = null;
>     for (CubeSegment seg : cube.getSegments()) {
>         if (seg.getName().equals(segmentName)) {
>             toDelete = seg;
>         }
>     }
>     if (toDelete == null) {
>         throw new IllegalArgumentException("Cannot find segment '" + segmentName + "'");
>     }
>     if (toDelete.getStatus() != SegmentStatusEnum.READY) {
>         // Workaround: the READY-status check is disabled.
>         //throw new IllegalArgumentException("Cannot delete segment '" + segmentName + "' as its status is not READY. Discard the on-going job for it.");
>     }
>     CubeUpdate update = new CubeUpdate(cube);
>     update.setToRemoveSegs(new CubeSegment[] { toDelete });
>     return CubeManager.getInstance(getConfig()).updateCube(update);
> }
> {code}




[jira] [Updated] (KYLIN-3957) Query system_cube get exception Cannot cast "java.math.BigDecimal" to "java.lang.Double"

2019-04-21 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3957:

Fix Version/s: (was: Future)
   v2.6.2
   v3.0.0

> Query system_cube get exception Cannot cast "java.math.BigDecimal" to 
> "java.lang.Double"
> 
>
> Key: KYLIN-3957
> URL: https://issues.apache.org/jira/browse/KYLIN-3957
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v3.0.0, v2.6.2
>
>
> In the system cube, the return dataType of a column whose real dataType is 
> Double is converted to Decimal in the SUM measure.
> {code:java}
> FunctionDesc function = new FunctionDesc();
> function.setExpression(FunctionDesc.FUNC_SUM);
> function.setParameter(parameterDesc);
> function.setReturnType(dataType.equals(HiveTableCreator.HiveTypeEnum.HDOUBLE.toString())
>         ? HiveTableCreator.HiveTypeEnum.HDECIMAL.toString()
>         : dataType);
> {code}
> But a query using that measure throws an exception:
> {code}
> Caused by: org.codehaus.commons.compiler.CompileException: Line 108, Column 
> 44: Cannot cast "java.math.BigDecimal" to "java.lang.Double"
>   at 
> org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:10092)
>   at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3839)
>   at org.codehaus.janino.UnitCompiler.access$6400(UnitCompiler.java:183)
>   at org.codehaus.janino.UnitCompiler$10.visitCast(UnitCompiler.java:3246)
>   at org.codehaus.janino.Java$Cast.accept(Java.java:3802)
>   at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3278)
>   at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:3845)
>   at org.codehaus.janino.UnitCompiler.access$8600(UnitCompiler.java:183)
>   at 
> org.codehaus.janino.UnitCompiler$10.visitParenthesizedExpression(UnitCompiler.java:3274)
>   at 
> org.codehaus.janino.Java$ParenthesizedExpression.accept(Java.java:3830)
> {code}
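
As background for the compile error above: java.math.BigDecimal and 
java.lang.Double are unrelated subclasses of Number, so a direct cast between 
them is rejected by the compiler (or fails at runtime when it goes through 
Object). The sketch below only illustrates that type mismatch and an explicit 
conversion; it is not the fix applied in Kylin, and the class name 
DecimalToDoubleDemo is made up for the example.

```java
import java.math.BigDecimal;

class DecimalToDoubleDemo {
    // Converts any numeric value to double through the common Number
    // supertype instead of casting between unrelated wrapper classes.
    static double toDouble(Object value) {
        return ((Number) value).doubleValue();
    }

    public static void main(String[] args) {
        Object sum = new BigDecimal("12.50"); // stand-in for a SUM result
        // "(Double) sum" would throw ClassCastException here, because
        // BigDecimal is not a subclass of Double.
        System.out.println(toDouble(sum));
    }
}
```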




[jira] [Updated] (KYLIN-3901) Use multi threads to speed up the storage cleanup job

2019-04-15 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3901:

Fix Version/s: (was: v3.0.0-alpha)
   v3.0.0

> Use multi threads to speed up the storage cleanup job
> -
>
> Key: KYLIN-3901
> URL: https://issues.apache.org/jira/browse/KYLIN-3901
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v3.0.0
>
>
> Currently, the storage cleanup job uses only one thread to clean up HBase 
> tables, Hive tables, and HDFS dirs.
> It would be better to use multiple threads to speed it up.




[jira] [Closed] (KYLIN-3955) Real-time streaming tech blog

2019-04-15 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-3955.
---

> Real-time streaming tech blog
> -
>
> Key: KYLIN-3955
> URL: https://issues.apache.org/jira/browse/KYLIN-3955
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Ma Gang
>Assignee: Ma Gang
>Priority: Major
>
> Real-time streaming tech blog




[jira] [Resolved] (KYLIN-3955) Real-time streaming tech blog

2019-04-15 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3955.
-
Resolution: Done

> Real-time streaming tech blog
> -
>
> Key: KYLIN-3955
> URL: https://issues.apache.org/jira/browse/KYLIN-3955
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Ma Gang
>Assignee: Ma Gang
>Priority: Major
>
> Real-time streaming tech blog




[jira] [Resolved] (KYLIN-3918) Add project name in cube and job pages

2019-04-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3918.
-
Resolution: Fixed

Merged in master. Thank you shaohui!

> Add project name in cube and job pages
> --
>
> Key: KYLIN-3918
> URL: https://issues.apache.org/jira/browse/KYLIN-3918
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: v3.0.0
>
>
> In a production cluster, there will be many projects and each project has 
> many cubes. It's useful to show the project name in the cube and job pages, 
> so the admin can quickly find out which project an abnormal cube or failed 
> job belongs to and get in contact with the users.
>  




[jira] [Updated] (KYLIN-3918) Add project name in cube and job pages

2019-04-10 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3918:

Fix Version/s: (was: v2.6.2)
   v3.0.0

> Add project name in cube and job pages
> --
>
> Key: KYLIN-3918
> URL: https://issues.apache.org/jira/browse/KYLIN-3918
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: v3.0.0
>
>
> In a production cluster, there will be many projects and each project has 
> many cubes. It's useful to show the project name in the cube and job pages, 
> so the admin can quickly find out which project an abnormal cube or failed 
> job belongs to and get in contact with the users.
>  




[jira] [Commented] (KYLIN-3562) TS conflict when kylin update metadata in HBase

2019-04-10 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814252#comment-16814252
 ] 

Shaofeng SHI commented on KYLIN-3562:
-

Thank you for the feedback. How often do you see this error?

> TS conflict when kylin update metadata in HBase
> ---
>
> Key: KYLIN-3562
> URL: https://issues.apache.org/jira/browse/KYLIN-3562
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.4.0
>Reporter: Lingang Deng
>Assignee: Jiatao Tao
>Priority: Major
> Fix For: v2.4.2, v2.5.1
>
> Attachments: image-2018-09-17-16-40-56-212.png, 
> image-2018-09-25-15-03-51-009.png, image-2018-09-25-16-43-50-277.png
>
>
> Error log was as follows,
> {code:java}
> org.apache.kylin.common.persistence.WriteConflictException: Overwriting 
> conflict /user/admin, expect old TS 1536928877043, but it is 1536928907207
>      at 
> org.apache.kylin.storage.hbase.HBaseResourceStore.checkAndPutResourceImpl(HBaseResourceStore.java:325)
>      at 
> org.apache.kylin.common.persistence.ResourceStore.checkAndPutResourceCheckpoint(ResourceStore.java:318)
>      at 
> org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:303)
>      at 
> org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:282)
>      at 
> org.apache.kylin.metadata.cachesync.CachedCrudAssist.save(CachedCrudAssist.java:192){code}
>  
> What disturbs me the most is that the error kept happening for several hours, 
> and then all my build jobs and query jobs failed.
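
The error message above has the shape of an optimistic-concurrency check: a 
write carries the timestamp the writer last read, and it is rejected if the 
stored timestamp has changed in the meantime. The following is a simplified 
in-memory sketch of that pattern, not Kylin's HBaseResourceStore; the class 
TimestampCheckedStore and its method are hypothetical, and the exception type 
is a stand-in for WriteConflictException.

```java
import java.util.HashMap;
import java.util.Map;

class TimestampCheckedStore {
    // Last written timestamp per resource path; 0 means "never written".
    private final Map<String, Long> timestamps = new HashMap<>();

    // The write succeeds only if the caller's expected timestamp matches the
    // stored one; otherwise another writer has updated the resource first.
    synchronized void checkAndPut(String path, long expectedOldTs, long newTs) {
        long current = timestamps.getOrDefault(path, 0L);
        if (current != expectedOldTs) {
            throw new IllegalStateException("Overwriting conflict " + path
                    + ", expect old TS " + expectedOldTs + ", but it is " + current);
        }
        timestamps.put(path, newTs);
    }

    public static void main(String[] args) {
        TimestampCheckedStore store = new TimestampCheckedStore();
        store.checkAndPut("/user/admin", 0L, 1536928877043L);
        // Another writer updates the resource.
        store.checkAndPut("/user/admin", 1536928877043L, 1536928907207L);
        try {
            // A writer still holding the old timestamp is rejected.
            store.checkAndPut("/user/admin", 1536928877043L, 1536928999999L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this pattern a writer that hits the conflict normally has to re-read the 
resource and retry; a writer that keeps using a stale cached timestamp will 
fail repeatedly, which would be consistent with the hours-long failures 
described above.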




[jira] [Closed] (KYLIN-3831) Error generating cuboids when dimensions exceed 62

2019-04-08 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-3831.
---

> Error generating cuboids when dimensions exceed 62
> -
>
> Key: KYLIN-3831
> URL: https://issues.apache.org/jira/browse/KYLIN-3831
> Project: Kylin
>  Issue Type: Wish
>  Components: Others
>Affects Versions: v2.3.1
>Reporter: zhangwei
>Assignee: zhangwei
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (KYLIN-3930) ArrayIndexOutOfBoundsException when building

2019-04-08 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812384#comment-16812384
 ] 

Shaofeng SHI commented on KYLIN-3930:
-

I think non-sharded storage has not been supported since v1.5, though it didn't 
report an error. Please stay on the old version, or switch to the sharded HBase 
storage (storage type = 2).

> ArrayIndexOutOfBoundsException when building
> 
>
> Key: KYLIN-3930
> URL: https://issues.apache.org/jira/browse/KYLIN-3930
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: all
>Reporter: Jacky Woo
>Priority: Major
> Fix For: v2.6.2
>
> Attachments: KYLIN-3930.master.01.patch
>
>
> h2. ArrayIndexOutOfBoundsException when building.
> I have a cube build error with kylin-2.5.0:
> {code:java}
> 2019-03-31 02:45:18,460 ERROR [main] org.apache.kylin.engine.mr.KylinMapper:
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at 
> org.apache.kylin.engine.mr.common.NDCuboidBuilder.buildKeyInternal(NDCuboidBuilder.java:106)
> at 
> org.apache.kylin.engine.mr.common.NDCuboidBuilder.buildKey(NDCuboidBuilder.java:71)
> at 
> org.apache.kylin.engine.mr.steps.NDCuboidMapper.doMap(NDCuboidMapper.java:112)
> at 
> org.apache.kylin.engine.mr.steps.NDCuboidMapper.doMap(NDCuboidMapper.java:47)
> at org.apache.kylin.engine.mr.KylinMapper.map(KylinMapper.java:77)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:796)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {code}
> I checked the code of the "NDCuboidBuilder.buildKeyInternal" method:
> {code:java}
> private void buildKeyInternal(Cuboid parentCuboid, Cuboid childCuboid,
>         ByteArray[] splitBuffers, ByteArray newKeyBodyBuf) {
>     RowKeyEncoder rowkeyEncoder = rowKeyEncoderProvider.getRowkeyEncoder(childCuboid);
>     // rowkey columns
>     long mask = Long.highestOneBit(parentCuboid.getId());
>     long parentCuboidId = parentCuboid.getId();
>     long childCuboidId = childCuboid.getId();
>     long parentCuboidIdActualLength = (long) Long.SIZE - Long.numberOfLeadingZeros(parentCuboid.getId());
>     int index = rowKeySplitter.getBodySplitOffset(); // skip shard and cuboidId
>     int offset = RowConstants.ROWKEY_SHARDID_LEN + RowConstants.ROWKEY_CUBOIDID_LEN; // skip shard and cuboidId
>     for (int i = 0; i < parentCuboidIdActualLength; i++) {
>         if ((mask & parentCuboidId) > 0) { // if this bit position equals 1
>             if ((mask & childCuboidId) > 0) { // if the child cuboid has this column
>                 System.arraycopy(splitBuffers[index].array(), splitBuffers[index].offset(),
>                         newKeyBodyBuf.array(), offset, splitBuffers[index].length());
>                 offset += splitBuffers[index].length();
>             }
>             index++;
>         }
>         mask = mask >> 1;
>     }
>     rowkeyEncoder.fillHeader(newKeyBodyBuf.array());
> }
> {code}
> I found that "offset = SHARDID_LEN + CUBOIDID_LEN", which is wrong when the cube 
> is not sharded. In my case the cube's storage type is 0, which means it is 
> not sharded.
> So I set the offset according to the cube's sharding, as below:
> {code:java}
> int offset = rowKeySplitter.getHeaderLength(); // skip shard and cuboidId
> {code}
> After this modification, the build succeeds in my environment.
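
The idea behind the change reported above can be sketched as follows: the header length is derived from whether the storage is sharded, instead of always adding both lengths. This is only an illustration, not Kylin's actual RowKeySplitter code; the class name, the `sharded` flag, and the constant values (2 bytes for the shard ID, 8 bytes for the cuboid ID) are assumptions.

```java
// Hypothetical sketch; names and constant values are assumptions, not Kylin's code.
class HeaderOffsetSketch {
    static final int SHARD_ID_LEN = 2;   // assumed shard ID length in bytes
    static final int CUBOID_ID_LEN = 8;  // assumed cuboid ID length in bytes

    // The shard ID prefix is counted only when the storage is sharded.
    static int headerLength(boolean sharded) {
        return (sharded ? SHARD_ID_LEN : 0) + CUBOID_ID_LEN;
    }

    public static void main(String[] args) {
        System.out.println("sharded header length: " + headerLength(true));
        System.out.println("non-sharded header length: " + headerLength(false));
    }
}
```

With a fixed offset of both lengths, a non-sharded row key would be written past where its body starts, which is consistent with the ArrayIndexOutOfBoundsException in the report.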




[jira] [Commented] (KYLIN-3934) sqoop import param '--null-string' result in null value become blank string in hive table

2019-04-08 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812382#comment-16812382
 ] 

Shaofeng SHI commented on KYLIN-3934:
-

Hao, would you like to raise a PR to Kylin? Thank you!

> sqoop import param '--null-string' result in null value become blank string 
> in hive table
> -
>
> Key: KYLIN-3934
> URL: https://issues.apache.org/jira/browse/KYLIN-3934
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: v2.6.0
>Reporter: wanghao
>Priority: Major
> Fix For: v2.6.2
>
>
> When a column value from JDBC is null, Sqoop writes it into the Hive table as a 
> blank string.
> eg 
> jdbc:
> A | B
> 1 | 1
> 2 | 2
> a | null
>  
> hive table:
> A | B
> 1 | 1
> 2 | 2
> a |
> Because of this, count(distinct B) returns 3 instead of 2, and it can 
> lead to other problems.
>  
>  
> {code:java}
> String cmd = String.format(Locale.ROOT,
>         "%s/bin/sqoop import" + generateSqoopConfigArgString()
>                 + "--connect \"%s\" --driver %s --username %s --password %s --query \"%s AND \\$CONDITIONS\" "
>                 + "--target-dir %s/%s --split-by %s --boundary-query \"%s\" --null-string '' "
>                 + "--fields-terminated-by '%s' --num-mappers %d",
>         sqoopHome, connectionUrl, driverClass, jdbcUser, jdbcPass, selectSql, jobWorkingDir, hiveTable,
>         splitColumn, bquery, filedDelimiter, mapperNum);
> {code}
> The param '--null-string' should be '\\N' instead of the blank string ''.
> I resolved this problem by replacing the param, but it should be configurable in 
> kylin.properties.
>  
>  
>  
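
One way the configurable null string suggested above might look is sketched below. This is an illustration only, not the patch applied in Kylin; the class name and the property key `example.sqoop.null-string` are hypothetical.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; the class and property key are assumptions for illustration.
class SqoopNullStringSketch {
    // Hypothetical property key, not a confirmed Kylin setting.
    static final String NULL_STRING_KEY = "example.sqoop.null-string";

    // Build the null-handling argument; the default renders as '\\N' on the
    // command line, the value the report recommends over the blank string.
    static String nullStringArg(Properties props) {
        String nullString = props.getProperty(NULL_STRING_KEY, "\\\\N");
        return String.format(Locale.ROOT, "--null-string '%s'", nullString);
    }

    public static void main(String[] args) {
        System.out.println(nullStringArg(new Properties()));
    }
}
```

The returned fragment would replace the hard-coded `--null-string ''` in the command template, so deployments that rely on the old behaviour could still override it.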




[jira] [Updated] (KYLIN-3933) Currently replica set related operation need refresh current front-end page

2019-04-08 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3933:

Fix Version/s: v3.0.0

> Currently replica set related operation need refresh current front-end page
> ---
>
> Key: KYLIN-3933
> URL: https://issues.apache.org/jira/browse/KYLIN-3933
> Project: Kylin
>  Issue Type: Bug
>  Components: Real-time Streaming, Web 
>Reporter: Chao Long
>Priority: Minor
> Fix For: v3.0.0
>
>





[jira] [Updated] (KYLIN-3936) MR/Spark task will still run after the job is stopped.

2019-04-08 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3936:

Fix Version/s: v2.6.2

> MR/Spark task will still run after the job is stopped.
> --
>
> Key: KYLIN-3936
> URL: https://issues.apache.org/jira/browse/KYLIN-3936
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Reporter: Guangxu Cheng
>Assignee: Guangxu Cheng
>Priority: Major
> Fix For: v2.6.2
>
>
> The "pause" command only sets the status of the job to "stopped" and does not 
> reset the status of the subtask.
> So, in SparkExecutable, we can't get the real status of the running task.




[jira] [Resolved] (KYLIN-3818) After Cube disabled, auto-merge cube job still running

2019-04-08 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3818.
-
Resolution: Fixed

> After Cube disabled, auto-merge cube job still running
> --
>
> Key: KYLIN-3818
> URL: https://issues.apache.org/jira/browse/KYLIN-3818
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v2.6.0
>Reporter: Na Zhai
>Assignee: Na Zhai
>Priority: Major
> Fix For: v3.0.0-alpha
>
>
> *precondition*
> There is a cube with the auto-merge feature turned on, and it satisfies the 
> auto-merge condition, so a job merging segments begins.
> After a few minutes, the merge job goes into the error status, so I discard 
> it. Then I disable the cube, but a new job merging segments begins to run.




[jira] [Updated] (KYLIN-3843) List kylin instances with their server mode on web

2019-04-08 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3843:

Fix Version/s: v3.0.0

> List kylin instances with their server mode on web
> --
>
> Key: KYLIN-3843
> URL: https://issues.apache.org/jira/browse/KYLIN-3843
> Project: Kylin
>  Issue Type: New Feature
>  Components: REST Service, Web 
>Reporter: nichunen
>Assignee: Jiatao Tao
>Priority: Major
> Fix For: v3.0.0
>
>
> As the Curator-based scheduler is available now, Kylin can list all nodes with 
> the same metadata URL.
> This task should include some REST APIs to fetch node information from ZK, and 
> a front-end page under the System page to display the node information.




[jira] [Commented] (KYLIN-3873) Fix inappropriate use of memory in SparkFactDistinct.java

2019-04-08 Thread Shaofeng SHI (JIRA)



[ 
https://issues.apache.org/jira/browse/KYLIN-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812318#comment-16812318
 ] 

Shaofeng SHI commented on KYLIN-3873:
-

[~Wayne0101] please port this to the 2.6.x branch; the commit can't be applied 
directly to 2.6.x.

>  Fix inappropriate use of memory in SparkFactDistinct.java
> --
>
> Key: KYLIN-3873
> URL: https://issues.apache.org/jira/browse/KYLIN-3873
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Chao Long
>Assignee: Chao Long
>Priority: Major
> Fix For: v2.6.2
>
>
> Class SparkFactDistinct.java has some inappropriate use of memory




[jira] [Resolved] (KYLIN-3879) Implement FlinkEntry

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3879.
-
Resolution: Fixed

> Implement FlinkEntry
> 
>
> Key: KYLIN-3879
> URL: https://issues.apache.org/jira/browse/KYLIN-3879
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Flink Engine
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>





[jira] [Resolved] (KYLIN-3877) Add flink specific config items for kylin properties configuration files

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3877.
-
Resolution: Fixed

> Add flink specific config items for kylin properties configuration files
> 
>
> Key: KYLIN-3877
> URL: https://issues.apache.org/jira/browse/KYLIN-3877
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Flink Engine
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> Add Flink specific configuration, such as JM/TM memory, slot num and so on.




[jira] [Updated] (KYLIN-3887) Query with decimal sum measure of double complied failed after KYLIN-3703

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3887:

Fix Version/s: v2.6.2

> Query with decimal sum measure of double complied failed after KYLIN-3703
> -
>
> Key: KYLIN-3887
> URL: https://issues.apache.org/jira/browse/KYLIN-3887
> Project: Kylin
>  Issue Type: Bug
>Reporter: Liu Shaohui
>Priority: Major
> Fix For: v2.6.2
>
>
> After KYLIN-3703, queries with a decimal sum measure of double fail to compile.
> {code:java}
> Caused by: org.codehaus.commons.compiler.CompileException: 
> Line 112, Column 42: Cannot cast "java.math.BigDecimal" to 
> "java.lang.Double"{code}
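
The compile error above arises because BigDecimal and Double are unrelated classes, so one cannot be cast to the other. A minimal sketch of converting instead of casting follows; it is illustrative only and not the change made in Kylin for this issue, and the class and method names are assumptions.

```java
import java.math.BigDecimal;

// Illustrative sketch only; not the change made in Kylin for this issue.
class DecimalToDoubleSketch {
    // Convert a numeric aggregate to Double without casting between unrelated classes.
    static Double toDouble(Number value) {
        // "(Double) someBigDecimal" does not compile; Number.doubleValue() converts instead.
        return value == null ? null : Double.valueOf(value.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(toDouble(new BigDecimal("12.5")));
    }
}
```

Note that `doubleValue()` can lose precision for decimals that a double cannot represent exactly.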




[jira] [Resolved] (KYLIN-3885) Build dimension dictionary job costs too long when using Spark fact distinct

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3885.
-
Resolution: Fixed

> Build dimension dictionary job costs too long when using Spark fact distinct
> 
>
> Key: KYLIN-3885
> URL: https://issues.apache.org/jira/browse/KYLIN-3885
> Project: Kylin
>  Issue Type: Bug
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v2.6.2
>
>
> The build dimension dictionary job takes less than 20 minutes when using 
> MapReduce fact distinct, but it takes more than 3 hours when using Spark 
> fact distinct.
> {code:java}
> "Scheduler 542945608 Job 05c62aca-853f-396e-9653-f20c9ebd8ebc-329" #329 
> prio=5 os_prio=0 tid=0x7f312109c800 nid=0x2dc0b in Object.wait() 
> [0x7f30d8d24000]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.hadoop.ipc.Client.call(Client.java:1482)
> - locked <0x0005c3110fc0> (a org.apache.hadoop.ipc.Client$Call)
> at org.apache.hadoop.ipc.Client.call(Client.java:1427)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> at com.sun.proxy.$Proxy33.delete(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:573)
> at sun.reflect.GeneratedMethodAccessor193.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:249)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:107)
> at com.sun.proxy.$Proxy34.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2057)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:682)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:696)
> at 
> org.apache.hadoop.fs.FilterFileSystem.delete(FilterFileSystem.java:232)
> at 
> org.apache.hadoop.fs.viewfs.ChRootedFileSystem.delete(ChRootedFileSystem.java:198)
> at 
> org.apache.hadoop.fs.viewfs.ViewFileSystem.delete(ViewFileSystem.java:334)
> at 
> org.apache.hadoop.hdfs.FederatedDFSFileSystem.delete(FederatedDFSFileSystem.java:232)
> at 
> org.apache.kylin.dict.global.GlobalDictHDFSStore.deleteSlice(GlobalDictHDFSStore.java:211)
> at 
> org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.flushCurrentNode(AppendTrieDictionaryBuilder.java:137)
> at 
> org.apache.kylin.dict.global.AppendTrieDictionaryBuilder.addValue(AppendTrieDictionaryBuilder.java:97)
> at 
> org.apache.kylin.dict.GlobalDictionaryBuilder.addValue(GlobalDictionaryBuilder.java:85)
> at 
> org.apache.kylin.dict.DictionaryGenerator.buildDictionary(DictionaryGenerator.java:82)
> at 
> org.apache.kylin.dict.DictionaryManager.buildDictFromReadableTable(DictionaryManager.java:303)
> at 
> org.apache.kylin.dict.DictionaryManager.buildDictionary(DictionaryManager.java:290)
> at 
> org.apache.kylin.cube.CubeManager$DictionaryAssist.buildDictionary(CubeManager.java:1043)
> at 
> org.apache.kylin.cube.CubeManager.buildDictionary(CubeManager.java:1012)
> at 
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:72)
> at 
> org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:50)
> at 
> org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:73)
> at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:92)
> at 
> org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:63)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
> at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:178)
> at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
> at 
>

[jira] [Closed] (KYLIN-3890) Add doc about usage of ./bin/metadata.sh

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI closed KYLIN-3890.
---

> Add doc about usage of ./bin/metadata.sh
> 
>
> Key: KYLIN-3890
> URL: https://issues.apache.org/jira/browse/KYLIN-3890
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Minor
>
> The JIRA title describes the JIRA.




[jira] [Updated] (KYLIN-3892) Set cubing job priority

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3892:

Fix Version/s: v2.6.2

> Set cubing job priority
> ---
>
> Key: KYLIN-3892
> URL: https://issues.apache.org/jira/browse/KYLIN-3892
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Affects Versions: v2.4.0, v2.5.0, v2.6.0
>Reporter: Temple Zhou
>Assignee: Temple Zhou
>Priority: Minor
> Fix For: v2.6.2
>
>
> The cubing job with high priority will be delayed when there are too many 
> tasks running. 
> So I want to set the job priority for the important cubing jobs.




[jira] [Resolved] (KYLIN-3890) Add doc about usage of ./bin/metadata.sh

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3890.
-
Resolution: Fixed
  Assignee: Yuzhang QIU

> Add doc about usage of ./bin/metadata.sh
> 
>
> Key: KYLIN-3890
> URL: https://issues.apache.org/jira/browse/KYLIN-3890
> Project: Kylin
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: v2.5.2
>Reporter: Yuzhang QIU
>Assignee: Yuzhang QIU
>Priority: Minor
>
> The JIRA title describes the JIRA.




[jira] [Resolved] (KYLIN-3896) Implement IFlinkOutput based on HBase

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3896.
-
Resolution: Fixed

> Implement IFlinkOutput based on HBase
> -
>
> Key: KYLIN-3896
> URL: https://issues.apache.org/jira/browse/KYLIN-3896
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Flink Engine
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>





[jira] [Resolved] (KYLIN-3897) Implement IFlinkInput based on Hive

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3897.
-
Resolution: Fixed

> Implement IFlinkInput based on Hive
> ---
>
> Key: KYLIN-3897
> URL: https://issues.apache.org/jira/browse/KYLIN-3897
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Flink Engine
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>





[jira] [Assigned] (KYLIN-3900) Discard all expired ERROR or STOPPED jobs to cleanup kylin metadata

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3900:
---

Assignee: Liu Shaohui

> Discard all expired ERROR or STOPPED jobs to cleanup kylin metadata
> ---
>
> Key: KYLIN-3900
> URL: https://issues.apache.org/jira/browse/KYLIN-3900
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
>
> Currently the metadata cleanup job only deletes expired discarded and succeeded 
> jobs. ERROR or STOPPED jobs are left behind, which may cause too much metadata 
> to accumulate in HBase in the long term.
>  
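
The selection rule proposed above could be sketched roughly as below, with ERROR and STOPPED jobs treated like discarded and succeeded ones once they have expired. The enum, the state names, and the retention check are assumptions for illustration, not Kylin's metadata cleanup code.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; the enum, states, and expiry rule are assumptions.
class JobCleanupSketch {
    enum State { READY, RUNNING, ERROR, STOPPED, DISCARDED, SUCCEED }

    // States eligible for cleanup once expired, including ERROR and STOPPED as proposed.
    static final Set<State> CLEANABLE =
            EnumSet.of(State.DISCARDED, State.SUCCEED, State.ERROR, State.STOPPED);

    // A job qualifies when its state is cleanable and it is older than the retention period.
    static boolean shouldClean(State state, long lastModifiedMs, long nowMs, long retentionMs) {
        return CLEANABLE.contains(state) && nowMs - lastModifiedMs > retentionMs;
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        System.out.println(shouldClean(State.ERROR, 0L, 40 * day, 30 * day));
        System.out.println(shouldClean(State.RUNNING, 0L, 40 * day, 30 * day));
    }
}
```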




[jira] [Assigned] (KYLIN-3901) Use multi threads to speed up the storage cleanup job

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI reassigned KYLIN-3901:
---

Assignee: Liu Shaohui

> Use multi threads to speed up the storage cleanup job
> -
>
> Key: KYLIN-3901
> URL: https://issues.apache.org/jira/browse/KYLIN-3901
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
>
> Currently, the storage cleanup job uses only one thread to clean up HBase 
> tables, Hive tables, and HDFS dirs.
> It's better to use multiple threads to speed it up.
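
A minimal sketch of the multi-threaded approach described above: independent cleanup tasks are submitted to a fixed thread pool and awaited together. The task bodies, the class name, and the thread count are placeholders chosen for illustration; this is not the implementation in Kylin's storage cleanup job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; task names and the thread count are assumptions.
class ParallelCleanupSketch {
    // Run independent cleanup tasks on a fixed pool and wait for all of them.
    static int runAll(List<Runnable> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable task : tasks) {
                futures.add(pool.submit(task));
            }
            int completed = 0;
            for (Future<?> future : futures) {
                future.get(); // surfaces any failure from the task
                completed++;
            }
            return completed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("cleanup task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Runnable> tasks = Arrays.asList(
                () -> System.out.println("clean up HBase tables"),
                () -> System.out.println("clean up Hive tables"),
                () -> System.out.println("clean up HDFS dirs"));
        System.out.println(runAll(tasks, 3) + " tasks completed");
    }
}
```

This only pays off when the tasks are independent of each other, which the issue does not state explicitly.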



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (KYLIN-3901) Use multi threads to speed up the storage cleanup job

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-3901:

Fix Version/s: v3.0.0-alpha

> Use multi threads to speed up the storage cleanup job
> -
>
> Key: KYLIN-3901
> URL: https://issues.apache.org/jira/browse/KYLIN-3901
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Major
> Fix For: v3.0.0-alpha
>
>
> Currently, the storage cleanup job uses only one thread to clean up HBase 
> tables, Hive tables, and HDFS dirs.
> It's better to use multiple threads to speed it up.




[jira] [Resolved] (KYLIN-3914) Download required dependencies jar from maven central repository for FlinkCubeHFile

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3914.
-
Resolution: Fixed

> Download required dependencies jar from maven central repository for 
> FlinkCubeHFile
> ---
>
> Key: KYLIN-3914
> URL: https://issues.apache.org/jira/browse/KYLIN-3914
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Flink Engine
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>
> It needs these dependencies:
>  * hbase-common
>  * hbase-server
>  * hbase-client
>  * hbase-protocol
>  * hbase-hadoop-compat
>  * metrics-core(yammer)
>  * htrace-core
>  




[jira] [Resolved] (KYLIN-3916) Fix cube build action issue after streaming migrate

2019-04-07 Thread Shaofeng SHI (JIRA)



 [ 
https://issues.apache.org/jira/browse/KYLIN-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI resolved KYLIN-3916.
-
Resolution: Fixed

> Fix cube build action issue after streaming migrate
> ---
>
> Key: KYLIN-3916
> URL: https://issues.apache.org/jira/browse/KYLIN-3916
> Project: Kylin
>  Issue Type: Bug
>  Components: Web 
>Reporter: Pan, Julian
>Assignee: Pan, Julian
>Priority: Major
> Fix For: v3.0.0-alpha
>
>
> Cubes cannot be built after streaming was migrated to the master branch.



