[jira] [Commented] (KYLIN-2457) Should copy the latest dictionaries on dimension tables in a batch merge job

2017-03-17 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930989#comment-15930989
 ] 

Shaofeng SHI commented on KYLIN-2457:
-

Good catch, thanks zhengdong. 

Chenney, would you like to share your tool? Kylin has a "tool" module that 
contains several CLIs for different purposes; your tool could be added there.

> Should copy the latest dictionaries on dimension tables in a batch merge job
> 
>
> Key: KYLIN-2457
> URL: https://issues.apache.org/jira/browse/KYLIN-2457
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v1.6.0
> Reporter: zhengdong
> Assignee: zhengdong
> Priority: Critical
> Fix For: v2.0.0
>
> Attachments: 0001-KYLIN-2457-Should-copy-the-latest-dictionaries-on-di.patch, kylintools.7z
>
>
> In a batch merge job, we need to create dictionaries for all dimensions of 
> the new segment. For dictionaries on dimension tables, we currently just 
> copy them from the earliest of the segments being merged. 
> However, we should select the newest dictionary for the new segment, since 
> incremental dimension tables are allowed. An older dictionary cannot find 
> the records corresponding to new keys added to a dimension table, which 
> leads to wrong query results.
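
The selection rule described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the Segment class, its fields, the copyLatestDictionaries method, and the dictionary paths are all hypothetical stand-ins and are not Kylin's actual classes or resource paths. It assumes each segment records a dictionary path per dimension-table column and that "newest" means the segment with the largest end time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MergeDictionarySketch {

    /** Hypothetical stand-in for a cube segment; NOT Kylin's own segment class. */
    static final class Segment {
        final String name;
        final long endTime;                          // end of the segment's date range
        final Map<String, String> dictionaryPaths;   // column -> dictionary path

        Segment(String name, long endTime, Map<String, String> dictionaryPaths) {
            this.name = name;
            this.endTime = endTime;
            this.dictionaryPaths = dictionaryPaths;
        }
    }

    /**
     * For each dimension-table column, pick the dictionary of the newest
     * merging segment that has one, instead of the earliest segment's.
     */
    static Map<String, String> copyLatestDictionaries(List<Segment> mergingSegments,
                                                      List<String> lookupColumns) {
        if (mergingSegments.isEmpty()) {
            throw new IllegalArgumentException("no segments to merge");
        }
        // Sort a copy so the caller's list is left untouched; newest segment first.
        List<Segment> newestFirst = new ArrayList<>(mergingSegments);
        newestFirst.sort(Comparator.comparingLong((Segment s) -> s.endTime).reversed());

        Map<String, String> result = new LinkedHashMap<>();
        for (String column : lookupColumns) {
            for (Segment segment : newestFirst) {
                String path = segment.dictionaryPaths.get(column);
                if (path != null) {
                    result.put(column, path);
                    break; // the newest available dictionary wins
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> older = new LinkedHashMap<>();
        older.put("DIM.CITY", "/dict/DIM/CITY/v1");
        Map<String, String> newer = new LinkedHashMap<>();
        newer.put("DIM.CITY", "/dict/DIM/CITY/v2"); // built after new keys were added

        List<Segment> merging = Arrays.asList(
                new Segment("seg-1", 100L, older),
                new Segment("seg-2", 200L, newer));

        // prints {DIM.CITY=/dict/DIM/CITY/v2}
        System.out.println(copyLatestDictionaries(merging, Arrays.asList("DIM.CITY")));
    }
}
```

Copying from the earliest segment would have returned the v1 path here, which is the case the report describes: keys added to the dimension table after seg-1 was built would be missing from the merged segment's dictionary.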



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KYLIN-2457) Should copy the latest dictionaries on dimension tables in a batch merge job

2017-02-28 Thread Dayue Gao (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887881#comment-15887881
 ] 

Dayue Gao commented on KYLIN-2457:
--

Merged to master 
https://github.com/apache/kylin/commit/a8001226b2a07cd553e680b7e14de9bf8c9981f3

[~zhengd], nice work! Thank you for your contribution!

> Should copy the latest dictionaries on dimension tables in a batch merge job
> 
>
> Key: KYLIN-2457
> URL: https://issues.apache.org/jira/browse/KYLIN-2457
> Project: Kylin
>  Issue Type: Bug
>Reporter: zhengdong
>Priority: Critical
> Attachments: 
> 0001-KYLIN-2457-Should-copy-the-latest-dictionaries-on-di.patch


[jira] [Commented] (KYLIN-2457) Should copy the latest dictionaries on dimension tables in a batch merge job

2017-02-28 Thread zhengdong (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15887498#comment-15887498
 ] 

zhengdong commented on KYLIN-2457:
--

Thanks Dayue Gao for your advice. I have updated the patch file.

> Should copy the latest dictionaries on dimension tables in a batch merge job
> 
>
> Key: KYLIN-2457
> URL: https://issues.apache.org/jira/browse/KYLIN-2457
> Project: Kylin
>  Issue Type: Bug
>Reporter: zhengdong
>Priority: Critical
> Attachments: 
> 0001-KYLIN-2457-Should-copy-the-latest-dictionaries-on-di.patch


[jira] [Commented] (KYLIN-2457) Should copy the latest dictionaries on dimension tables in a batch merge job

2017-02-22 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878475#comment-15878475
 ] 

Billy Liu commented on KYLIN-2457:
--

+1

> Should copy the latest dictionaries on dimension tables in a batch merge job
> 
>
> Key: KYLIN-2457
> URL: https://issues.apache.org/jira/browse/KYLIN-2457
> Project: Kylin
>  Issue Type: Bug
>Reporter: zhengdong
>Priority: Critical
> Attachments: 
> KYLIN-2457-Should-copy-the-latest-dictionaries-on-di.patch


[jira] [Commented] (KYLIN-2457) Should copy the latest dictionaries on dimension tables in a batch merge job

2017-02-22 Thread Dayue Gao (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878043#comment-15878043
 ] 

Dayue Gao commented on KYLIN-2457:
--

+1. Hi [~zhengd], could you also update the comments of 
{{makeDictForNewSegment}} and {{makeSnapshotForNewSegment}}?

> Should copy the latest dictionaries on dimension tables in a batch merge job
> 
>
> Key: KYLIN-2457
> URL: https://issues.apache.org/jira/browse/KYLIN-2457
> Project: Kylin
>  Issue Type: Bug
>Reporter: zhengdong
>Priority: Critical
> Attachments: 
> KYLIN-2457-Should-copy-the-latest-dictionaries-on-di.patch