[ 
https://issues.apache.org/jira/browse/KYLIN-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangxiaojing updated KYLIN-4335:
--------------------------------
    Description: 
At present, a global dictionary column can reuse the dictionary of another 
column in the same cube, but it cannot reuse dictionaries from other cubes in 
the cluster.
This issue proposes allowing global dictionaries to be reused across cubes as 
well, which can greatly reduce dictionary build time and avoid building the 
same dictionary repeatedly.

*Global domain dictionary, general implementation approach:*

 
{panel:title=Definition}
The global dictionary column of a cube can depend on the global dictionary 
column of any other cube in the cluster. This enables dictionary reuse, 
reduces repeated builds, and supports implementing OneID. At present, Kylin 
supports reusing global dictionary columns within the same cube, but not 
across cubes or across projects.
{panel}
 
{panel:title=Step implementation:}
Step 1: When the user needs a global domain dictionary, enable the global 
dictionary option under advanced dictionaries and set the reuse column 
(model_name.cube_name.table_name.column).

Step 2: Wherever the getDictionary(CubeSegment cubeSeg, TblColRef col) method 
is called, Kylin substitutes the reused dictionary's path for the original 
column's dictionary path. The implementation is similar to the existing reuse 
within a single cube.

Step 3: During the cube build, the main change is in uploading dictionary data 
to HDFS when building baseid. The project, model, and cube being used must be 
uploaded together with the dictionary information.
{panel}
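Steps 1 and 2 can be illustrated with a minimal Java sketch. It is not Kylin code: the class DomainDictSketch, its methods, and the "/dict/cube/table/column" path layout are hypothetical names chosen for illustration, and the sketch assumes none of the four parts of the reuse reference contains a dot. It only shows the idea of parsing the reuse column reference and falling back to the column's own dictionary when no reuse is declared.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of cross-cube dictionary reuse; none of these names are Kylin APIs.
final class DomainDictSketch {

    /** A parsed reuse reference of the form model_name.cube_name.table_name.column. */
    static final class ReuseRef {
        final String model;
        final String cube;
        final String table;
        final String column;

        ReuseRef(String model, String cube, String table, String column) {
            this.model = model;
            this.cube = cube;
            this.table = table;
            this.column = column;
        }

        /** Splits the reference on dots; assumes no part contains a dot itself. */
        static ReuseRef parse(String ref) {
            String[] parts = ref.split("\\.", -1);
            if (parts.length != 4) {
                throw new IllegalArgumentException(
                        "expected model.cube.table.column, got: " + ref);
            }
            for (String part : parts) {
                if (part.isEmpty()) {
                    throw new IllegalArgumentException("empty part in: " + ref);
                }
            }
            return new ReuseRef(parts[0], parts[1], parts[2], parts[3]);
        }
    }

    // Maps a column of the current cube to the column whose dictionary it reuses.
    private final Map<String, ReuseRef> reuseByColumn = new HashMap<>();

    private static String key(String cube, String table, String column) {
        return cube + "/" + table + "/" + column;
    }

    /** Step 1: record that a column reuses another cube's global dictionary. */
    void declareReuse(String cube, String table, String column, String reuseRef) {
        reuseByColumn.put(key(cube, table, column), ReuseRef.parse(reuseRef));
    }

    /** Step 2: return the reused dictionary's path if declared, else the column's own. */
    String dictionaryPath(String cube, String table, String column) {
        ReuseRef ref = reuseByColumn.get(key(cube, table, column));
        if (ref != null) {
            // Assumed path layout, for illustration only.
            return "/dict/" + key(ref.cube, ref.table, ref.column);
        }
        return "/dict/" + key(cube, table, column);
    }
}
```

In this sketch a column with a declared reuse resolves to the other cube's dictionary location, while every other column keeps its own; a real implementation would apply the same substitution at the point where getDictionary(CubeSegment cubeSeg, TblColRef col) looks up the dictionary.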
 

  was:
At present, global dictionary can reuse other columns in the same cube, but 
cannot reuse the dictionaries of other cubes in the cluster.
It is suggested that not only the global dictionaries of the same cube can be 
reused, but also the global dictionaries of other cubes can be reused, which 
can greatly reduce the build time of dictionaries and avoid the repeated build 
of dictionaries.


> Reuse global dictionary from other cube,domain dict
> ---------------------------------------------------
>
>                 Key: KYLIN-4335
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4335
>             Project: Kylin
>          Issue Type: New Feature
>          Components: Job Engine
>    Affects Versions: Future
>            Reporter: wangxiaojing
>            Assignee: wangxiaojing
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
