[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-22 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: (was: mock_message.py)

> Build Global Dictionary in no time
> --
>
> Key: KYLIN-4141
> URL: https://issues.apache.org/jira/browse/KYLIN-4141
> Project: Kylin
> Issue Type: Improvement
> Components: Real-time Streaming
> Affects Versions: v3.0.0-beta
> Reporter: Xiaoxiang Yu
> Assignee: Xiaoxiang Yu
> Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: image-2019-09-20-19-04-47-937.png, 
> image-2019-09-20-19-04-55-935.png, image-2019-09-20-20-06-15-960.png, 
> image-2019-09-22-14-54-07-772.png, image-2019-09-22-16-10-41-831.png, 
> image-2019-09-22-16-14-15-963.png, image-2019-09-22-16-28-54-593.png, 
> image-2019-09-22-20-40-06-476.png, image-2019-09-22-20-41-13-146.png, 
> mock_message.py
>
>
> h2. Background
> Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for 
> non-integer types.
> Because it lacks the ability to encode strings on the fly, I want to use 
> RocksDB and HBase to implement a streaming distributed dictionary. 
> h2. Design
>  # Each receiver owns a local dict cache.
>  # All receivers share a remote dict storage.
>  # RocksDB serves as the local dict cache.
>  # HBase serves as the remote dict storage.
>  
>  # For each cube, we create a local dict and an HBase table.
>  # We create a column family in both RocksDB and HBase for each column 
> that occurs in COUNT_DISTINCT.
> h2. Design Diagram
> !image-2019-09-20-19-04-47-937.png!
> !image-2019-09-20-19-04-55-935.png!
>  
> !image-2019-09-20-20-06-15-960.png!
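
The two-tier design described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Kylin's implementation: plain in-memory Python dicts stand in for RocksDB and HBase, the class and method names (`RemoteDictStore`, `Receiver`, `get_or_assign`, `encode`) are hypothetical, and the ID allocation scheme (next sequential integer per column) is assumed because the issue does not specify one.

```python
import threading
from collections import defaultdict


class RemoteDictStore:
    """In-memory stand-in for the shared remote dict storage (HBase in the proposal)."""

    def __init__(self):
        self._lock = threading.Lock()
        # One mapping per column, mirroring "one column family per column".
        self._families = defaultdict(dict)

    def get_or_assign(self, column, value):
        """Return the integer ID of value, assigning the next free ID if it is new."""
        with self._lock:
            family = self._families[column]
            if value not in family:
                # Assumed allocation scheme: sequential IDs starting at 1.
                family[value] = len(family) + 1
            return family[value]


class Receiver:
    """A streaming receiver with its own local dict cache (RocksDB in the proposal)."""

    def __init__(self, remote):
        self._remote = remote
        self._local = defaultdict(dict)  # per-column local cache
        self.remote_lookups = 0          # counts cache misses

    def encode(self, column, value):
        """Encode a string as an integer, consulting the remote store on a miss."""
        family = self._local[column]
        if value in family:
            return family[value]
        self.remote_lookups += 1
        encoded = self._remote.get_or_assign(column, value)
        family[value] = encoded
        return encoded


remote = RemoteDictStore()
receiver_a = Receiver(remote)
receiver_b = Receiver(remote)

id_a = receiver_a.encode("user_id", "alice")
id_b = receiver_b.encode("user_id", "alice")  # same ID as on receiver_a
id_c = receiver_a.encode("user_id", "bob")
print(id_a, id_b, id_c)
```

Because every receiver resolves unknown values through the same remote store, a given string maps to the same integer on all receivers, which is the property a bitmap-based COUNT_DISTINCT needs; the local cache only saves repeated remote lookups.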



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-22 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: mock_message.py



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-22 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: mock_message.py



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-22 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Description: 
h2. Background
Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.
Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.
 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.

h2. Design Diagram

!image-2019-09-20-19-04-47-937.png!

!image-2019-09-20-19-04-55-935.png!

 
!image-2019-09-20-20-06-15-960.png!

  was:
h2.  
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.

h2. Design Diagram

!image-2019-09-20-19-04-47-937.png!

!image-2019-09-20-19-04-55-935.png!

 

!image-2019-09-20-20-06-15-960.png!

 




[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Description: 
h2.  
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.

h2. Design Diagram

!image-2019-09-20-19-04-47-937.png!

!image-2019-09-20-19-04-55-935.png!

 

!image-2019-09-20-20-06-15-960.png!

 

  was:
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.

 

!image-2019-09-20-19-04-47-937.png!

!image-2019-09-20-19-04-55-935.png!




[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: image-2019-09-20-20-06-15-960.png



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Description: 
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.

 

!image-2019-09-20-19-04-47-937.png!

!image-2019-09-20-19-04-55-935.png!

  was:
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.




[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: image-2019-09-20-19-04-47-937.png



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Attachment: image-2019-09-20-19-04-55-935.png



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-20 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Description: 
h2. Background

Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer 
types.

Because it lacks the ability to encode strings on the fly, I want to use 
RocksDB and HBase to implement a streaming distributed dictionary. 
h2. Design
 # Each receiver owns a local dict cache.
 # All receivers share a remote dict storage.
 # RocksDB serves as the local dict cache.
 # HBase serves as the remote dict storage.

 
 # For each cube, we create a local dict and an HBase table.
 # We create a column family in both RocksDB and HBase for each column that 
occurs in COUNT_DISTINCT.



[jira] [Updated] (KYLIN-4141) Build Global Dictionary in no time

2019-09-16 Thread Xiaoxiang Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoxiang Yu updated KYLIN-4141:

Fix Version/s: v3.0.0-beta





--
This message was sent by Atlassian Jira
(v8.3.2#803003)