[
https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoxiang Yu updated KYLIN-4141:
--------------------------------
Description:
h2. Background
Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for non-integer
types.
Because there is currently no way to encode strings immediately as they arrive,
I want to use RocksDB & HBase to implement a streaming distributed dictionary.
h2. Design
# Each receiver owns a local dict cache.
# All receivers share a remote dict storage.
# RocksDB serves as the local dict cache.
# HBase serves as the remote dict storage.
# For each cube, we create a local dict and an HBase table.
# For each column that occurs in COUNT_DISTINCT, we create a column family in
both RocksDB and HBase.
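The design above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Kylin implementation: plain in-memory maps stand in for RocksDB and HBase, the class names RemoteDictStore and ReceiverDict are hypothetical, and the lookup order (local cache first, then the remote storage, which assigns a new id on a miss) is one plausible reading of the list, since the issue text does not spell it out. One RemoteDictStore instance corresponds to the HBase table of a single cube, and each inner map corresponds to the column family of one COUNT_DISTINCT column.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the shared remote dict storage (HBase in the design). */
class RemoteDictStore {
    // column name -> (string value -> integer id); one inner map per "column family"
    private final Map<String, ConcurrentHashMap<String, Integer>> families = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> nextIds = new ConcurrentHashMap<>();

    /** Returns the id already stored for the value, or atomically assigns the next one. */
    int getOrAssign(String column, String value) {
        ConcurrentHashMap<String, Integer> family =
                families.computeIfAbsent(column, c -> new ConcurrentHashMap<>());
        AtomicInteger counter = nextIds.computeIfAbsent(column, c -> new AtomicInteger());
        return family.computeIfAbsent(value, v -> counter.incrementAndGet());
    }
}

/** Hypothetical stand-in for one receiver's local dict cache (RocksDB in the design). */
class ReceiverDict {
    private final Map<String, Map<String, Integer>> localCache = new HashMap<>();
    private final RemoteDictStore remote;
    int remoteLookups = 0; // counts local-cache misses, for illustration only

    ReceiverDict(RemoteDictStore remote) {
        this.remote = remote;
    }

    /** Assumed flow: try the local cache first, fall back to the shared remote storage. */
    int encode(String column, String value) {
        Map<String, Integer> family = localCache.computeIfAbsent(column, c -> new HashMap<>());
        Integer id = family.get(value);
        if (id == null) {
            remoteLookups++;
            id = remote.getOrAssign(column, value);
            family.put(value, id); // cache locally so later lookups stay on the receiver
        }
        return id;
    }
}

class DictSketchDemo {
    public static void main(String[] args) {
        RemoteDictStore cubeStore = new RemoteDictStore(); // one store per cube
        ReceiverDict receiverA = new ReceiverDict(cubeStore);
        ReceiverDict receiverB = new ReceiverDict(cubeStore);

        // Both receivers see the same id for the same string in the same column.
        System.out.println(receiverA.encode("user_id", "alice"));
        System.out.println(receiverB.encode("user_id", "alice"));
        // A repeated value on the same receiver is served from its local cache.
        receiverA.encode("user_id", "alice");
        System.out.println("remote lookups by A: " + receiverA.remoteLookups);
    }
}
```

In this sketch the shared store is the only place where ids are assigned, so two receivers never disagree on the id of a string, while the local cache keeps repeated values from hitting the remote storage again.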
h2. Design Diagram
!image-2019-09-20-19-04-47-937.png!
!image-2019-09-20-19-04-55-935.png!
!image-2019-09-20-20-06-15-960.png!
> Build Global Dictionary in no time
> ----------------------------------
>
> Key: KYLIN-4141
> URL: https://issues.apache.org/jira/browse/KYLIN-4141
> Project: Kylin
> Issue Type: Improvement
> Components: Real-time Streaming
> Affects Versions: v3.0.0-beta
> Reporter: Xiaoxiang Yu
> Assignee: Xiaoxiang Yu
> Priority: Major
> Fix For: v3.0.0-beta
>
> Attachments: image-2019-09-20-19-04-47-937.png,
> image-2019-09-20-19-04-55-935.png, image-2019-09-20-20-06-15-960.png,
> image-2019-09-22-14-54-07-772.png
>
>
> h2. Background
> Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for
> non-integer types.
> Because there is currently no way to encode strings immediately as they
> arrive, I want to use RocksDB & HBase to implement a streaming distributed
> dictionary.
> h2. Design
> # Each receiver owns a local dict cache.
> # All receivers share a remote dict storage.
> # RocksDB serves as the local dict cache.
> # HBase serves as the remote dict storage.
> # For each cube, we create a local dict and an HBase table.
> # For each column that occurs in COUNT_DISTINCT, we create a column family in
> both RocksDB and HBase.
> h2. Design Diagram
> !image-2019-09-20-19-04-47-937.png!
> !image-2019-09-20-19-04-55-935.png!
>
> !image-2019-09-20-20-06-15-960.png!