[ https://issues.apache.org/jira/browse/KYLIN-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoxiang Yu updated KYLIN-4141:
--------------------------------
    Attachment:     (was: mock_message.py)

> Build Global Dictionary in no time
> ----------------------------------
>
>                 Key: KYLIN-4141
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4141
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Real-time Streaming
>    Affects Versions: v3.0.0-beta
>            Reporter: Xiaoxiang Yu
>            Assignee: Xiaoxiang Yu
>            Priority: Major
>             Fix For: v3.0.0-beta
>
>         Attachments: image-2019-09-20-19-04-47-937.png,
> image-2019-09-20-19-04-55-935.png, image-2019-09-20-20-06-15-960.png,
> image-2019-09-22-14-54-07-772.png, image-2019-09-22-16-10-41-831.png,
> image-2019-09-22-16-14-15-963.png, image-2019-09-22-16-28-54-593.png,
> image-2019-09-22-20-40-06-476.png, image-2019-09-22-20-41-13-146.png,
> mock_message.py
>
>
> h2. Background
> Currently, real-time OLAP does not support COUNT_DISTINCT(bitmap) for
> non-integer types.
> Because real-time OLAP lacks the ability to encode strings immediately, I
> want to use RocksDB and HBase to implement a streaming distributed dictionary.
> h2. Design
> # Each receiver owns a local dict cache.
> # All receivers share a remote dict storage.
> # RocksDB serves as the local dict cache.
> # HBase serves as the remote dict storage.
>
> # For each cube, we create a local dict and an HBase table.
> # In both RocksDB and HBase, we create a column family for each column
> that occurs in COUNT_DISTINCT.
> h2. Design Diagram
> !image-2019-09-20-19-04-47-937.png!
> !image-2019-09-20-19-04-55-935.png!
>
> !image-2019-09-20-20-06-15-960.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
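
The design in the issue (a local dict cache per receiver in front of one shared remote dict storage) can be illustrated with a minimal Python sketch. This is not Kylin code: the class names `RemoteDictStore` and `ReceiverDict` are hypothetical, plain in-memory dicts stand in for HBase and RocksDB, and assigning sequential integer IDs is an assumption, since the issue does not say how IDs are allocated.

```python
import threading
from collections import defaultdict


class RemoteDictStore:
    """In-memory stand-in for the shared remote dict storage (HBase in the design)."""

    def __init__(self):
        self._lock = threading.Lock()
        # column -> {string value -> integer id}; one mapping per "column family"
        self._families = defaultdict(dict)

    def get_or_assign(self, column, value):
        # Assumption: ids are sequential integers, assigned under a lock
        # so that every receiver sees the same id for the same string.
        with self._lock:
            family = self._families[column]
            if value not in family:
                family[value] = len(family) + 1
            return family[value]


class ReceiverDict:
    """Per-receiver dictionary: a local cache in front of the shared remote store."""

    def __init__(self, remote):
        self._remote = remote
        # Stand-in for the RocksDB local cache, again keyed by column.
        self._local = defaultdict(dict)

    def encode(self, column, value):
        cache = self._local[column]
        if value not in cache:
            # Local miss: ask the remote store, then remember the answer.
            cache[value] = self._remote.get_or_assign(column, value)
        return cache[value]


remote = RemoteDictStore()          # one store per cube in this sketch
receiver_a = ReceiverDict(remote)
receiver_b = ReceiverDict(remote)

id_a = receiver_a.encode("user_id", "alice")
id_b = receiver_b.encode("user_id", "alice")
other = receiver_b.encode("user_id", "bob")
```

Both receivers obtain the same integer for `"alice"`, which is what a bitmap-based COUNT_DISTINCT needs when rows for one cube are ingested by several receivers. After the first lookup, each receiver answers from its local cache without contacting the remote store.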