Hi guys,

I have written a simple proposal for real-time support in Kylin, outlined below. Please help review it!
1. Kafka + Storm will build an inverted index in memory. This index will be inserted into HBase in batches (e.g. every 5 minutes).
2. The inverted index in HBase will hold the short-term data (e.g. 7 days). It will be converted into a data cube in batches (e.g. every 7 days).
3. The data cube in HBase will hold the long-term data.
4. The query engine will decide whether to use the inverted index or the data cube in HBase based on the query's time range.

In the future, the query engine could also use the in-memory inverted index in Storm, which would reduce data latency from minutes to seconds.

Thanks,
Jiang Xu
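
P.S. To make point 4 a bit more concrete, here is a rough sketch of how the query engine might split a query by time range. This is only an illustration under assumptions: the names `Store`, `plan_query` and `cube_end` are hypothetical and not existing Kylin APIs, and `cube_end` stands for the point in time up to which the inverted index has already been converted into the data cube.

```python
from datetime import datetime
from enum import Enum


class Store(Enum):
    """Hypothetical labels for the two HBase-backed stores in the proposal."""
    DATA_CUBE = "data_cube"            # long-term data
    INVERTED_INDEX = "inverted_index"  # short-term data


def plan_query(start, end, cube_end):
    """Split the half-open range [start, end) into per-store segments.

    cube_end is an assumed boundary: data before it lives in the data cube,
    data from it onwards still lives in the inverted index.
    Returns a list of (store, segment_start, segment_end) tuples.
    """
    if start >= end:
        return []  # empty range, nothing to scan
    plan = []
    if start < cube_end:
        # The older part of the range is served by the data cube.
        plan.append((Store.DATA_CUBE, start, min(end, cube_end)))
    if end > cube_end:
        # The newer part of the range is served by the inverted index.
        plan.append((Store.INVERTED_INDEX, max(start, cube_end), end))
    return plan


if __name__ == "__main__":
    cube_end = datetime(2015, 1, 8)
    for segment in plan_query(datetime(2015, 1, 1), datetime(2015, 1, 10), cube_end):
        print(segment)
```

A query that falls entirely on one side of the boundary touches only one store; a query that straddles it is split into two scans whose results would then need to be merged. The same idea could later be extended with a third segment for the in-memory index in Storm.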
