Gas is mostly right, with one addition: a query can hit both the inverted index and the cube if it asks for both latest and historic data. The results from the two sources are aggregated at query time.
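
To make the routing concrete, here is a minimal sketch of the idea, not Kylin's actual code. It assumes a single time boundary (`cube_boundary`) up to which the cube on HBase has been built, with anything newer living only in the inverted index. All names here (`route_and_aggregate`, `make_backend`, `cube_boundary`) are hypothetical, and both stores are stubbed with in-memory rows.

```python
from collections import Counter
from datetime import datetime


def make_backend(rows):
    """Build a toy store: rows are (timestamp, key, quantity) tuples."""
    def query(start, end):
        # Sum quantities per key for rows with start <= timestamp < end.
        sums = Counter()
        for ts, key, qty in rows:
            if start <= ts < end:
                sums[key] += qty
        return dict(sums)
    return query


def route_and_aggregate(start, end, cube_boundary, query_cube, query_index):
    """Split [start, end) at cube_boundary and merge the partial results."""
    totals = Counter()
    if start < cube_boundary:
        # Historic part of the range: served by the pre-built cube.
        totals.update(query_cube(start, min(end, cube_boundary)))
    if end > cube_boundary:
        # Latest part of the range: served by the inverted index.
        totals.update(query_index(max(start, cube_boundary), end))
    return dict(totals)


# Assumed toy data: the cube covers everything before 2015-09-18.
boundary = datetime(2015, 9, 18)
cube = make_backend([
    (datetime(2015, 9, 1), "A", 10),
    (datetime(2015, 9, 10), "B", 5),
])
index = make_backend([
    (datetime(2015, 9, 18, 8), "A", 3),
])

# A query spanning both historic and latest data touches both stores.
result = route_and_aggregate(
    datetime(2015, 9, 1), datetime(2015, 9, 19), boundary, cube, index
)
print(result)
```

A query that ends before the boundary only touches the cube, and one that starts after it only touches the inverted index. Summing works here because the measure is additive; measures such as distinct counts would need a different merge step.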

On Fri, Sep 18, 2015 at 11:26 PM, Gaspare Maria <[email protected]> wrote:

> Hi,
>
> so if I understood the idea behind Kylin Real Time is:
>
> * Inverted Indexes (maybe Lucene or inverted indexes on HBase) will
>   be built according to CUBE Schema in near-realtime by using Spark
>   (streaming) Kafka Consumers;
> * On query Time if the query impacts latest data it will be routed to
>   Inverted Indexes otherwise on the CUBE on HBase.
> * Query that impacts latest data should be limited due to limitation
>   of inverted indexes;
> * Query on long period of time back (e.g. from now back to 2 months
>   ago) will be routed part on HBase and part on Inverted Indexes.
>
> Am I right?
>
> Regards,
>
> -- gas
>
> On 09/18/2015 12:35 AM, Henry Saputra wrote:
>
>> Awesome, thanks Luke
>>
>> On Thu, Sep 17, 2015 at 2:37 AM, Luke Han <[email protected]> wrote:
>>
>>> Here's JIRA: https://issues.apache.org/jira/browse/KYLIN-599
>>>
>>> Best Regards!
>>> ---------------------
>>>
>>> Luke Han
>>>
>>> On Thu, Sep 17, 2015 at 1:09 AM, Henry Saputra <[email protected]> wrote:
>>>
>>>> That is good to know. Li Yang, Luke, could one of you share the design
>>>> document for this realtime OLAP query in the JIRA?
>>>>
>>>> Thanks,
>>>>
>>>> - Henry
>>>>
>>>> On Tue, Sep 15, 2015 at 11:12 PM, Li Yang <[email protected]> wrote:
>>>>
>>>>>> There will be incremental updates on the existing cubes, but during
>>>>>> that updates I suppose no queries will be ran against them?
>>>>>
>>>>> Yes, it's mini batch, usually at minutes interval. And of course cube CAN
>>>>> serve query while the mini incremental is under built. How can we let the
>>>>> cube offline every few minutes, that's impossible. :-)
>>>>>
>>>>> On Tue, Sep 15, 2015 at 6:41 PM, Sarnath <[email protected]> wrote:
>>>>>
>>>>>> Inverted index? That sounds interesting. We use inverted index to serve
>>>>>> the cubes in our internal implementation.
>>>>>>
>>>>>> I come from Big Data Center of excellence from an Indian IT major.
>>>>>>
>>>>>> We have been experimenting with the idea of serving cubes through
>>>>>> ElasticSearch REST API. This is not related to Kylin. This is our own
>>>>>> internal development.
>>>>>>
>>>>>> The motivation for this is --- Once the cube is built, it needs to be
>>>>>> served.
>>>>>>
>>>>>> The query looks somewhat like this:
>>>>>>
>>>>>> "Given ProductID=*, Year=2015, Fetch All Quantities Sold"
>>>>>>
>>>>>> "Given ProductID=XX, Fetch how much it has sold every Month"
>>>>>>
>>>>>> Find all entries that match K1=V1, K2=V2
>>>>>>
>>>>>> This relieves us from lot of things - storage, REST API etc. and makes
>>>>>> the cubes easily searchable.
>>>>>>
>>>>>> However, we don't do SQL/MDX on top of it. Tableau 9.1Beta is
>>>>>> experimenting with Web-Data-Connector which we believe can be used for
>>>>>> Visualization... Apart from that, we experimented with a few
>>>>>> auto-generated Kibana dashboards which were just okay. But Kibana was
>>>>>> not designed for Cubes and so it has its own limitations.
>>>>>>
>>>>>> Appreciate any feedback!
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Sarnath
>>>>>> I also think that it's a mini batch cubing. It's time to bring back
>>>>>> the inverted index into roadmap. The inverted index will be the true
>>>>>> real-time solution and can provide the low-level query capability on
>>>>>> the raw data.
>>>>>>
>>>>>> Thanks!
>>>>>> JiangXu
>>>>>>
>>>>>> ------------------ Original Message ------------------
>>>>>> From: "Henry Saputra";<[email protected]>;
>>>>>> Sent: Tuesday, September 15, 2015, 12:39 PM
>>>>>> To: "[email protected]"<[email protected]>;
>>>>>> Subject: Re: Kylin Real time
>>>>>>
>>>>>> Ok, but that still seems like mini batch to me.
>>>>>>
>>>>>> There will be incremental updates on the existing cubes, but during
>>>>>> that updates I suppose no queries will be ran against them?
>>>>>>
>>>>>> - Henry
>>>>>>
>>>>>> On Mon, Sep 14, 2015 at 12:33 AM, Li Yang <[email protected]> wrote:
>>>>>>
>>>>>>> Streaming OLAP provides Near-Realtime analysis where data delay can
>>>>>>> be as short as a few minutes.
>>>>>>>
>>>>>>> Traditional daily build allows user to analyze yesterday's data. If
>>>>>>> increase the frequency to hourly, then user can analyze last hour's
>>>>>>> data. Further down the line, how about incremental build every 5
>>>>>>> minutes from a streaming source? Then user can analyze data 5 minutes
>>>>>>> ago. That's Streaming OLAP!
>>>>>>>
>>>>>>> On Mon, Sep 14, 2015 at 12:43 AM, Henry Saputra <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Luke,
>>>>>>>>
>>>>>>>> Could you clarify again what is the streaming OLAP means here?
>>>>>>>>
>>>>>>>> By definition OLAP work with historical data.
>>>>>>>>
>>>>>>>> Maybe I missed it but was there any discussions or proposed design
>>>>>>>> for it?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> - Henry
>>>>>>>>
>>>>>>>> On Monday, August 3, 2015, Luke Han <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Siddharth,
>>>>>>>>> Kylin's next majority release (0.8.x) will support Streaming
>>>>>>>>> OLAP which will coming in Q4 since it still under development now,
>>>>>>>>> as Hongbin mentioned above.
>>>>>>>>> Could you please drop me a mail about your case? I would like
>>>>>>>>> to better understand your scenario to well manage coming features?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> Best Regards!
>>>>>>>>> ---------------------
>>>>>>>>>
>>>>>>>>> Luke Han
>>>>>>>>>
>>>>>>>>> On Wed, Jul 29, 2015 at 2:08 PM, hongbin ma <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> For current 0.7 releases, you cannot.
>>>>>>>>>>
>>>>>>>>>> Real time data processing and querying will be added in 0.8
>>>>>>>>>> release. It is still under development and testing. We have
>>>>>>>>>> achieved good progress on it, please wait for announcements.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 29, 2015 at 2:02 PM, Siddharth Ubale <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi ,
>>>>>>>>>>>
>>>>>>>>>>> I would like to ask whether Kylin can be used as a real time
>>>>>>>>>>> querying system?
>>>>>>>>>>> The process of building a cube , makes it look like a batch
>>>>>>>>>>> process after which the queries are with low latency.. however can
>>>>>>>>>>> We get a real time idea of what the OLAP system's state is at
>>>>>>>>>>> the query instance?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Siddharth
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> *Bin Mahone | 马洪宾*
>>>>>>>>>> Apache Kylin: http://kylin.io
>>>>>>>>>> Github: https://github.com/binmahone
>
