Re: update hbase data realtime and query it

Li Yang Mon, 04 Apr 2016 06:03:47 -0700

Realtime support is on Kylin's roadmap. We can collaborate on this if you
are interested.


The idea is simple. Say Kylin today can run 5-minute micro-batches; then we
only need a realtime storage that holds the latest data, at most 5 minutes'
worth (whatever arrives after the last batch). A query will hit both the
realtime storage and the cube storage. In your attempt the realtime storage
is still HBase, but I would prefer it to be a new, separate table. The
realtime storage needs to expose an interface for queries, which is
straightforward given we've done it once.
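
To make the idea concrete, here is a rough sketch in Python of how a query
could combine the two stores. It is not Kylin code: the function name
query_hybrid, the row layout (plain dicts with a "ts" field), and the
last_batch_end cutoff are all assumptions made up for illustration. Cube
rows are treated as already aggregated up to the last batch; realtime rows
are counted only if they arrived after that point, so nothing is counted
twice.

```python
from collections import defaultdict


def query_hybrid(cube_rows, realtime_rows, last_batch_end, group_key, measure):
    """Sum `measure` per `group_key` across cube and realtime storage.

    cube_rows:      pre-aggregated rows covering data before last_batch_end
    realtime_rows:  raw rows carrying a "ts" timestamp (hypothetical layout)
    last_batch_end: timestamp at which the last micro-batch ended
    """
    totals = defaultdict(int)

    # Everything the last batch build already covered comes from the cube.
    for row in cube_rows:
        totals[row[group_key]] += row[measure]

    # Only rows newer than the last batch are taken from realtime storage,
    # so data present in both stores is not double counted.
    for row in realtime_rows:
        if row["ts"] >= last_batch_end:
            totals[row[group_key]] += row[measure]

    return dict(totals)
```

Once the next batch completes, last_batch_end moves forward and the older
realtime rows can be dropped, which keeps the realtime table small.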


On Thu, Mar 31, 2016 at 11:54 AM, zeLiu <[email protected]> wrote:

> thanks hongbin,
>
> It is true that, as you say, the data must be pre-aggregated, otherwise
> rows with the same key will overwrite each other.
>
> I simply followed Kylin's MapReduce code to add data to HBase in real
> time, and I agree it is not a very good idea.
>
> The reason for doing this is that we maintain two products, one real-time
> and one offline.
> They share the same front-end UI. In the past we wrote real-time data into
> MySQL and stored offline data in Kylin, which meant developing separate
> interfaces for MySQL.
> If both real-time and offline data are written to Kylin, we only need to
> develop one interface for the UI.
>
>
> I ran a test: with the same data, the delay is much smaller than with
> MySQL. The average delay is about 6 ms.
> I didn't use the dictionary, because I'm worried that if a new value is not
> found in the dictionary, it will affect the accuracy of the data.
>
> Is there a better solution?
>
> the plug-in code: https://github.com/zeliu/kylin-storm-plugin
>
> thanks
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/update-hbase-data-realtime-and-query-it-tp3959p4019.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
>
