Re: hbase schema design

Silvio Di gregorio Tue, 10 Dec 2013 21:39:48 -0800

Hi
This is typical time-series data. You must prefix the row key to avoid
sending the whole workload to a single region server, for example:
<some non-monotonic value>_timestamp
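
A minimal sketch of such a prefixed ("salted") row key, in plain Python so
it runs without an HBase client. It is only an illustration under stated
assumptions: the bucket count, the key layout <bucket>_<timestamp>_<asset_key>
and the helper names salted_row_key and scan_ranges are hypothetical and do
not come from the original question. The bucket is derived from a hash of
asset_key, so it does not grow with time, and consecutive updates spread
across regions.

```python
import hashlib
import struct

# Assumption: number of salt buckets, chosen near the number of region servers.
NUM_BUCKETS = 16


def salted_row_key(asset_key: str, update_ts_ms: int) -> bytes:
    """Build the bytes for <bucket>_<timestamp>_<asset_key> (hypothetical layout)."""
    # A stable hash of asset_key gives a prefix that is not monotonic in time.
    digest = hashlib.md5(asset_key.encode("utf-8")).digest()
    bucket = digest[0] % NUM_BUCKETS
    # Fixed-width big-endian fields keep rows sorted by time inside a bucket.
    return (
        struct.pack(">B", bucket)
        + b"_"
        + struct.pack(">Q", update_ts_ms)
        + b"_"
        + asset_key.encode("utf-8")
    )


def scan_ranges(start_ts_ms: int, end_ts_ms: int) -> list:
    """Return one (start_row, stop_row) pair per bucket for a time-range read."""
    ranges = []
    for bucket in range(NUM_BUCKETS):
        prefix = struct.pack(">B", bucket) + b"_"
        ranges.append(
            (
                prefix + struct.pack(">Q", start_ts_ms),
                prefix + struct.pack(">Q", end_ts_ms),
            )
        )
    return ranges
```

The cost of this layout is that a "latest updates" read becomes one short
scan per bucket (NUM_BUCKETS scans, which can run in parallel) instead of a
single scan, and a lookup by asset_key alone still needs its own access
path, because the timestamp sits in front of the asset key.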
On 11 Dec 2013 00:35, "Steven Wu" <[email protected]> wrote:


> Hi
>
> I am very new to HBase, still learning on my own and doing a POC for our
> current project. I have a question about row key design.
>
> I have created a big table (called the asset table); it has more than 50M
> records. Each asset has a unique key (let's call it asset_key).
>
> This table receives continuous updates from an upstream system (around
> 100 updates per minute). The clients would like to receive real-time
> updates from us. In the current system, we have two indexed columns
> (asset_key, update_ts) on the asset DB table, so the clients can query the
> DB table by update_ts for the latest updates. However, the DB has now
> become a bottleneck.
>
> So we are wondering how we could achieve the same function in HBase. I
> don't want to use a scan filter on the column, as it will trigger a full
> table scan (correct me if I am wrong on this).
>
>
>
> The best thing I could think of is to build the timestamp into the row
> key. However, we still have a requirement that clients can query data
> by the unique asset_key.
>
>
>
> Our use case is that the system has to support more than 1000 concurrent
> users querying the latest updates from this table at the lowest possible
> latency. Clients would also like to query by the unique asset_key to
> retrieve records from our system.
>
>
>
>
>
> I would really appreciate your thoughts on this.
>
> Regards,
> Steven
