[jira] Commented: (HBASE-3162) Add TimeRange support into Increment to optimize for counters that are partitioned on time

HBase Review Board (JIRA) Sun, 31 Oct 2010 14:06:48 -0700

    [ https://issues.apache.org/jira/browse/HBASE-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926779#action_12926779 ]

HBase Review Board commented on HBASE-3162:
-------------------------------------------

Message from: "Jonathan Gray" <[email protected]>

bq.  On 2010-10-31 13:32:44, khemani wrote:
bq.  > The timestamp that we put in the column qualifier to create hourly 
counters need not be in sync with the KV timestamp. When the log stream falls 
behind, we might be updating counters that are a couple of hours old, so the 
time range that we provide has to be determined dynamically, based on the 
current log-stream delay.
bq.  > 
bq.  > 
bq.  > This will really work well if along with hourly counters we also have 
hourly store files. If everything gets compacted into a single store file then 
this change doesn't help much.
bq.  >

Yeah, if your increments aren't all applied at the time their stamps 
represent, you'll need to adjust the TimeRange.

Something like:  [min,max) -> [minStampInPartition,Long.MAX_VALUE) where 
minStampInPartition is the lowest possible timestamp for the time bucket you 
are incrementing.
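
As a rough illustration of the range above, here is a minimal sketch for 
hourly buckets. The class and method names are hypothetical and are not taken 
from the patch; it assumes millisecond timestamps and that the log stream 
never runs ahead of the bucket it is updating.

```java
import java.util.concurrent.TimeUnit;

public class HourlyBucketRange {
    static final long HOUR_MS = TimeUnit.HOURS.toMillis(1);

    /** Lowest timestamp (ms since epoch) in the hourly bucket that contains eventTimeMs. */
    public static long minStampInPartition(long eventTimeMs) {
        return Math.floorDiv(eventTimeMs, HOUR_MS) * HOUR_MS;
    }

    /**
     * Half-open range [min, max) for incrementing the bucket of eventTimeMs.
     * The upper bound stays open-ended so that late increments, whose KV
     * timestamps are newer than the bucket itself, are still covered.
     */
    public static long[] timeRangeFor(long eventTimeMs) {
        return new long[] { minStampInPartition(eventTimeMs), Long.MAX_VALUE };
    }

    public static void main(String[] args) {
        long eventTimeMs = 1288559208000L; // an arbitrary example event time
        long[] range = timeRangeFor(eventTimeMs);
        // With the HBase client, these two values would be handed to the
        // Increment (assumption: via its setTimeRange(min, max) method).
        System.out.println("[" + range[0] + ", " + range[1] + ")");
    }
}
```

The open upper bound is what makes this tolerant of a delayed log stream: 
only files whose newest cell is older than the bucket start can be skipped.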

As we begin to accumulate a large amount of historical data, it will be 
important that our compaction policy eventually just archives old data and 
stops including it in further compactions.  This TimeRange functionality will 
then ensure that the old files don't impact performance on new data.

- Jonathan

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/1132/#review1731
-----------------------------------------------------------

> Add TimeRange support into Increment to optimize for counters that are 
> partitioned on time
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3162
>                 URL: https://issues.apache.org/jira/browse/HBASE-3162
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>    Affects Versions: 0.90.0
>            Reporter: Jonathan Gray
>            Assignee: Jonathan Gray
>            Priority: Minor
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3162-v1.patch
>
>
> In many use cases of increments, a given counter is only incremented during a 
> specific window of time (i.e. the counters are partitioned/sharded by time).
> With this kind of schema, you are constantly creating new counters.  When a 
> new counter is "created" (incremented for the first time), you will always 
> end up looking at a block from every file in the region, because no previous 
> value exists.  However, the new TimeRange optimizations skip files that 
> contain no values in the TimeRange you're interested in, and we could use 
> that information to optimize the Get within the increment.
> This would be optional and an addition to the Increment class.
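
The file-skipping idea in the description can be sketched as a simple 
interval-overlap check. This is only an illustration under assumptions: 
FileStamps is a hypothetical stand-in for per-file timestamp metadata, not 
the region server's actual store file classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeRangeFileSkip {
    /** Hypothetical per-file metadata: oldest and newest cell timestamp, both inclusive. */
    public static final class FileStamps {
        final String name;
        final long minStamp;
        final long maxStamp;

        public FileStamps(String name, long minStamp, long maxStamp) {
            this.name = name;
            this.minStamp = minStamp;
            this.maxStamp = maxStamp;
        }
    }

    /** True if the file may hold a cell with a timestamp in [rangeMin, rangeMax). */
    public static boolean mayContain(FileStamps f, long rangeMin, long rangeMax) {
        return f.maxStamp >= rangeMin && f.minStamp < rangeMax;
    }

    /** Names of the files that still have to be read for the given range. */
    public static List<String> filesToRead(List<FileStamps> files, long rangeMin, long rangeMax) {
        List<String> result = new ArrayList<>();
        for (FileStamps f : files) {
            if (mayContain(f, rangeMin, rangeMax)) {
                result.add(f.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileStamps> files = Arrays.asList(
            new FileStamps("old", 0L, 99L),
            new FileStamps("mid", 100L, 199L),
            new FileStamps("new", 200L, 299L));
        // A brand-new counter whose bucket starts at 200 only needs the newest file.
        System.out.println(filesToRead(files, 200L, Long.MAX_VALUE));
    }
}
```

Note that if all three files were compacted into one spanning 0 to 299, the 
check would pass for every range and nothing would be skipped, which is the 
concern raised in the review comment about hourly store files.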

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
