Hi there-

I just submitted a patch to the book here...

https://issues.apache.org/jira/browse/HBASE-4110

You can see the contents in the patch.
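
In the meantime, here is a rough sketch of the secondary-index idea applied to the date-range question quoted below. It is plain Python over a sorted list, not HBase client code. The index key layout <userId YYYYMMDD activityId>, the helper names, and the sample rows are all assumptions made up for illustration. It relies on two things: scans are start-inclusive and stop-exclusive over lexicographically sorted row keys, and a date only sorts chronologically when written year-month-day (with YYDDMM, 1 Feb "110102" sorts before 15 Jan "111501").

```python
from bisect import bisect_left


def index_key(user_id, day, activity_id):
    # Hypothetical index-table layout: <userId> <YYYYMMDD> <activityId>.
    # The date comes before the activity so one user's rows sort by day.
    return f"{user_id} {day} {activity_id}"


def scan(sorted_keys, start_row, stop_row):
    # Mimics a range scan: start row inclusive, stop row exclusive.
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]


def activities_between(sorted_keys, user_id, first_day, last_day):
    start_row = f"{user_id} {first_day}"
    # '~' sorts after the space separator, so every key for last_day
    # falls before the stop row and the next day falls after it.
    stop_row = f"{user_id} {last_day}~"
    return scan(sorted_keys, start_row, stop_row)


# Made-up sample rows, kept sorted the way a table keeps its row keys.
keys = sorted([
    index_key("u1", "20110630", "login"),
    index_key("u1", "20110701", "login"),
    index_key("u1", "20110710", "upload"),
    index_key("u1", "20110715", "login"),
    index_key("u1", "20110716", "login"),
    index_key("u10", "20110705", "login"),
    index_key("u2", "20110705", "login"),
])

result = activities_between(keys, "u1", "20110701", "20110715")
print(result)
```

With user ids spread across the key space, putting the date second should still avoid the hot-spotting you get from a date-first key, though that depends on how your user ids are distributed.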



On 7/15/11 3:48 PM, "large data" <[email protected]> wrote:

>Thanks, Doug!
>
>Writing to HBase would be driven by async events (rather than M/R jobs)
>fired on user activity, so higher 'put' throughput is not strictly a
>requirement, and neither is exceptional read performance. TTL would be
>around 6 months, so I don't envision scan data ranges > 6 months.
>
>Can you send along any leads?
>
>thanks
>
>On Fri, Jul 15, 2011 at 12:40 PM, Doug Meil
><[email protected]> wrote:
>
>>
>> Hi there-
>>
>> There was an almost identical question on this subject yesterday and it
>> comes up regularly.  A lot of this depends on how many users you have,
>> data ingest rate, and how dynamic your reports/queries need to be.
>>
>> One option is creating a table that acts as a secondary index; another
>> is creating a summary table of activity via an MR job.  These are common
>> options, but not the only ones.
>>
>> Much depends on your specific requirements, though.  There isn't a
>> one-size-fits-all answer.
>>
>>
>> I'll update the book with something on this topic.
>>
>>
>> Doug
>>
>>
>> On 7/15/11 2:30 PM, "large data" <[email protected]> wrote:
>>
>> >I'm designing a date range table where I track the userId, the activity,
>> >and the day the activity was performed.
>> >
>> >Key format is <userId activityId YYDDMM> (using space as separator), with
>> >the date as the last part of the key to avoid hot-spots.
>> >
>> >Now I can easily find the activities done by user 'X' using
>> >PrefixFilter.
>> >
>> >But how do I go about finding user activities between date ranges?
>> >
>> >thanks
>>
>>
