If your data is large enough to warrant three tables, go for it. That would be the case when there is a very large number of entries for each user/type combination.
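To make the row key options in the quoted message below concrete, here is a minimal Ruby sketch of the two single-table layouts and of a bounded prefix range for "all searches by user foo". The helper names (`type_first_key`, `user_first_key`, `prefix_range`) and the sample events are hypothetical and only for illustration; they are not part of HBase. A sorted array stands in for HBase's lexicographic row ordering.

```ruby
# Hypothetical helpers for illustration only; not an HBase API.

# Option 1 from the thread: "#{type}/#{user}/#{time}"
def type_first_key(type, user, time)
  "#{type}/#{user}/#{time}"
end

# Option 2 from the thread: swap type and user so keys lead with the user.
def user_first_key(type, user, time)
  "#{user}/#{type}/#{time}"
end

# Start (inclusive) and stop (exclusive) rows bounding every key that
# begins with "#{prefix}/". "0" is the ASCII character right after "/",
# so "#{prefix}0" sorts just past all keys carrying that prefix.
def prefix_range(prefix)
  ["#{prefix}/", "#{prefix}0"]
end

# Made-up sample events: [type, user, epoch seconds]
events = [
  ["search", "foo", 1313950000],
  ["view",   "bar", 1313950001],
  ["search", "bar", 1313950002],
  ["click",  "foo", 1313950003],
  ["view",   "foo", 1313950004],
]

# Sorting mimics how rows are ordered by key within a table.
type_first = events.map { |t, u, ts| type_first_key(t, u, ts) }.sort
user_first = events.map { |t, u, ts| user_first_key(t, u, ts) }.sort

# "All searches by foo" under the type-first layout.
start_row, stop_row = prefix_range("search/foo")
foo_searches = type_first.select { |k| k >= start_row && k < stop_row }

# The same query under the user-first layout.
start_row_u, stop_row_u = prefix_range("foo/search")
foo_searches_user_first = user_first.select { |k| k >= start_row_u && k < stop_row_u }

puts type_first
puts user_first
```

With the type-first layout every key starts with one of only three values, while the user-first layout leads with as many distinct prefixes as there are users; both still answer the per-user, per-type query with a single contiguous range. In the HBase shell the two bounds would be passed as STARTROW and STOPROW; a scan given only STARTROW keeps reading past the last matching row.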
Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Sun, Aug 21, 2011 at 11:09 PM, Mark <[email protected]> wrote:

> Almost all use cases require type, i.e.:
>
> Retrieve all searches performed by user 'foo': scan "history", {STARTROW => "search/foo"}
> Retrieve all product views performed by user 'foo': scan "history", {STARTROW => "view/foo"}
>
> On 8/21/11 10:25 AM, Sonal Goyal wrote:
>
>> Hi Mark,
>>
>> When you say that your use case does not require searching across multiple
>> types, what do you mean? Do you have cases when you search with type?
>>
>> Best Regards,
>> Sonal
>> Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
>> Nube Technologies <http://www.nubetech.co>
>> <http://in.linkedin.com/in/sonalgoyal>
>>
>> On Sun, Aug 21, 2011 at 9:29 PM, Mark <[email protected]> wrote:
>>
>>> We are logging all user actions into HBase. These actions include
>>> searches, product views and clicks.
>>>
>>> We are currently storing them in one table with row keys like so:
>>> "#{type}/#{user}/#{time}", where type is either click, search or view,
>>> and user is the currently logged-in user. Obviously this method leads to
>>> region hot spotting, as the start of each key is fairly static. This got
>>> me thinking about alternative ways I could model this type of data, and
>>> I was hoping I could get some suggestions from the community.
>>>
>>> Which would be more advisable?
>>>
>>> 1) Keep the current "all logs go to one table" pattern described above.
>>> 2) Keep the current "all logs go to one table" pattern described above,
>>> but switch the type and user fields, which would lead to more randomized
>>> keys, thus reducing hot spots.
>>> 3) Create separate tables for each type of log we are saving, i.e. have
>>> a search table, a click table and a view table.
>>>
>>> Our use case does not require us to search across multiple types, so I'm
>>> leaning towards #3 now, but I was wondering if there are any cons to
>>> using this method. Is it worse to have more tables than fewer?
>>>
>>> Thanks for the help
>>>
>>> -M
