I do that now, but if they were in different tables I could thread that out with 
one thread per table. I'm just worried I'd lose the advantage of HBase and a 
distributed system if each table ends up on one region server. 

-Pete

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
Sent: Wednesday, March 09, 2011 3:14 PM
To: [email protected]
Subject: Re: Many smaller tables vs one large table

I guess it could be a good idea... do you need to be able to scan for
data that's contained in more than one day?

J-D

On Wed, Mar 9, 2011 at 2:08 PM, Peter Haidinyak <[email protected]> wrote:
> Hi all,
>    Right now I am aggregating our log data and populating tables based on how 
> we want to query the data later. Currently I have eleven different 
> aggregation tables and the date is part of the row key. Since we usually 
> slice our data by day, I was wondering if it would be better to create one 
> aggregation table per date. I would no longer have to use the date as part of 
> the start/stop row keys in a scan and it would be easier to prune old data. I 
> would also guess there would be less contention on tables between the process 
> that populates the table and the processes that query the table. One of the 
> only problems I see, with my limited knowledge of HBase, is that the tables 
> would end up being rather small and each would most likely end up on one 
> region server.
>        Long story short, is this a good idea?
>
> Thanks
>
> -Pete
>
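
For readers following the thread, here is a minimal sketch of the single-table approach described above, where the date is part of the row key and a one-day slice is expressed as a start/stop row pair. It is only an illustration: the key layout (<prefix><YYYYMMDD><rest>), the "hits_" prefix, and the helper names are assumptions, not Pete's actual schema. It relies on the stop row of an HBase scan being exclusive, and on zero-padded dates sorting lexicographically.

```python
from datetime import date, timedelta
from typing import Tuple


def day_scan_range(prefix: bytes, day: date) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering every row key for one day.

    Assumes a hypothetical key layout of <prefix><YYYYMMDD><rest>.
    The stop row is the next day's prefix and is treated as exclusive.
    """
    next_day = day + timedelta(days=1)
    start_row = prefix + day.strftime("%Y%m%d").encode("ascii")
    stop_row = prefix + next_day.strftime("%Y%m%d").encode("ascii")
    return start_row, stop_row


def in_range(row_key: bytes, start_row: bytes, stop_row: bytes) -> bool:
    """Mimic the scan bounds: start inclusive, stop exclusive, byte order."""
    return start_row <= row_key < stop_row


# Slice for March 9, 2011 in a hypothetical "hits_" aggregation.
start_row, stop_row = day_scan_range(b"hits_", date(2011, 3, 9))
```

The resulting pair would be passed as the start and stop rows of a scan, so the per-day slice costs only two key computations. Whether that is better than one table per day still depends on the questions raised in the thread, such as how old data is pruned and whether scans ever span more than one day.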
