I do that now, but if they were in different tables I could split that work out with one thread per table. I'm just worried I'd lose the advantage of HBase as a distributed system if each table ends up on a single region server.

-Pete

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jean-Daniel Cryans
Sent: Wednesday, March 09, 2011 3:14 PM
To: [email protected]
Subject: Re: Many smaller tables vs one large table

I guess it could be a good idea... do you need to be able to scan for data that's contained in more than one day?

J-D

On Wed, Mar 9, 2011 at 2:08 PM, Peter Haidinyak <[email protected]> wrote:
> Hi all,
> Right now I am aggregating our log data and populating tables based on how
> we want to query the data later. Currently I have eleven different
> aggregation tables, and the date is part of the row key. Since we usually
> slice our data by day, I was wondering if it would be better to create
> aggregation tables by date. I would no longer have to use the date as part of
> the start/stop row keys in a scan, and it would be easier to prune old data. I
> would also guess there would be less contention on tables between the process
> that populates the table and the processes that query the table. One of the
> only problems I see, with my limited knowledge of HBase, is that the tables
> will end up being rather small and would most likely each end up on one region
> server.
> Long story short, is this a good idea?
>
> Thanks
>
> -Pete
>
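
To make the two layouts in this thread concrete, here is a minimal sketch in plain Python (no HBase client involved). It assumes a hypothetical row key that begins with the day as `yyyymmdd`, and a hypothetical `<base>_<yyyymmdd>` naming scheme for per-day tables. Neither is taken from the original messages, and the helper names are made up for illustration. It relies only on the fact that an HBase scan treats the start row as inclusive and the stop row as exclusive.

```python
from datetime import date, timedelta
from typing import Tuple


def day_prefix(day: date) -> bytes:
    """Row-key prefix for one day (assumed layout: key starts with yyyymmdd)."""
    return day.strftime("%Y%m%d").encode("ascii")


def day_scan_bounds(day: date) -> Tuple[bytes, bytes]:
    """Start/stop rows covering exactly one day in the single large table.

    The stop row is the next day's prefix, which works because the stop row
    of a scan is exclusive.
    """
    return day_prefix(day), day_prefix(day + timedelta(days=1))


def range_scan_bounds(first_day: date, last_day: date) -> Tuple[bytes, bytes]:
    """Start/stop rows covering several consecutive days with one scan."""
    if last_day < first_day:
        raise ValueError("last_day must not be before first_day")
    return day_prefix(first_day), day_prefix(last_day + timedelta(days=1))


def daily_table_name(base: str, day: date) -> str:
    """Hypothetical table name for the table-per-day alternative."""
    return f"{base}_{day.strftime('%Y%m%d')}"


if __name__ == "__main__":
    print(day_scan_bounds(date(2011, 3, 9)))
    print(range_scan_bounds(date(2011, 3, 9), date(2011, 3, 11)))
    print(daily_table_name("agg_hits", date(2011, 3, 9)))
```

With the date-prefixed key, a multi-day query stays a single scan over one table. With per-day tables, the same query becomes one scan per table (which could be run one thread per table, as mentioned above), and pruning old data becomes dropping a table rather than deleting a key range.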
