Hi all,
    Right now I am aggregating our log data and populating tables based on how 
we want to query the data later. Currently I have eleven aggregation tables, 
and the date is part of the row key in each. Since we usually slice our data by 
day, I was wondering if it would be better to create the aggregation tables per 
day instead. I would no longer have to include the date in the start/stop row 
keys of a scan, and it would be easier to prune old data. I would also guess 
there would be less contention on the tables between the process that populates 
them and the processes that query them. The main problem I can see, with my 
limited knowledge of HBase, is that the tables would end up being rather small 
and would most likely each end up on one region server.
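
To make the two layouts concrete, here is a minimal sketch of what each one asks of the reader side. It assumes a hypothetical row key layout of `YYYYMMDD|<rest>` and a hypothetical table naming scheme of `<base>_YYYYMMDD`; neither is our actual schema, and the helper names are made up for illustration. It only builds the keys and names and does not talk to HBase.

```python
from datetime import date, timedelta
from typing import Tuple


def day_scan_range(day: date) -> Tuple[bytes, bytes]:
    """Current layout: one table, date as the row key prefix.

    Returns (start_row, stop_row) covering every key that begins with `day`,
    assuming keys look like b"YYYYMMDD|<rest>" (an assumed layout).
    The stop row is exclusive, as it is in an HBase scan.
    """
    start = day.strftime("%Y%m%d").encode("ascii")
    stop = (day + timedelta(days=1)).strftime("%Y%m%d").encode("ascii")
    return start, stop


def per_day_table_name(base: str, day: date) -> str:
    """Proposed layout: one table per day, so a day's data is a full-table scan.

    The "<base>_YYYYMMDD" naming scheme is an assumption for illustration.
    """
    return f"{base}_{day.strftime('%Y%m%d')}"


if __name__ == "__main__":
    d = date(2010, 12, 31)
    print(day_scan_range(d))
    print(per_day_table_name("agg_by_user", d))
```

With the first helper, the date has to be built into every scan range; with the second, the date only picks the table, and pruning a day means dropping that table.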
        Long story short, is this a good idea?

Thanks

-Pete
