Hi,

You could take a look at *OpenTSDB*. I think they are addressing some of the issues that you mention here.
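
For illustration only, here is a rough sketch of one common way around the hotspot trade-off described in the quoted message below: salting (bucketing) the row key so that the timestamp is no longer the very first byte. This is not taken from OpenTSDB's code. The key layout, `NUM_BUCKETS`, `make_row_key`, and `scan_ranges` are all hypothetical names chosen for this example, and it only builds key bytes; it does not talk to HBase.

```python
import struct
import zlib

# Hypothetical value: roughly the number of region servers you want writes spread over.
NUM_BUCKETS = 16


def make_row_key(source: str, ts_millis: int, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Build a row key laid out as [1-byte bucket][8-byte timestamp][source].

    The bucket is derived from the log source, so rows written at the same
    moment by different sources land in different key ranges.
    """
    raw_source = source.encode("utf-8")
    bucket = zlib.crc32(raw_source) % num_buckets
    # Big-endian packing keeps byte order equal to numeric order.
    return struct.pack(">BQ", bucket, ts_millis) + raw_source


def scan_ranges(start_millis: int, stop_millis: int, num_buckets: int = NUM_BUCKETS):
    """Return one (start_key, stop_key) pair per bucket for a time range.

    The stop key is exclusive, matching the usual [start, stop) scan convention.
    """
    return [
        (struct.pack(">BQ", b, start_millis), struct.pack(">BQ", b, stop_millis))
        for b in range(num_buckets)
    ]
```

With a layout like this, a time-range query becomes one bounded scan per bucket instead of a full table scan, at the cost of issuing `NUM_BUCKETS` scans per query. How those scans are wired into a MapReduce job depends on the HBase version and is not shown here.
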
Thanks

On Tue, May 15, 2012 at 10:09 AM, Leon Mergen <[email protected]> wrote:

> Hello all,
>
> We are currently orienting on HBase as a possible way to store our log data
> in a structured way, and I want to verify a few things I was not able to
> find online. Specifically, what we are trying to achieve:
>
> * be able to quickly search for logs within a specific time range;
> * limit the amount of maps in our mapreduce jobs to only those areas we're
> interested in.
>
> As I understand it, there is a tradeoff:
>
> * if you use a timestamp as a split key, be prepared for a tradeoff: a
> single region server can become a hotspot. This is bad when writing data at
> a high load;
> * if we do not have the timestamp as the first key of the splitkeys, a
> MapReduce job will have to do a TableScan and have a huge amount of maps.
>
> Is there a known solution / workaround for this problem that people have
> used? Since our timespan queries are usually limited based on days, we were
> considering adding a new table for each day, but that looked like a bit of
> an ugly hack.
>
> Any ideas / suggestions about this ?
>
> Regards,
>
> Leon Mergen
>

--
Thanks & Regards
Himanish
