I'm starting a new project, which is pretty simple.
It will be something like Google Analytics, but of course a bit smaller.
What is required: web servers handle requests that each carry a generic 
key/value list.
Those requests will come in at a fairly high rate, let's say 1000 requests per second.
So far, I guess there will be no problem handling that and storing it in 
HBase, right?

On the other hand, of course, the data must be processed and monitored.
That has to be time based, i.e. I want to get statistics about a time 
period, let's say from day A to day B.
That should work, BUT!
If I want to have a fast scan, I need to have the timestamp in the row key, 
right? Otherwise I will need to do a full scan, which can take a lot of 
time if there is much data.
But if I have the timestamp in the key, I will end up with hot regions, as 
described here: 
http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
So what would be a better way to have fast scans without hot regions?
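
To make the question more concrete: one idea I have seen mentioned for this kind of problem is to put a small bucket number in front of the timestamp ("salting"), so that writes spread over several key ranges, and a time-range read becomes one bounded scan per bucket. Below is a minimal sketch of just the key layout in plain Python. It does not call HBase at all, and everything in it is my assumption: the bucket count, the `source_id` field, and the function names are made up for illustration, and I don't know whether this is the recommended approach.

```python
import struct
import zlib

# Hypothetical bucket count; presumably chosen in relation to the
# number of regions or region servers.
NUM_BUCKETS = 16


def bucket_for(source_id: str) -> int:
    """Derive a stable bucket from a non-time field (assumed: a site/source id)."""
    return zlib.crc32(source_id.encode("utf-8")) % NUM_BUCKETS


def make_row_key(source_id: str, ts_millis: int) -> bytes:
    """Row key layout: 1-byte bucket, 8-byte big-endian timestamp, then the id.

    Big-endian encoding keeps byte-wise (lexicographic) order equal to
    numeric order for non-negative timestamps, so keys inside one bucket
    stay sorted by time.
    """
    prefix = struct.pack(">BQ", bucket_for(source_id), ts_millis)
    return prefix + source_id.encode("utf-8")


def scan_ranges(start_millis: int, stop_millis: int):
    """Return one (start_key, stop_key) pair per bucket for [start, stop).

    Each pair would bound one range scan; the results of all buckets
    would then have to be merged by the client.
    """
    return [
        (struct.pack(">BQ", b, start_millis), struct.pack(">BQ", b, stop_millis))
        for b in range(NUM_BUCKETS)
    ]
```

The trade-off, as far as I understand it, is that a query from day A to day B needs `NUM_BUCKETS` range scans instead of one, but none of them is a full scan, and new rows no longer all land at the end of a single key range.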

cheers
andre
