Dear Buddies,

I need to recalculate the entries in an HBase table every day, e.g. set
x = 0.9x each day, so that the entry values decay over time.
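
To make the intended update concrete, here is a minimal sketch in plain Java
(no HBase involved). The class and method names are made up for illustration;
only the factor 0.9 comes from the description above.

```java
// Hypothetical helper, not part of HBase: models the daily decay x = 0.9x.
final class DailyDecay {
    static final double FACTOR = 0.9;

    // One daily pass of the job: multiply the stored value by 0.9.
    static double applyOnce(double x) {
        return x * FACTOR;
    }

    // Value after `days` consecutive daily passes: x * 0.9^days.
    static double applyDays(double x, int days) {
        return x * Math.pow(FACTOR, days);
    }
}
```

So a value that is never refreshed shrinks geometrically, by a factor of
0.9^n after n daily runs.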

So I wrote a TableMapper that reads each entry, recalculates the value, and
calls context.write(key, put) to emit the update. An IdentityTableReducer
then writes the Puts straight back to HBase. To make the job finish quickly,
I use the HRegionPartitioner and have increased the number of reducers to
50.

But I have two doubts here:
1. It looks like the partitioner causes a lot of shuffling. I am wondering
why the job couldn't just do the put on the local region, since the read and
the write of the same entry should hit the same region, shouldn't they?

2. If the job fails for any reason (e.g. a timeout), HBase might be left in
a partially updated state, right?
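
To illustrate what I mean by partially updated, here is a small plain-Java
model of the concern (no HBase involved; the class, method, and row names are
all made up). It assumes the job dies after updating some rows and is then
simply rerun over the whole table, so those rows get decayed twice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the failure scenario; an in-memory map stands in
// for the HBase table.
final class PartialUpdateDemo {
    static final double FACTOR = 0.9;

    // Applies x = 0.9x to rows in iteration order, stopping after `limit`
    // rows to mimic a job that dies partway through.
    static void decayFirst(Map<String, Double> table, int limit) {
        int done = 0;
        for (Map.Entry<String, Double> e : table.entrySet()) {
            if (done++ >= limit) {
                break;
            }
            e.setValue(e.getValue() * FACTOR);
        }
    }

    // Runs a failed first attempt followed by a blind rerun.
    static Map<String, Double> run() {
        Map<String, Double> table = new LinkedHashMap<>();
        table.put("row1", 100.0);
        table.put("row2", 100.0);
        decayFirst(table, 1);            // first attempt fails after row1
        decayFirst(table, table.size()); // rerun touches every row again
        return table;
    }
}
```

In this model row1 ends up at about 81 (decayed twice) while row2 ends up at
about 90 as intended, which is the kind of inconsistency I am worried about.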

Are there any suggestions for how I could avoid these two problems?


Thanks.

Best wishes,
Stanley Xu
