Dear Buddies, I need to recalculate the entries in an HBase table every day, for example x = 0.9x each day, so that the age of an entry affects its value.
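To make the update concrete, here is a minimal sketch of the decay I have in mind. The class and method names are only for illustration and are not part of my actual job; the factor 0.9 is the one from the example above.

```java
// Illustrative sketch only: DailyDecay and its methods are hypothetical names,
// not part of HBase or of the real MapReduce job.
class DailyDecay {
    // Decay factor from the example x = 0.9x.
    static final double DECAY_FACTOR = 0.9;

    // One daily step: x becomes 0.9 * x.
    static double decayOnce(double x) {
        return DECAY_FACTOR * x;
    }

    // Closed form after the given number of daily steps: x * 0.9^days.
    static double decay(double x, int days) {
        return x * Math.pow(DECAY_FACTOR, days);
    }
}
```

Note that applying the step twice to the same entry on one day gives 0.81x rather than 0.9x, so the update is not safe to repeat blindly.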
So I wrote a TableMapper that reads each entry, recalculates the value, and uses Context.write(key, put) to emit the update. An IdentityTableReducer then writes the Puts directly back to HBase. To finish the job in a short time, I use the HRegionPartitioner and increased the number of reducers to 50.

But I have two doubts here:

1. It looks like the partitioner causes a lot of shuffling. I am wondering why the job couldn't just do the Put on the local region, since the read and the write of the same entry should be on the same region, shouldn't they?
2. If the job fails for any reason (such as a timeout), HBase might be left in a partially updated state, mightn't it?

Is there any suggestion for how I could avoid these two problems?

Thanks.

Best wishes,
Stanley Xu
