One option is to make the filter less aggressive so that some data is returned within the current timeout.
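
To make the trade-off concrete, here is a rough back-of-the-envelope model in plain Java. It is not HBase code and calls no HBase API. It assumes the region server replies to a scan RPC only once it has collected a full batch of matching rows, and that rows are examined at a steady rate. The class name, method names, the batch size of 100, and the 500,000 rows/second rate are all hypothetical. Only the 120 s timeout comes from your message.

```java
public class ScanBatchModel {

    // Expected number of rows the server examines to collect `batchRows`
    // matches when roughly one row in `oneInN` passes the filter.
    // Toy model: assumes matches are spread evenly through the data.
    public static long rowsExaminedPerRpc(long batchRows, long oneInN) {
        return Math.multiplyExact(batchRows, oneInN);
    }

    // Estimated wall-clock seconds before the server can answer one scan RPC,
    // given a (hypothetical) steady examination rate in rows per second.
    public static double secondsPerRpc(long batchRows, long oneInN, long rowsPerSecond) {
        return (double) rowsExaminedPerRpc(batchRows, oneInN) / rowsPerSecond;
    }

    // True if the estimated time per RPC stays within the RPC timeout.
    public static boolean fitsTimeout(long batchRows, long oneInN,
                                      long rowsPerSecond, long timeoutSeconds) {
        return secondsPerRpc(batchRows, oneInN, rowsPerSecond) <= timeoutSeconds;
    }

    public static void main(String[] args) {
        long timeoutSeconds = 120;    // hbase.rpc.timeout from the thread, in seconds
        long rowsPerSecond = 500_000; // hypothetical scan rate, measure your own
        long batchRows = 100;         // hypothetical rows requested per RPC

        // Compare a very selective filter with a less aggressive one.
        for (long oneInN : new long[] {1_000_000L, 100_000L}) {
            System.out.printf("1 in %d rows pass: ~%.0f s per RPC, fits timeout: %b%n",
                    oneInN,
                    secondsPerRpc(batchRows, oneInN, rowsPerSecond),
                    fitsTimeout(batchRows, oneInN, rowsPerSecond, timeoutSeconds));
        }
    }
}
```

In this model, the time per RPC grows linearly with how selective the filter is, so letting more rows through shortens the wait for the first response at the cost of more data transferred. The real numbers depend on your data layout and cluster, so treat this only as a way to reason about the direction of the effect.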

Cheers

On Aug 15, 2013, at 1:31 AM, Ishan Chhabra <[email protected]> wrote:

> Hi,
>
> I have a mapreduce job that reads data from hbase. To minimize data
> transfer, I have implemented a filter that aggressively filters out data to
> be sent back. Now, I am running into a situation where the scanner doesn't
> send back anything for the rpc.timeout value, and the client times out,
> retries, and repeats. My tasks fail in the initialize phase itself because
> they get stuck in this loop for 10 minutes and then give up.
>
> I am currently running with hbase.rpc.timeout and
> hbase.regionserver.lease.period as 120s. I can increase this further, but
> want to understand the cons of doing that first.
>
> Also, is there any other way of getting around this?
>
> --
> *Ishan Chhabra* | Rocket Scientist | RocketFuel Inc.
