HBASE-13090 'Progress heartbeats for long running scanners' solves the problem you faced.
It is in the 1.1.0 release.

FYI

On Sat, Jun 6, 2015 at 12:54 PM, Arun Mishra <[email protected]> wrote:

> Hello,
>
> I have a query on OutOfOrderScannerNextException. I am using hbase 0.98.6
> with 45 nodes.
>
> I have a mapreduce job which scan 1 table for last 1 day worth data using
> timerange. It has been running fine for months without any failure. But
> last couple of days it has been failing with below exception. I have traced
> the failure to a single region. This region has 1 store and 1 hfile of
> 5+GB. What we realized was that, we were writing some bulk data, which used
> to land on this region. After we stopped writing this data, this region has
> been receiving very few writes per day.
>
> When mapreduce job runs, it creates a map task for this region and that
> task fails with OutOfOrderScannerNextException. I was able to reproduce
> this error by running a scan command with same start/stop row and timerange
> option. Finally, we split this region to be small enough for scan command
> to work.
>
> My query is if there is any option, apart from increasing the timeout,
> which can solve this use case? I am thinking of a use case where data comes
> in for 3 days a week in bulk and then nothing for next 3 days. Kind of
> creating a data hole in region.
> My understanding is that I am hit with this error because I have big store
> files and timerange scan is reading entire file even though it contains
> very few rowkeys for that timerange.
>
> hbase.client.scanner.caching = 100
> hbase.client.scanner.timeout.period = 60s
>
> scan 'dummytable',{ STARTROW=>'dummyrowkey-start',
> STOPROW=>'dummyrowkey-end', LIMIT=>1000,
> TIMERANGE=>[1433462400000,1433548800000]}
> ROW COLUMN+CELL
>
> ERROR: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
> Expected nextCallSeq: 1 But the nextCallSeq got from client: 0;
> request=scanner_id: 33648 number_of_rows: 100 close_scanner: false
> next_call_seq: 0
> at
> org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)
> at
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
> at
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
> at java.lang.Thread.run(Thread.java:745)
>
>
> Regards,
> Arun
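
P.S. For anyone reading the "Expected nextCallSeq: 1 But the nextCallSeq got from client: 0" message above, here is a rough, self-contained toy model of how a client-side timeout followed by a retry can produce it. This is only a sketch of the idea and not HBase's actual code: the class and method names (ScannerSeqDemo, ToyRegionServer, retryAfterClientTimeout) are made up for illustration, and the assumption is simply that the server tracks one expected sequence number per scanner and advances it once a call is accepted.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-scanner call sequencing; hypothetical, not HBase's real implementation. */
public class ScannerSeqDemo {

    /** Stand-in for OutOfOrderScannerNextException. */
    static class OutOfOrderException extends RuntimeException {
        OutOfOrderException(String msg) { super(msg); }
    }

    /** Hypothetical server that remembers the next expected call sequence per scanner. */
    static class ToyRegionServer {
        private final Map<Long, Long> expectedSeq = new HashMap<>();

        long openScanner(long scannerId) {
            expectedSeq.put(scannerId, 0L);
            return scannerId;
        }

        /** Handles one next() call; the expected sequence advances once the call is accepted. */
        String next(long scannerId, long nextCallSeq) {
            long expected = expectedSeq.get(scannerId);
            if (nextCallSeq != expected) {
                throw new OutOfOrderException("Expected nextCallSeq: " + expected
                        + " But the nextCallSeq got from client: " + nextCallSeq);
            }
            expectedSeq.put(scannerId, expected + 1);
            return "batch-" + nextCallSeq;
        }
    }

    /** Simulates a client whose first call timed out and is resent with the same sequence. */
    static String retryAfterClientTimeout() {
        ToyRegionServer server = new ToyRegionServer();
        long id = server.openScanner(33648L);
        // The server accepts the slow call, but the client gives up waiting for the result.
        server.next(id, 0);
        try {
            // The retry reuses nextCallSeq 0, while the server now expects 1.
            return server.next(id, 0);
        } catch (OutOfOrderException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(retryAfterClientTimeout());
    }
}
```

In this model, a scan that spends longer than the client timeout reading a large hfile with few matching rows ends in exactly this mismatch, which is the situation the heartbeats in HBASE-13090 are meant to avoid.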
