Re: Query on OutOfOrderScannerNextException

Arun Mishra Sat, 06 Jun 2015 16:03:07 -0700

Thanks Vladimir. I am using option 2 as a short-term fix for now. I will 
definitely look into the row key design.


Regards,
Arun.
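
A rough back-of-envelope sketch of why option 2 (smaller regions) works as a stopgap, assuming the worst case described in the quoted thread below: the timerange scan has to read the whole store file before it can return its first batch. The function names and the 50 MB/s read rate are hypothetical, chosen only for illustration; the real effective scan rate depends on the cluster.

```python
def min_scan_rate_mb_per_s(file_size_gb: float, timeout_s: float) -> float:
    """Lowest sustained read rate (MB/s) at which one full pass over a
    store file of the given size finishes before the scanner timeout."""
    return file_size_gb * 1024 / timeout_s


def max_file_size_gb(scan_rate_mb_per_s: float, timeout_s: float) -> float:
    """Largest store file (GB) that can be read end to end within the
    timeout at the given sustained read rate."""
    return scan_rate_mb_per_s * timeout_s / 1024


# 5 GB hfile and 60 s timeout, as reported in the thread.
print(round(min_scan_rate_mb_per_s(5, 60), 1))  # 85.3

# Hypothetical 50 MB/s effective scan rate (an assumption, not a measurement).
print(round(max_file_size_gb(50, 60), 2))       # 2.93
```

Under these assumptions, either the timeout has to grow or the file has to shrink below the second figure, which is what splitting the region achieves.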

> On Jun 6, 2015, at 3:18 PM, Vladimir Rodionov <[email protected]> wrote:
> 
> The scanner fails at the very beginning. The reason is that the scan needs
> very few rows from a large file, and HBase needs to fill the RPC buffer
> (which is 100 rows, yes?) before it can return the first batch. This takes
> more than 60 sec, so the scanner fails (do not ask me why it's not the
> timeout exception).
> 
> 1. HBASE-13090 will help (can be back ported I presume to 1.0 and 0.98.x)
> 2. Smaller region size will help
> 3. Smaller hbase.client.scanner.caching will help
> 4. Larger hbase.client.scanner.timeout.period will help
> 5. Better data store design (rowkeys) is preferred.
> 
> Too many options to choose from.
> 
> -Vlad
> 
> 
>> On Sat, Jun 6, 2015 at 3:04 PM, Arun Mishra <[email protected]> wrote:
>> 
>> Thanks Ted.
>> 
>> Regards,
>> Arun.
>> 
>>> On Jun 6, 2015, at 2:34 PM, Ted Yu <[email protected]> wrote:
>>> 
>>> HBASE-13090 'Progress heartbeats for long running scanners' solves the
>>> problem you faced.
>>> 
>>> It is in the 1.1.0 release.
>>> 
>>> FYI
>>> 
>>>> On Sat, Jun 6, 2015 at 12:54 PM, Arun Mishra <[email protected]> wrote:
>>>> 
>>>> Hello,
>>>> 
>>>> I have a query on OutOfOrderScannerNextException. I am using HBase 0.98.6
>>>> with 45 nodes.
>>>> 
>>>> I have a mapreduce job which scans one table for the last day's worth of
>>>> data using a timerange. It has been running fine for months without any
>>>> failure, but for the last couple of days it has been failing with the
>>>> exception below. I have traced the failure to a single region. This region
>>>> has 1 store and 1 hfile of 5+GB. What we realized was that we had been
>>>> writing some bulk data which used to land on this region. After we stopped
>>>> writing this data, this region has been receiving very few writes per day.
>>>> 
>>>> When the mapreduce job runs, it creates a map task for this region and that
>>>> task fails with OutOfOrderScannerNextException. I was able to reproduce
>>>> this error by running a scan command with the same start/stop row and
>>>> timerange option. Finally, we split this region to be small enough for the
>>>> scan command to work.
>>>> 
>>>> My query is whether there is any option, apart from increasing the timeout,
>>>> which can solve this use case. I am thinking of a use case where data comes
>>>> in bulk for 3 days a week and then nothing for the next 3 days, kind of
>>>> creating a data hole in the region.
>>>> My understanding is that I am hit with this error because I have big store
>>>> files and the timerange scan is reading the entire file even though it
>>>> contains very few rowkeys for that timerange.
>>>> 
>>>> hbase.client.scanner.caching = 100
>>>> hbase.client.scanner.timeout.period = 60s
>>>> 
>>>> scan 'dummytable',{ STARTROW=>'dummyrowkey-start',
>>>> STOPROW=>'dummyrowkey-end', LIMIT=>1000,
>>>> TIMERANGE=>[1433462400000,1433548800000]}
>>>> ROW                                           COLUMN+CELL
>>>> 
>>>> ERROR: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
>>>> Expected nextCallSeq: 1 But the nextCallSeq got from client: 0;
>>>> request=scanner_id: 33648 number_of_rows: 100 close_scanner: false
>>>> next_call_seq: 0
>>>> at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)
>>>> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
>>>> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
>>>> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>>>> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>>>> at java.lang.Thread.run(Thread.java:745)
>>>> 
>>>> 
>>>> Regards,
>>>> Arun
>>
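
A likely explanation for why a slow scan shows up as OutOfOrderScannerNextException rather than a timeout, sketched as a toy model. It assumes the region server advances its expected call sequence as soon as it accepts a next() call, and that the client, after giving up on a slow call, retries with the sequence number it already sent. ToyRegionServer and client_next are hypothetical names for illustration only; this is not HBase code, and the real client and server logic is more involved.

```python
class OutOfOrderScannerNextException(Exception):
    """Stand-in for the HBase exception of the same name."""


class ToyRegionServer:
    """Tracks the call sequence number expected next for a single scanner."""

    def __init__(self):
        self.next_call_seq = 0

    def scan(self, client_seq: int, work_seconds: float) -> float:
        if client_seq != self.next_call_seq:
            raise OutOfOrderScannerNextException(
                f"Expected nextCallSeq: {self.next_call_seq} "
                f"But the nextCallSeq got from client: {client_seq}")
        # Assumption: the sequence advances when the call is accepted,
        # before the (possibly slow) scan work has produced a response.
        self.next_call_seq += 1
        return work_seconds  # simulated time the server spent on the call


def client_next(server: ToyRegionServer, timeout_s: float, work_seconds: float) -> str:
    seq = 0
    elapsed = server.scan(seq, work_seconds)
    if elapsed > timeout_s:
        # The client stopped waiting and never saw the response,
        # so it retries with the same sequence number.
        server.scan(seq, work_seconds)
    return "ok"


try:
    client_next(ToyRegionServer(), timeout_s=60, work_seconds=90)
except OutOfOrderScannerNextException as e:
    print(e)  # Expected nextCallSeq: 1 But the nextCallSeq got from client: 0
```

With work_seconds below the timeout the call returns "ok". Only the slow first call produces the mismatch, which is consistent with the scanner failing at the very beginning.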
