Jeff Cunningham created HBASE-11295:
---------------------------------------
Summary: Long running scan produces OutOfOrderScannerNextException
Key: HBASE-11295
URL: https://issues.apache.org/jira/browse/HBASE-11295
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.96.0
Reporter: Jeff Cunningham
Attachments: OutOfOrderScannerNextException.tar.gz
Attached Files:
HRegionServer.java - instrumented from 0.96.1.1-cdh5.0.0
HBaseLeaseTimeoutIT.java - reproducing JUnit 4 test
WaitFilter.java - Scan filter (extends FilterBase) that overrides
filterRowKey() to sleep during invocation
SpliceFilter.proto - Protobuf definition for WaitFilter.java
OutOfOrderScann_InstramentedServer.log - instrumented server log
Steps.txt - this note
Set up:
In HBaseLeaseTimeoutIT, create a scan, set the given filter on it (the filter
sleeps in its overridden filterRowKey() method), and scan the table.
This is done in test client_0x0_server_150000x10().
Here's what I'm seeing (see also attached log):
A new request comes into the server (ID 1940798815214593802 -
RpcServer.handler=96) and a RegionScanner is created for it, cached by ID,
immediately looked up again, and the cached RegionScannerHolder's nextCallSeq
is incremented (now at 1).
The RegionScanner thread goes to sleep in WaitFilter#filterRowKey().
A short (variable) period later, another request comes into the server (ID
8946109289649235722 - RpcServer.handler=98) and the same series of events
happens for this request.
At this point both RegionScanner threads are sleeping in
WaitFilter#filterRowKey(). After another period, the client retries with
another scan request that carries a next_call_seq of 0. However, HRegionServer's
cached RegionScannerHolder expects the matching RegionScanner's nextCallSeq
to be 1.
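
The mismatch described above can be modelled in a few lines of plain Java.
This is only a sketch under stated assumptions, not the HRegionServer code:
the class names ScannerSeqDemo, Server, ScannerHolder and OutOfOrderException
are hypothetical stand-ins, and the model assumes the server bumps its
expected sequence number before the slow filter work finishes, so a client
that times out and retries still sends the old value.

```java
import java.util.HashMap;
import java.util.Map;

public class ScannerSeqDemo {

    /** Hypothetical stand-in for OutOfOrderScannerNextException. */
    static class OutOfOrderException extends RuntimeException {
        OutOfOrderException(String message) {
            super(message);
        }
    }

    /** Simplified holder: tracks only the sequence number the server expects. */
    static class ScannerHolder {
        long nextCallSeq = 0L;
    }

    /** Toy server that caches scanner holders by ID (an assumption of this sketch). */
    static class Server {
        private final Map<Long, ScannerHolder> scanners = new HashMap<>();
        private long nextScannerId = 1L;

        long openScanner() {
            long id = nextScannerId++;
            scanners.put(id, new ScannerHolder());
            return id;
        }

        /**
         * Checks the client's sequence number against the cached holder and
         * increments the expected value. Returns the new expected value.
         */
        long next(long scannerId, long clientCallSeq) {
            ScannerHolder holder = scanners.get(scannerId);
            if (holder == null) {
                throw new IllegalArgumentException("unknown scanner " + scannerId);
            }
            if (clientCallSeq != holder.nextCallSeq) {
                throw new OutOfOrderException("server expects " + holder.nextCallSeq
                        + " but client sent " + clientCallSeq);
            }
            // Incremented up front; in the report the thread then sleeps in the filter.
            holder.nextCallSeq++;
            return holder.nextCallSeq;
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        long id = server.openScanner();

        // First call is accepted; the server now expects sequence 1.
        server.next(id, 0L);

        // The client never saw a response, so its retry still carries 0.
        try {
            server.next(id, 0L);
            System.out.println("retry accepted");
        } catch (OutOfOrderException e) {
            System.out.println("retry rejected: " + e.getMessage());
        }
    }
}
```

In this model the retry is rejected because the client and server disagree
about how many calls have completed, which matches the state described in the
log excerpt: client at 0, cached holder at 1.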