[jira] [Commented] (HBASE-11295) Long running scan produces OutOfOrderScannerNextException

Anoop Sam John (JIRA) Thu, 08 Jan 2015 16:38:12 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270294#comment-14270294
 ]


Anoop Sam John commented on HBASE-11295:
----------------------------------------

bq.The problem is the client and server disagree about the request timing out, 
twice.
If we try a 3rd time, then also the client side will time out? The same data 
set at server and same filter...
With the above change in code, I fear we will get the issue in HBASE-5974 ie. 
some data will not get scanned. (Data miss)
For such a filter which will filter out most of the data, can we increase the 
client timeout?
After HBASE-5974  another jira added the restriction part that only one more 
time we retry after getting a OOScannerNextExp.  Do you feel we need try for 
some more times  [~apurtell]?

> Long running scan produces OutOfOrderScannerNextException
> ---------------------------------------------------------
>
>                 Key: HBASE-11295
>                 URL: https://issues.apache.org/jira/browse/HBASE-11295
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: Jeff Cunningham
>            Assignee: Andrew Purtell
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
>         Attachments: OutOfOrderScannerNextException.tar.gz
>
>
> Attached Files:
> HRegionServer.java - instramented from 0.96.1.1-cdh5.0.0
> HBaseLeaseTimeoutIT.java - reproducing JUnit 4 test
> WaitFilter.java - Scan filter (extends FilterBase) that overrides 
> filterRowKey() to sleep during invocation
> SpliceFilter.proto - Protobuf defintiion for WaitFilter.java
> OutOfOrderScann_InstramentedServer.log - instramented server log
> Steps.txt - this note
> Set up:
> In HBaseLeaseTimeoutIT, create a scan, set the given filter (which sleeps in 
> overridden filterRowKey() method) and set it on the scan, and scan the table.
> This is done in test client_0x0_server_150000x10().
> Here's what I'm seeing (see also attached log):
> A new request comes into server (ID 1940798815214593802 - 
> RpcServer.handler=96) and a RegionScanner is created for it, cached by ID, 
> immediately looked up again and cached RegionScannerHolder's nextCallSeq 
> incremeted (now at 1).
> The RegionScan thread goes to sleep in WaitFilter#filterRowKey().
> A short (variable) period later, another request comes into the server (ID 
> 8946109289649235722 - RpcServer.handler=98) and the same series of events 
> happen to this request.
> At this point both RegionScanner threads are sleeping in 
> WaitFilter.filterRowKey(). After another period, the client retries another 
> scan request which thinks its next_call_seq is 0.  However, HRegionServer's 
> cached RegionScannerHolder thinks the matching RegionScanner's nextCallSeq 
> should be 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-11295) Long running scan produces OutOfOrderScannerNextException

Reply via email to