[
https://issues.apache.org/jira/browse/HBASE-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011203#comment-13011203
]
Sean Sechrist commented on HBASE-3686:
--------------------------------------
I did a little more testing and it turns out this problem isn't limited to the
misconfiguration.
You'll also lose rows if you kill -9 a region server in the middle of a scan. In
HTable.ClientScanner.next(), there's a skipFirst boolean that is supposed to
skip the first row, which was "already let out on a previous invocation". But
instead of just skipping that one row,
getConnection().getRegionServerWithRetries(callable) is called an extra time,
which skips [caching] rows.
So I think fixing it to skip only 1 row will also fix the problem when there's
a misconfiguration, so sending the timeout to the server won't be needed.
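To make the row loss concrete, here is a toy simulation of the behavior described above. It is not HBase code: the class ScanResumeSketch, the fetch() helper, and the integer "rows" are all hypothetical stand-ins, and the only assumption taken from the comment is that a restarted scanner returns the last delivered row again. One extra fetch of a full batch is compared with an extra fetch of a single row.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names exist in HBase.
class ScanResumeSketch {

    // Stand-in for one server round trip: up to `caching` rows from `startRow`.
    static List<Integer> fetch(int startRow, int totalRows, int caching) {
        List<Integer> batch = new ArrayList<>();
        for (int r = startRow; r < totalRows && batch.size() < caching; r++) {
            batch.add(r);
        }
        return batch;
    }

    // Scans rows 0..totalRows-1. The scanner is lost once, just before batch
    // number `failBeforeBatch`, and is reopened at the last delivered row.
    static List<Integer> scan(int totalRows, int caching,
                              int failBeforeBatch, boolean skipWholeBatch) {
        List<Integer> delivered = new ArrayList<>();
        int cursor = 0;   // next row the toy server would return
        int batchNo = 0;
        while (cursor < totalRows) {
            if (batchNo == failBeforeBatch && !delivered.isEmpty()) {
                // Reopen at the last delivered row, so it comes back again.
                cursor = delivered.get(delivered.size() - 1);
                // Extra fetch whose result is thrown away ("skipFirst").
                int discard = skipWholeBatch ? caching : 1;
                cursor += fetch(cursor, totalRows, discard).size();
            }
            List<Integer> batch = fetch(cursor, totalRows, caching);
            delivered.addAll(batch);
            cursor += batch.size();
            batchNo++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        // 10 rows, caching = 3, scanner lost before the second batch.
        System.out.println(scan(10, 3, 1, true));   // [0, 1, 2, 5, 6, 7, 8, 9]
        System.out.println(scan(10, 3, 1, false));  // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

In this model the discarded batch contains the duplicate row plus caching - 1 rows the caller never saw, which is why discarding exactly one row restores the full result.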
> Scanner timeout on RegionServer but Client won't know what happened
> -------------------------------------------------------------------
>
> Key: HBASE-3686
> URL: https://issues.apache.org/jira/browse/HBASE-3686
> Project: HBase
> Issue Type: Bug
> Components: client
> Affects Versions: 0.89.20100924
> Reporter: Sean Sechrist
> Priority: Minor
>
> This can cause rows to be lost from a scan.
> See this thread where the issue was brought up:
> http://search-hadoop.com/m/xITBQ136xGJ1
> If hbase.regionserver.lease.period is higher on the client than on the server,
> we can get this series of events:
> 1. Client is scanning along happily, and does something slow.
> 2. Scanner times out on the region server.
> 3. Client calls HTable.ClientScanner.next().
> 4. The region server throws an UnknownScannerException.
> 5. Client catches the exception and sees that the elapsed time is not longer
> than its hbase.regionserver.lease.period config, so it doesn't throw a
> ScannerTimeoutException. Instead, it treats it like an NSRE.
> Right now the workaround is to make sure the configs are consistent.
> A possible fix would be to use whatever the region server's scanner timeout
> is, rather than the local one.
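The sequence in the description can be sketched as a small decision function. This is a toy model, not the HBase client: LeaseMismatchSketch, Reaction, and both methods are hypothetical names, and the lease values (60 s on the server, 120 s on the client) are made-up numbers chosen only to reproduce the mismatch.

```java
// Toy model only; none of these names exist in HBase.
class LeaseMismatchSketch {

    enum Reaction { THROW_SCANNER_TIMEOUT, RESTART_SCANNER }

    // Server side: the scanner lease expires once the client has been idle
    // longer than the server's lease period.
    static boolean serverExpiredScanner(long idleMs, long serverLeaseMs) {
        return idleMs > serverLeaseMs;
    }

    // Client side, after an UnknownScannerException: report a timeout only if
    // the idle time exceeds the timeout value the client compares against.
    static Reaction clientReaction(long idleMs, long timeoutUsedByClientMs) {
        return idleMs > timeoutUsedByClientMs
                ? Reaction.THROW_SCANNER_TIMEOUT
                : Reaction.RESTART_SCANNER;
    }

    public static void main(String[] args) {
        long serverLeaseMs = 60_000;   // assumed server setting
        long clientLeaseMs = 120_000;  // assumed (higher) client setting
        long idleMs = 90_000;          // the client "does something slow"

        System.out.println(serverExpiredScanner(idleMs, serverLeaseMs)); // true
        // Comparing against the local config hides the timeout.
        System.out.println(clientReaction(idleMs, clientLeaseMs));       // RESTART_SCANNER
        // Comparing against the server's value surfaces it.
        System.out.println(clientReaction(idleMs, serverLeaseMs));       // THROW_SCANNER_TIMEOUT
    }
}
```

With consistent configs the two comparisons agree, which matches the workaround given in the description.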