[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ramkrishna.s.vasudevan reassigned HBASE-4462:
---------------------------------------------
Assignee: ramkrishna.s.vasudevan
> Properly treating SocketTimeoutException
> ----------------------------------------
>
> Key: HBASE-4462
> URL: https://issues.apache.org/jira/browse/HBASE-4462
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.90.4
> Reporter: Jean-Daniel Cryans
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.5
>
>
> SocketTimeoutException (STE) is currently treated like any other IOException
> inside HCM.getRegionServerWithRetries, and I think this is a problem. This
> method should only retry in cases where we are pretty sure the operation will
> complete, but with an STE we have already waited (by default) 60 seconds and
> nothing happened.
> I found this while debugging Douglas Campbell's problem on the mailing list.
> It looked like he was using the same scanner from multiple threads, but it
> was actually the same client retrying while the first call had not even
> finished yet (that's another problem). You could see the first scanner, then
> up to two other handlers waiting for it to finish before they could run
> (because of the synchronization on RegionScanner).
> So what should we do? We could treat STE as a DoNotRetryException and let the
> client deal with it, or we could retry only once.
> There's also the option of having different behavior for get/put/icv/scan.
> The issue with operations that modify a cell is that you don't know whether
> the operation completed or not (the same is true when a RS dies hard after
> completing, say, a Put but just before returning to the client).
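
The options discussed above can be sketched roughly as follows. This is only an illustration of the idea, not HBase code: the class TimeoutAwareRetry, the exception GiveUpException, and the parameters maxTimeoutRetries and mutatesData are all hypothetical names invented for this example. It assumes the caller knows whether the operation modifies data, gives socket timeouts a smaller retry budget than other IOExceptions, and never retries a timed-out mutation because its outcome is unknown.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Hypothetical sketch; none of these names exist in HBase.
final class TimeoutAwareRetry {

    /** Hypothetical signal that retrying stopped and the caller must decide. */
    static final class GiveUpException extends IOException {
        GiveUpException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs op, retrying ordinary IOExceptions up to maxRetries times.
     * Socket timeouts get their own, smaller budget (maxTimeoutRetries),
     * and are never retried when the operation mutates data.
     */
    static <T> T call(Callable<T> op, int maxRetries, int maxTimeoutRetries,
                      boolean mutatesData) throws IOException {
        int timeouts = 0;
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                timeouts++;
                // We already waited a full timeout; the server may still be
                // working on the first request, so be conservative.
                if (mutatesData || timeouts > maxTimeoutRetries) {
                    throw new GiveUpException(
                        "timed out after " + timeouts + " attempt(s); outcome unknown", ste);
                }
                last = ste;
            } catch (IOException ioe) {
                last = ioe; // other I/O failures keep the normal retry behavior
            } catch (Exception e) {
                throw new IOException(e); // anything else is not retried
            }
        }
        throw new GiveUpException("retries exhausted", last);
    }
}
```

With maxTimeoutRetries set to 0 this behaves like the "treat STE as do-not-retry" option, and with 1 it behaves like the "retry only once" option. A real implementation would also need a backoff between attempts, which is left out here.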