[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-4462:
-------------------------

    Attachment: unittest_that_shows_us_retrying_sockettimeout.txt

Here is a unit test that seems to show us retrying socket timeouts.

> Properly treating SocketTimeoutException
> ----------------------------------------
>
>                 Key: HBASE-4462
>                 URL: https://issues.apache.org/jira/browse/HBASE-4462
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.90.8
>
>         Attachments: HBASE-4462_0.90.x.patch,
>                      unittest_that_shows_us_retrying_sockettimeout.txt
>
>
> SocketTimeoutException is currently treated like any IOE inside of
> HCM.getRegionServerWithRetries and I think this is a problem. This method
> should only do retries in cases where we are pretty sure the operation will
> complete, but with STE we already waited for (by default) 60 seconds and
> nothing happened.
> I found this while debugging Douglas Campbell's problem on the mailing list
> where it seemed like he was using the same scanner from multiple threads, but
> actually it was just the same client doing retries while the first run hadn't
> even finished yet (that's another problem). You could see the first scanner,
> then up to two other handlers waiting for it to finish in order to run
> (because of the synchronization on RegionScanner).
> So what should we do? We could treat STE as a DoNotRetryException and let the
> client deal with it, or we could retry only once.
> There's also the option of having a different behavior for get/put/icv/scan.
> The issue with operations that modify a cell is that you don't know if the
> operation completed or not (same as when a RS dies hard after completing,
> let's say, a Put but just before returning to the client).
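
The two options raised in the description (treat STE as do-not-retry, or retry it only once) could look roughly like the sketch below. This is not the attached patch and not the HBase client API: TimeoutAwareRetry, IoCall, callWithRetries, maxAttempts and maxTimeoutRetries are all hypothetical names made up for illustration. It only assumes that java.net.SocketTimeoutException is a subclass of IOException, so it has to be caught before the generic IOE handler.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical sketch; none of these names exist in HBase.
public class TimeoutAwareRetry {

    /** An operation that may fail with an IOException (hypothetical interface). */
    public interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs op up to maxAttempts times. Generic IOExceptions are retried until
     * the attempts run out. SocketTimeoutExceptions get their own, smaller
     * budget: maxTimeoutRetries = 0 behaves like "do not retry",
     * maxTimeoutRetries = 1 behaves like "retry only once".
     */
    public static <T> T callWithRetries(IoCall<T> op, int maxAttempts,
                                        int maxTimeoutRetries) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        int timeoutRetries = 0;
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                // We already waited a full timeout and the server may still be
                // working on the first request, so give up early.
                if (timeoutRetries >= maxTimeoutRetries) {
                    throw ste;
                }
                timeoutRetries++;
                last = ste;
            } catch (IOException ioe) {
                // Any other IOE keeps the existing behavior: try again.
                last = ioe;
            }
        }
        throw last; // all attempts used up
    }
}
```

A different budget per operation type (get/put/icv/scan) would amount to passing a different maxTimeoutRetries per call site. The sketch does nothing about the ambiguity for mutating operations, where the client cannot tell whether the first attempt was applied.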