[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6649:

Fix Version/s: (was: 0.95.0) 0.94.2

Fix up after bulk move overwrote some 0.94.2 fix versions w/ 0.95.0 (Noticed by Lars Hofhansl)

[0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

Key: HBASE-6649
URL: https://issues.apache.org/jira/browse/HBASE-6649
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Priority: Blocker
Fix For: 0.94.2
Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 6649-fix-io-exception-handling-1.patch, 6649-fix-io-exception-handling-1-trunk.patch, 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover [Jenkins].html

Have seen it twice in the recent past: http://bit.ly/MPCykB http://bit.ly/O79Dq7
Looking briefly at the logs hints at a pattern - in both the failed test instances, there was an RS crash while the test was running.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
stack updated HBASE-6649: Priority: Blocker (was: Major)
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: 6649-fix-io-exception-handling-1.patch Attaching a patch with the 'position' fix.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: 6649-fix-io-exception-handling-1-trunk.patch Same patch, for trunk.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: 6649-fix-io-exception-handling.patch This patch demonstrates what I commented on earlier. Please have a look. I could make a method that wraps the getPosition() and next() calls, but I wanted to check whether folks agree with the fix first.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: 6649-fix-io-exception-handling.patch Attaching a more complete fix (for 0.94). [~jdcryans], could you please try this patch out on your cluster? The more I think about it, the more I believe that setting the position so that it always points to a valid location is the fix here. [~lhofhansl], I have seen data loss issues (via the unit test) without this patch.
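For readers following the 'position' discussion, here is a minimal sketch of the idea described above: pair next() with getPosition() so that the saved position only ever advances to an entry boundary, and rewind to it after an IOException. This is not the attached patch. Every type and method below (EntryReader, PositionTrackingReader, FlakyReader, replay) is hypothetical and stands in for the real HBase reader classes, and the in-memory "log" assumes non-empty string entries of at least two characters.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: every type here is hypothetical, not an HBase class. */
final class PositionReplaySketch {

    /** Minimal reader contract assumed for this sketch. */
    interface EntryReader {
        long getPosition();
        void seek(long position);
        String next() throws IOException; // null means end of log
    }

    /** Pairs next() with getPosition() so the saved offset is always an entry boundary. */
    static final class PositionTrackingReader {
        private final EntryReader reader;
        private long lastValidPosition;

        PositionTrackingReader(EntryReader reader) {
            this.reader = reader;
            this.lastValidPosition = reader.getPosition();
        }

        /** Reads one entry; the saved position advances only after a successful read. */
        String readNext() throws IOException {
            String entry = reader.next(); // may throw with the stream left mid-entry
            if (entry != null) {
                lastValidPosition = reader.getPosition();
            }
            return entry;
        }

        /** After an IOException, rewind to the last offset known to be valid. */
        void recover() {
            reader.seek(lastValidPosition);
        }
    }

    /** In-memory reader that fails once, leaving its offset in the middle of an entry. */
    static final class FlakyReader implements EntryReader {
        private final List<String> entries;
        private final int failAtIndex;
        private boolean failed = false;
        private long offset = 0;

        FlakyReader(List<String> entries, int failAtIndex) {
            this.entries = entries;
            this.failAtIndex = failAtIndex;
        }

        @Override public long getPosition() { return offset; }
        @Override public void seek(long position) { offset = position; }

        @Override public String next() throws IOException {
            long start = 0;
            for (int i = 0; i < entries.size(); i++) {
                String entry = entries.get(i);
                if (start == offset) {
                    if (i == failAtIndex && !failed) {
                        failed = true;
                        offset += 1; // partial read: offset is no longer a boundary
                        throw new IOException("simulated read failure");
                    }
                    offset += entry.length();
                    return entry;
                }
                start += entry.length();
            }
            if (offset == start) {
                return null; // clean end of log
            }
            throw new IOException("offset " + offset + " is not an entry boundary");
        }
    }

    /** Reads all entries, rewinding to the last valid position whenever a read fails. */
    static List<String> replay(List<String> entries, int failAtIndex) {
        PositionTrackingReader reader =
            new PositionTrackingReader(new FlakyReader(entries, failAtIndex));
        List<String> out = new ArrayList<>();
        while (true) {
            try {
                String entry = reader.readNext();
                if (entry == null) {
                    return out;
                }
                out.add(entry);
            } catch (IOException e) {
                reader.recover();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(replay(Arrays.asList("aa", "bbb", "cc"), 1)); // prints [aa, bbb, cc]
    }
}
```

In this toy setup, resuming from the raw reader offset after the failure would land mid-entry, whereas resuming from the tracked position re-reads the failed entry, so nothing is skipped or duplicated.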
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: (was: 6649-fix-io-exception-handling.patch)
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Lars Hofhansl updated HBASE-6649: Fix Version/s: 0.94.2, 0.96.0 I'd also like this in 0.94. The 0.92 patch will probably just apply cleanly. If not, I'll make one.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Devaraj Das updated HBASE-6649: Attachment: 6649-0.92.patch, 6649-trunk.patch Don't mind adding a few comments around the exception handling.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
stack updated HBASE-6649: Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, 0.92, and 0.94. Thanks for the reviews lads and DD for the patch.
[jira] [Updated] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
stack updated HBASE-6649: Attachment: 6649.txt Here is what I applied. Includes Ted's suggested logging. I applied this same patch to 0.94 and 0.92 w/ -p1