[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4832:
--

    Fix Version/s: 0.94.0
    Release Note:   (was: This incorporates nkeywal's earlier patch to this JIRA, and allows TestRegionServerCoprocessorExceptionWithAbort() to work with it. It changes the test to use a ZooKeeper watcher in a separate thread to watch for the regionserver to abort. (This is also what is currently done with TestMasterCoprocessorWithAbort()). In my testing, repeated iterations (30+) of TestRegionServerCoprocessorExceptionWithAbort() succeed.)
    Hadoop Flags: Reviewed

This incorporates nkeywal's earlier patch to this JIRA, and allows TestRegionServerCoprocessorExceptionWithAbort() to work with it. It changes the test to use a ZooKeeper watcher in a separate thread to watch for the regionserver to abort. (This is also what is currently done with TestMasterCoprocessorWithAbort()). In Eugene's testing, repeated iterations (30+) of TestRegionServerCoprocessorExceptionWithAbort() succeed.

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    URL: https://issues.apache.org/jira/browse/HBASE-4832
    Project: HBase
    Issue Type: Bug
    Components: coprocessors, test
    Affects Versions: 0.94.0
    Reporter: nkeywal
    Assignee: Eugene Koontz
    Priority: Minor
    Fix For: 0.94.0
    Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch

The current implementation of HRegionServer#stop is
{noformat}
public void stop(final String msg) {
  this.stopped = true;
  LOG.info("STOPPED: " + msg);
  synchronized (this) {
    // Wakes run() if it is sleeping
    notifyAll(); // FindBugs NN_NAKED_NOTIFY
  }
}
{noformat}
The notification is sent on the wrong object and does nothing. As a consequence, the region server continues to sleep instead of waking up and stopping immediately.

A correct implementation is:
{noformat}
public void stop(final String msg) {
  this.stopped = true;
  LOG.info("STOPPED: " + msg);
  // Wakes run() if it is sleeping
  sleeper.skipSleepCycle();
}
{noformat}
Then the region server stops immediately. This makes the region server stop 0.5 s faster on average, which is quite useful for unit tests.

However, with this fix, TestRegionServerCoprocessorExceptionWithAbort does not work. This is likely because the code does not expect the region server to stop that fast. The exception is:
{noformat}
testExceptionFromCoprocessorDuringPut(org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort)  Time elapsed: 30.06 sec  ERROR!
java.lang.Exception: test timed out after 3 milliseconds
	at java.lang.Throwable.fillInStackTrace(Native Method)
	at java.lang.Throwable.<init>(Throwable.java:196)
	at java.lang.Exception.<init>(Exception.java:41)
	at java.lang.InterruptedException.<init>(InterruptedException.java:48)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1019)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:804)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:778)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:697)
	at org.apache.hadoop.hbase.client.ServerCallable.connect(ServerCallable.java:75)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1280)
	at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:585)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:154)
	at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
	at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:357)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:866)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:920)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:808)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1469)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1354)
	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
	at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
	at org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort.testExceptionFromCoprocessorDuringPut(TestRegionServerCoprocessorExceptionWithAbort.java:84)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at
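
The difference between the two stop() variants in the description can be reproduced outside HBase. The following is a minimal sketch using only the Java standard library, not HBase code. SimpleSleeper and StopWakeupDemo are hypothetical names, and SimpleSleeper only assumes that the real Sleeper waits on a monitor of its own rather than on the region server object, which is what the report implies. With the notification sent on the wrong monitor, the loop exits only once the current sleep period has elapsed; with skipSleepCycle() it exits almost at once.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for HBase's Sleeper; only the wake-up behaviour is modelled. */
final class SimpleSleeper {
    private final Object sleepLock = new Object();
    private final long periodMs;
    private boolean skipRequested = false;

    SimpleSleeper(long periodMs) {
        this.periodMs = periodMs;
    }

    /** Blocks for up to periodMs, returning early once skipSleepCycle() is called. */
    void sleep() throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(periodMs);
        synchronized (sleepLock) {
            while (!skipRequested) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    break;
                }
                sleepLock.wait(remainingMs);
            }
            skipRequested = false;
        }
    }

    /** Wakes the sleeping thread by notifying on the monitor it actually waits on. */
    void skipSleepCycle() {
        synchronized (sleepLock) {
            skipRequested = true;
            sleepLock.notifyAll();
        }
    }
}

/** Hypothetical miniature server loop used to compare the two stop() variants. */
final class StopWakeupDemo {
    private final SimpleSleeper sleeper = new SimpleSleeper(1000);
    private volatile boolean stopped = false;

    /** Variant from the report: notifies on `this`, which no thread waits on. */
    void stopOnWrongMonitor() {
        stopped = true;
        synchronized (this) {
            notifyAll();
        }
    }

    /** Proposed variant: wakes the sleeper directly. */
    void stopViaSleeper() {
        stopped = true;
        sleeper.skipSleepCycle();
    }

    /** Returns how many milliseconds the run loop needs to exit after stop is called. */
    static long measure(boolean useSleeper) {
        StopWakeupDemo server = new StopWakeupDemo();
        Thread runner = new Thread(() -> {
            try {
                while (!server.stopped) {
                    server.sleeper.sleep();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        runner.start();
        try {
            Thread.sleep(200); // give the runner time to enter sleep()
            long start = System.nanoTime();
            if (useSleeper) {
                server.stopViaSleeper();
            } else {
                server.stopOnWrongMonitor();
            }
            runner.join();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("wrong monitor: " + measure(false) + " ms");
        System.out.println("via sleeper:   " + measure(true) + " ms");
    }
}
```

In this sketch the stop request always arrives about 200 ms into a 1000 ms sleep, so the wrong-monitor variant waits for roughly the remaining 800 ms. If the request instead arrives at a uniformly random point in a sleep cycle of length P, the expected remaining sleep is P/2, which is the kind of average saving the description reports.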
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4832:
--

    Resolution: Fixed
    Status: Resolved  (was: Patch Available)

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Fix For: 0.94.0
    Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koontz updated HBASE-4832:
-

    Attachment: HBASE-4832.patch

-Removes (timeout=3) from @Test per nkeywal's suggestion.
-Add LOG.debug() concerning where interrupt occurs.

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koontz updated HBASE-4832:
-

    Attachment: HBASE-4832.patch

git diff --no-prefix

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch, HBASE-4832.patch
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koontz updated HBASE-4832:
-

    Release Note: This incorporates nkeywal's earlier patch to this JIRA, and allows TestRegionServerCoprocessorExceptionWithAbort() to work with it. It changes the test to use a ZooKeeper watcher in a separate thread to watch for the regionserver to abort. (This is also what is currently done with TestMasterCoprocessorWithAbort()). In my testing, repeated iterations (30+) of TestRegionServerCoprocessorExceptionWithAbort() succeed.
    Status: Patch Available  (was: Open)

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Attachments: 4832_trunk_hregionserver.patch
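
The idea in the release note, watching for the abort from a separate thread instead of relying on the timing of the client call, can be sketched as follows. This is not the patch: the patch uses a ZooKeeper watcher, whereas this sketch swaps in plain polling of an AtomicBoolean so that it runs with the standard library alone. AbortWatcherDemo, watch() and runScenario() are hypothetical names, and the AtomicBoolean merely stands in for whatever state signals that the regionserver is gone.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: a watcher thread reports an abort through a latch. */
final class AbortWatcherDemo {

    /** Starts a daemon thread that polls `aborted` and releases the latch once it is true. */
    static CountDownLatch watch(BooleanSupplier aborted, long pollMs) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread watcher = new Thread(() -> {
            try {
                while (!aborted.getAsBoolean()) {
                    Thread.sleep(pollMs);
                }
                latch.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "abort-watcher");
        watcher.setDaemon(true);
        watcher.start();
        return latch;
    }

    /**
     * Simulates a server that aborts after abortAfterMs and returns whether the
     * watcher observed the abort within timeoutMs.
     */
    static boolean runScenario(long abortAfterMs, long timeoutMs) {
        AtomicBoolean serverAborted = new AtomicBoolean(false);
        CountDownLatch abortSeen = watch(serverAborted::get, 10);

        Thread server = new Thread(() -> {
            try {
                Thread.sleep(abortAfterMs);
                serverAborted.set(true); // stands in for the regionserver aborting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "fake-regionserver");
        server.setDaemon(true);
        server.start();

        try {
            // The test thread blocks here instead of sleeping for a fixed time.
            return abortSeen.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for abort", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("immediate abort observed: " + runScenario(0, 2000));
        System.out.println("late abort observed in time: " + runScenario(5000, 200));
    }
}
```

Because the watcher checks state rather than waiting for a single event, an abort that happens before the test starts waiting is still observed, which is the situation a faster stop() makes more likely.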
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koontz updated HBASE-4832:
-

    Attachment: HBASE-4832.patch

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Attachments: 4832_trunk_hregionserver.patch, HBASE-4832.patch
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4832:
--

    Attachment: 4832-timeout.txt

Patch which stores timeout value in a static variable.

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---

    Key: HBASE-4832
    Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koontz updated HBASE-4832: - Attachment: HBASE-4832.patch New version of the patch: parameterize test timeout (thanks to Ted Yu) and use this timeout amount in Thread.sleep() near end of testExceptionFromCoprocessorDuringPut(). TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast --- Key: HBASE-4832 URL: https://issues.apache.org/jira/browse/HBASE-4832 Project: HBase Issue Type: Bug Components: coprocessors, test Affects Versions: 0.94.0 Reporter: nkeywal Assignee: Eugene Koontz Priority: Minor Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch The current implementation of HRegionServer#stop is {noformat} public void stop(final String msg) { this.stopped = true; LOG.info(STOPPED: + msg); synchronized (this) { // Wakes run() if it is sleeping notifyAll(); // FindBugs NN_NAKED_NOTIFY } } {noformat} The notification is sent on the wrong object and does nothing. As a consequence, the region server continues to sleep instead of waking up and stopping immediately. A correct implementation is: {noformat} public void stop(final String msg) { this.stopped = true; LOG.info(STOPPED: + msg); // Wakes run() if it is sleeping sleeper.skipSleepCycle(); } {noformat} Then the region server stops immediately. This makes the region server stops 0,5s faster on average, which is quite useful for unit tests. However, with this fix, TestRegionServerCoprocessorExceptionWithAbort does not work. It likely because the code does no expect the region server to stop that fast. The exception is: {noformat} testExceptionFromCoprocessorDuringPut(org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort) Time elapsed: 30.06 sec ERROR! 
java.lang.Exception: test timed out after 3 milliseconds
    at java.lang.Throwable.fillInStackTrace(Native Method)
    at java.lang.Throwable.<init>(Throwable.java:196)
    at java.lang.Exception.<init>(Exception.java:41)
    at java.lang.InterruptedException.<init>(InterruptedException.java:48)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1019)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:804)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:778)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:697)
    at org.apache.hadoop.hbase.client.ServerCallable.connect(ServerCallable.java:75)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1280)
    at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:585)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:154)
    at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
    at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:357)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:866)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:920)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:808)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1469)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1354)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
    at
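
To illustrate why the notifyAll() in the original stop() has no effect, here is a minimal, self-contained sketch. It is not the HBase code: MiniSleeper, Server, stopBroken(), stopFixed() and measure() are hypothetical names made up for this example, and the only assumption carried over from the issue description is that the sleeping thread waits on a lock owned by the sleeper rather than on the server object. A notifyAll() on the server's own monitor then reaches no waiting thread, while a call that notifies the sleeper's lock ends the sleep early.

```java
public class StopWakeupSketch {

    /** Hypothetical stand-in for a sleeper that waits on its own private lock. */
    static final class MiniSleeper {
        private final Object sleepLock = new Object();
        private boolean triggerWake = false;

        /** Sleeps for up to periodMs and returns the time actually slept, in ms. */
        long sleep(long periodMs) throws InterruptedException {
            long start = System.nanoTime();
            long deadline = start + periodMs * 1_000_000L;
            synchronized (sleepLock) {
                while (!triggerWake) {
                    long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                    if (remainingMs <= 0) {
                        break; // full period elapsed without a wake-up request
                    }
                    sleepLock.wait(remainingMs);
                }
                triggerWake = false;
            }
            return (System.nanoTime() - start) / 1_000_000L;
        }

        /** Ends the current (or next) sleep early by notifying the lock the sleeper waits on. */
        void skipSleepCycle() {
            synchronized (sleepLock) {
                triggerWake = true;
                sleepLock.notifyAll();
            }
        }
    }

    /** Hypothetical server with the two stop() variants discussed in the issue. */
    static final class Server {
        final MiniSleeper sleeper = new MiniSleeper();
        volatile boolean stopped = false;

        void stopBroken() {
            stopped = true;
            synchronized (this) {
                notifyAll(); // no thread waits on this monitor, so nothing wakes up
            }
        }

        void stopFixed() {
            stopped = true;
            sleeper.skipSleepCycle(); // notifies the monitor the sleeping thread waits on
        }
    }

    /** Starts a 1000 ms sleep, calls one stop variant after 100 ms, returns ms slept. */
    static long measure(boolean fixed) throws InterruptedException {
        Server server = new Server();
        long[] slept = new long[1];
        Thread runner = new Thread(() -> {
            try {
                slept[0] = server.sleeper.sleep(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        runner.start();
        Thread.sleep(100); // give the runner time to start sleeping
        if (fixed) {
            server.stopFixed();
        } else {
            server.stopBroken();
        }
        runner.join(); // join() makes the runner's write to slept[0] visible here
        return slept[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("broken stop: slept about " + measure(false) + " ms");
        System.out.println("fixed stop:  slept about " + measure(true) + " ms");
    }
}
```

In this sketch the broken variant should sleep for roughly the whole period and the fixed variant for roughly the 100 ms delay before stop is called; exact figures depend on thread scheduling.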
[jira] [Updated] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
[ https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal updated HBASE-4832:
---
Attachment: 4832_trunk_hregionserver.patch

4832_trunk_hregionserver.patch contains the fix to HRegionServer that makes the coprocessor test fail.

TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---
Key: HBASE-4832
URL: https://issues.apache.org/jira/browse/HBASE-4832
Project: HBase
Issue Type: Bug
Components: coprocessors, test
Affects Versions: 0.94.0
Reporter: nkeywal
Priority: Minor
Attachments: 4832_trunk_hregionserver.patch

The current implementation of HRegionServer#stop is
{noformat}
public void stop(final String msg) {
  this.stopped = true;
  LOG.info("STOPPED: " + msg);
  synchronized (this) {
    // Wakes run() if it is sleeping
    notifyAll(); // FindBugs NN_NAKED_NOTIFY
  }
}
{noformat}
The notification is sent on the wrong object and does nothing. As a consequence, the region server continues to sleep instead of waking up and stopping immediately.

A correct implementation is:
{noformat}
public void stop(final String msg) {
  this.stopped = true;
  LOG.info("STOPPED: " + msg);
  // Wakes run() if it is sleeping
  sleeper.skipSleepCycle();
}
{noformat}
Then the region server stops immediately. This makes the region server stop 0.5 s faster on average, which is quite useful for unit tests.

However, with this fix, TestRegionServerCoprocessorExceptionWithAbort does not work. This is likely because the code does not expect the region server to stop that fast. The exception is:
{noformat}
testExceptionFromCoprocessorDuringPut(org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort) Time elapsed: 30.06 sec ERROR!
java.lang.Exception: test timed out after 3 milliseconds
    at java.lang.Throwable.fillInStackTrace(Native Method)
    at java.lang.Throwable.<init>(Throwable.java:196)
    at java.lang.Exception.<init>(Exception.java:41)
    at java.lang.InterruptedException.<init>(InterruptedException.java:48)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1019)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:804)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:778)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:697)
    at org.apache.hadoop.hbase.client.ServerCallable.connect(ServerCallable.java:75)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1280)
    at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:585)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:154)
    at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
    at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
    at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:357)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
    at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:866)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:920)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:808)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1469)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1354)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
    at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
    at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
    at org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort.testExceptionFromCoprocessorDuringPut(TestRegionServerCoprocessorExceptionWithAbort.java:84)
    at