[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892681#comment-15892681 ]

Samarth Jain commented on HBASE-17714:
--------------------------------------

This eventually turned out to be an issue in the test. With HBase 1.1.4 and earlier, the test passed because the RPC timeout wasn't honored; that was fixed in 1.1.5. With 1.1.5 and later, the test started failing because the timeout it should have been overriding was the server-side setting of hbase.client.scanner.timeout.period.

> Client heartbeats seems to be broken
> ------------------------------------
>
>                 Key: HBASE-17714
>                 URL: https://issues.apache.org/jira/browse/HBASE-17714
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Samarth Jain
>
> We have a test in Phoenix where we introduce an artificial sleep of 2 times
> the RPC timeout in the preScannerNext() hook of a coprocessor.
> {code}
> public static class SleepingRegionObserver extends SimpleRegionObserver {
>     public SleepingRegionObserver() {}
>
>     @Override
>     public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
>             final InternalScanner s, final List<Result> results,
>             final int limit, final boolean hasMore) throws IOException {
>         try {
>             if (SLEEP_NOW
>                     && c.getEnvironment().getRegion().getRegionInfo().getTable().getNameAsString().equals(TABLE_NAME)) {
>                 // Stall the scanner call for twice the RPC timeout.
>                 Thread.sleep(RPC_TIMEOUT * 2);
>             }
>         } catch (InterruptedException e) {
>             throw new IOException(e);
>         }
>         return super.preScannerNext(c, s, results, limit, hasMore);
>     }
> }
> {code}
> This test was passing fine till 1.1.3 but started failing sometime before
> 1.1.9 with an OutOfOrderScannerException. See PHOENIX-3702. [~lhofhansl]
> mentioned that we have client heartbeats enabled and that should prevent us
> from running into issues like this. FYI, this test fails with HBase 1.2.3
> too.
> CC [~apurtell], [~jamestaylor]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891301#comment-15891301 ]

Samarth Jain commented on HBASE-17714:
--------------------------------------

Thanks for the investigation, [~apurtell]. I will make config changes in the test to increase the frequency of heartbeat checks and to see if enabling renewing leases would help. For the latter, my guess is that it wouldn't help because the call to renew lease is synchronized from the client side and would be blocked till scanner.next() returns.
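The guess in the comment above rests on ordinary Java monitor behavior: if next() and the lease renewal are synchronized on the same object, a renewal cannot start while next() is still waiting on the server. The following is a minimal sketch of that behavior only. It assumes both methods share one monitor, as the comment guesses; the class SynchronizedScannerSketch and its methods are hypothetical stand-ins, not the HBase client scanner.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Illustrative only: a stand-in for a scanner whose next() and renewLease()
// share one monitor. This is not the HBase client's scanner implementation.
class SynchronizedScannerSketch {
    private final List<String> events = new CopyOnWriteArrayList<>();
    private final CountDownLatch nextStarted = new CountDownLatch(1);

    synchronized void next(long serverDelayMillis) throws InterruptedException {
        events.add("next:start");
        nextStarted.countDown();
        Thread.sleep(serverDelayMillis); // simulates a slow server-side hook
        events.add("next:end");
    }

    synchronized void renewLease() {
        events.add("renewLease");
    }

    /** Starts next() on one thread, then tries to renew the lease from another. */
    static List<String> run(long serverDelayMillis) {
        SynchronizedScannerSketch scanner = new SynchronizedScannerSketch();
        Thread scanThread = new Thread(() -> {
            try {
                scanner.next(serverDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        scanThread.start();
        try {
            scanner.nextStarted.await(); // next() has entered and taken the monitor
            scanner.renewLease();        // cannot run until next() has returned
            scanThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return scanner.events;
    }

    public static void main(String[] args) {
        System.out.println(run(200)); // prints [next:start, next:end, renewLease]
    }
}
```

In this sketch the renewal is always recorded after next:end, however long the simulated server delay is, which is why a renewal issued this way could not rescue a call that is already stalled.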
[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891272#comment-15891272 ]

Andrew Purtell commented on HBASE-17714:
----------------------------------------

The release notes on HBASE-13090 say
{quote}
To ensure that timeout checks do not occur too often (which would hurt the performance of scans), the configuration "hbase.cells.scanned.per.heartbeat.check" has been introduced. This configuration controls how often System.currentTimeMillis() is called to update the progress towards the time limit. Currently, the default value of this configuration value is 10000.
{quote}
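The release note quoted above describes a cadence: the clock is consulted only once every N scanned cells rather than on every cell. A minimal sketch of that idea follows. It is not HBase's scanner code; the class HeartbeatCheckSketch and the method clockReads are hypothetical, and the interval values in main are purely illustrative.

```java
import java.util.function.LongSupplier;

// Illustrative only: not HBase's scanner. All names here are hypothetical.
class HeartbeatCheckSketch {

    /**
     * Walks over cellCount cells and consults the clock once every
     * cellsPerCheck cells. Returns the number of clock reads performed.
     */
    static long clockReads(long cellCount, long cellsPerCheck, LongSupplier clock) {
        long reads = 0;
        for (long scanned = 1; scanned <= cellCount; scanned++) {
            if (scanned % cellsPerCheck == 0) {
                clock.getAsLong(); // where a time-limit check would take place
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        LongSupplier clock = System::currentTimeMillis;
        System.out.println(clockReads(100_000, 10_000, clock)); // prints 10
        System.out.println(clockReads(100_000, 1, clock));      // prints 100000
    }
}
```

A smaller interval means more frequent checks and more clock reads per scan. Note that in this model a check can only happen after at least one cell has been counted, so a stall before the first cell is never observed by the check.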
[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891268#comment-15891268 ]

Andrew Purtell commented on HBASE-17714:
----------------------------------------

e5c1a80 (HBASE-15645 hbase.rpc.timeout is not used in operations of HTable) is the commit that causes RenewLeaseIT to start failing. You have to apply HBASE-16420 (Fix source incompatibility of Table interface) after checking out e5c1a80 to get something that 4.x-HBase-1.1 will compile against.

HBASE-15645 was shipped in 1.1.5, 1.2.2, and 1.3.0. The change fixes the client, which was not actually honoring RPC timeouts before it.

[~samarthjain] are you sure RenewLeaseIT actually renews the lease before the RPC times out? The test sets a very short RPC timeout (2000ms).
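The effect described above, a test that passes while the timeout is ignored and fails once it is enforced, can be modeled with plain java.util.concurrent. This is a sketch under stated assumptions: RpcTimeoutSketch and call are hypothetical names, the sleeping task only stands in for the sleeping coprocessor hook, and the 2000 ms timeout from the test is scaled down to 200 ms. It does not use or reproduce the HBase RPC client.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: models a caller that does or does not enforce a timeout
// on a slow call. All names are hypothetical.
class RpcTimeoutSketch {

    /** Returns "ok" if the call finished, "timeout" if the caller gave up. */
    static String call(long serverSleepMillis, long rpcTimeoutMillis, boolean honorTimeout) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> reply = pool.submit(() -> {
                Thread.sleep(serverSleepMillis); // stands in for the sleeping hook
                return "ok";
            });
            return honorTimeout
                    ? reply.get(rpcTimeoutMillis, TimeUnit.MILLISECONDS)
                    : reply.get();
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // abandon the slow call, as a timed-out caller would
        }
    }

    public static void main(String[] args) {
        long rpcTimeout = 200; // scaled down from the test's 2000 ms
        System.out.println(call(rpcTimeout * 2, rpcTimeout, false)); // prints ok
        System.out.println(call(rpcTimeout * 2, rpcTimeout, true));  // prints timeout
    }
}
```

With a sleep of twice the timeout, the caller that ignores the timeout simply waits and gets its answer, while the caller that enforces it gives up halfway through.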
[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890951#comment-15890951 ]

Andrew Purtell commented on HBASE-17714:
----------------------------------------

Never mind, I found the link to the Phoenix JIRA. It is RenewLeaseIT
[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken
[ https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890947#comment-15890947 ]

Andrew Purtell commented on HBASE-17714:
----------------------------------------

[~samarthjain] Which test? Given that range (thanks) I can bisect to find the commit that broke this.