[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-02 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892681#comment-15892681
 ] 

Samarth Jain commented on HBASE-17714:
--

This eventually turned out to be an issue in the test. With HBase 1.1.4 and 
earlier, the test passed only because the RPC timeout wasn't being honored; 
that was fixed in 1.1.5. From 1.1.5 onward the test started failing because 
the timeout it should have been overriding was the server-side setting of 
hbase.client.scanner.timeout.period.

> Client heartbeats seems to be broken
> ------------------------------------
>
>         Key: HBASE-17714
>         URL: https://issues.apache.org/jira/browse/HBASE-17714
>     Project: HBase
>  Issue Type: Bug
>    Reporter: Samarth Jain
>
> We have a test in Phoenix where we introduce an artificial sleep of twice 
> the RPC timeout in the preScannerNext() hook of a coprocessor. 
> {code}
> public static class SleepingRegionObserver extends SimpleRegionObserver {
>     public SleepingRegionObserver() {}
>
>     @Override
>     public boolean preScannerNext(
>             final ObserverContext<RegionCoprocessorEnvironment> c,
>             final InternalScanner s, final List<Result> results,
>             final int limit, final boolean hasMore) throws IOException {
>         try {
>             // Stall scans of the test table for twice the RPC timeout.
>             if (SLEEP_NOW && c.getEnvironment().getRegion().getRegionInfo()
>                     .getTable().getNameAsString().equals(TABLE_NAME)) {
>                 Thread.sleep(RPC_TIMEOUT * 2);
>             }
>         } catch (InterruptedException e) {
>             throw new IOException(e);
>         }
>         return super.preScannerNext(c, s, results, limit, hasMore);
>     }
> }
> {code}
> This test passed up to 1.1.3 but started failing sometime before 1.1.9 
> with an OutOfOrderScannerNextException. See PHOENIX-3702. [~lhofhansl] 
> mentioned that we have client heartbeats enabled and that this should 
> prevent us from running into issues like this. FYI, this test fails with 
> HBase 1.2.3 too.
> CC [~apurtell], [~jamestaylor]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Samarth Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891301#comment-15891301
 ] 

Samarth Jain commented on HBASE-17714:
--

Thanks for the investigation, [~apurtell]. I will change the test's 
configuration to increase the frequency of heartbeat checks, and I will also 
see whether enabling lease renewal helps. My guess is that lease renewal won't 
help, because the renew-lease call is synchronized on the client side and 
would block until scanner.next() returns. 
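
The blocking concern above can be illustrated with a small, self-contained sketch. It does not use HBase classes: FakeScanner, its methods, and the event strings are hypothetical stand-ins. The assumption that next() and renewLease() share one monitor is taken from the guess in the comment, not verified against the HBase client source.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch: a synchronized renewLease() cannot run while next() is in progress. */
final class RenewLeaseBlockingSketch {

    /** Stand-in scanner; both methods are guarded by the same monitor (an assumption). */
    static final class FakeScanner {
        private final List<String> events;
        private final long nextDurationMillis;
        private final CountDownLatch nextEntered = new CountDownLatch(1);

        FakeScanner(List<String> events, long nextDurationMillis) {
            this.events = events;
            this.nextDurationMillis = nextDurationMillis;
        }

        /** Simulates a slow scan RPC; the monitor is held for the whole duration. */
        synchronized void next() {
            nextEntered.countDown();
            try {
                Thread.sleep(nextDurationMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            events.add("next returned");
        }

        /** Needs the same monitor, so it waits until next() has returned. */
        synchronized void renewLease() {
            events.add("lease renewed");
        }
    }

    /** Starts a slow next() on one thread, then tries to renew the lease from the caller. */
    static List<String> run(long nextDurationMillis) {
        List<String> events = new CopyOnWriteArrayList<>();
        FakeScanner scanner = new FakeScanner(events, nextDurationMillis);
        Thread scanThread = new Thread(scanner::next);
        scanThread.start();
        try {
            scanner.nextEntered.await(); // next() now holds the monitor
            scanner.renewLease();        // blocks until next() releases it
            scanThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(200)); // prints [next returned, lease renewed]
    }
}
```

If the two methods really do share a lock, a lease renewal issued during a stalled next() only takes effect after the stall is over, which is too late to help.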



[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891272#comment-15891272
 ] 

Andrew Purtell commented on HBASE-17714:


The release notes on HBASE-13090 say:

{quote}
To ensure that timeout checks do not occur too often (which would hurt the 
performance of scans), the configuration 
"hbase.cells.scanned.per.heartbeat.check" has been introduced. This 
configuration controls how often System.currentTimeMillis() is called to update 
the progress towards the time limit. Currently, the default value of this 
configuration value is 10000.
{quote}
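
A minimal sketch of the idea in the quoted release note: read the clock only once every N scanned cells instead of on every cell. This is not HBase's scanner code. The class name, method names, and the injected clock are hypothetical, and the intervals used in main() are example values only.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of a time-limit check that runs once every N scanned cells. */
final class HeartbeatCheckSketch {
    private final long cellsPerCheck;   // plays the role of hbase.cells.scanned.per.heartbeat.check
    private final long deadlineMillis;  // absolute time at which the scan should yield
    private final LongSupplier clock;   // injected so the sketch can be tested without waiting
    private long cellsSinceCheck = 0;
    private long clockReads = 0;

    HeartbeatCheckSketch(long cellsPerCheck, long deadlineMillis, LongSupplier clock) {
        if (cellsPerCheck <= 0) {
            throw new IllegalArgumentException("cellsPerCheck must be positive");
        }
        this.cellsPerCheck = cellsPerCheck;
        this.deadlineMillis = deadlineMillis;
        this.clock = clock;
    }

    /** Call once per scanned cell; returns true when the time limit has been reached. */
    boolean onCellScanned() {
        cellsSinceCheck++;
        if (cellsSinceCheck < cellsPerCheck) {
            return false; // skip the clock read entirely
        }
        cellsSinceCheck = 0;
        clockReads++;
        return clock.getAsLong() >= deadlineMillis;
    }

    long clockReads() {
        return clockReads;
    }

    /** Counts how many clock reads a scan of the given size performs. */
    static long clockReadsFor(long cells, long cellsPerCheck) {
        HeartbeatCheckSketch check =
                new HeartbeatCheckSketch(cellsPerCheck, Long.MAX_VALUE, System::currentTimeMillis);
        for (long i = 0; i < cells; i++) {
            check.onCellScanned();
        }
        return check.clockReads();
    }

    public static void main(String[] args) {
        System.out.println(clockReadsFor(100_000, 1));      // prints 100000
        System.out.println(clockReadsFor(100_000, 10_000)); // prints 10
    }
}
```

A smaller interval detects an expired time limit sooner, at the cost of more clock reads per scan.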



[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891268#comment-15891268
 ] 

Andrew Purtell commented on HBASE-17714:


e5c1a80 (HBASE-15645 hbase.rpc.timeout is not used in operations of HTable) is 
the commit that causes RenewLeaseIT to start failing. You have to apply 
HBASE-16420 (Fix source incompatibility of Table interface) after checking out 
e5c1a80 to get something that 4.x-HBase-1.1 will compile against.  

HBASE-15645 was shipped in 1.1.5, 1.2.2, and 1.3.0.  

This change fixes the client, which was not actually honoring RPC timeouts 
before it. [~samarthjain], are you sure RenewLeaseIT actually renews the lease 
before the RPC times out? The test sets a very short RPC timeout (2000ms).
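
To make the before-and-after behavior concrete, here is a simplified simulation, not the HBase client: a "server" call that takes twice the RPC timeout, awaited either with or without a deadline. The class and method names are hypothetical, and the honorTimeout flag is only a stand-in for the behavior before and after HBASE-15645.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical simulation of a call that outlasts the client's RPC timeout. */
final class RpcTimeoutSketch {

    /** Returns true if the simulated call completes from the client's point of view. */
    static boolean callCompletes(long rpcTimeoutMillis, boolean honorTimeout) {
        ExecutorService server = Executors.newSingleThreadExecutor();
        try {
            // The "server" stalls for twice the RPC timeout, like the sleeping coprocessor.
            Future<String> response = server.submit(() -> {
                Thread.sleep(rpcTimeoutMillis * 2);
                return "rows";
            });
            if (honorTimeout) {
                response.get(rpcTimeoutMillis, TimeUnit.MILLISECONDS); // give up at the deadline
            } else {
                response.get(); // wait as long as it takes
            }
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            server.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(callCompletes(50, false)); // prints true
        System.out.println(callCompletes(50, true));  // prints false
    }
}
```

Under these assumptions, a test that sleeps for twice the timeout can only pass while the timeout is being ignored.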




[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890951#comment-15890951
 ] 

Andrew Purtell commented on HBASE-17714:


Never mind, I found the link to the Phoenix JIRA. It is RenewLeaseIT.



[jira] [Commented] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890947#comment-15890947
 ] 

Andrew Purtell commented on HBASE-17714:


[~samarthjain] Which test? Given that range (thanks) I can bisect to find the 
commit that broke this. 
