[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900725#comment-14900725
 ] 

Yu Li commented on HBASE-14431:
---

From the HadoopQA report of HBASE-14448, I found that TestFastFail failed with 
the log below:
{noformat}
2015-09-21 11:42:58,768 WARN  [AsyncRpcChannel-pool2-t17] logging.Slf4JLogger(151): An exception was thrown by org.apache.hadoop.hbase.ipc.AsyncRpcChannel$2.operationComplete()
java.lang.NullPointerException
	at org.apache.hadoop.hbase.ipc.AsyncRpcClient.removeConnection(AsyncRpcClient.java:406)
	at org.apache.hadoop.hbase.ipc.AsyncRpcChannel.close(AsyncRpcChannel.java:537)
	at org.apache.hadoop.hbase.ipc.AsyncRpcChannel.retryOrClose(AsyncRpcChannel.java:300)
	at org.apache.hadoop.hbase.ipc.AsyncRpcChannel.access$200(AsyncRpcChannel.java:82)
{noformat}

Line 406 of AsyncRpcClient.java falls within the following change made in this 
JIRA:
{noformat}
-int connectionHashCode = connection.getConnectionHashCode();
+int connectionHashCode = connection.hashCode();
 synchronized (connections) {
   // we use address as cache key, so we should check here to prevent removing the
   // wrong connection
   AsyncRpcChannel connectionInPool = this.connections.get(connectionHashCode);
-  if (connectionInPool == connection) {
+  if (connectionInPool.equals(connection)) {
{noformat}
Line 406 is:
{code}
if (connectionInPool.equals(connection)) {
{code}

I think we are missing a null check on connectionInPool here; a straightforward addendum is attached.
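For illustration, the null check described above could look roughly like the sketch below. This is not the attached addendum itself: the Channel class, the RemoveConnectionSketch wrapper, and the boolean return value are hypothetical stand-ins so that the example compiles and runs on its own. Only the names connections, connectionInPool, and removeConnection come from the snippet quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch under stated assumptions; not the actual HBase code.
class RemoveConnectionSketch {

    // Hypothetical stand-in for AsyncRpcChannel: equality is based on the address.
    static final class Channel {
        final String address;

        Channel(String address) {
            this.address = address;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Channel && ((Channel) o).address.equals(address);
        }

        @Override
        public int hashCode() {
            return address.hashCode();
        }
    }

    // Pool keyed by the connection's hash code, as in the quoted diff.
    private final Map<Integer, Channel> connections = new HashMap<>();

    void addConnection(Channel connection) {
        synchronized (connections) {
            connections.put(connection.hashCode(), connection);
        }
    }

    // Returns true if the connection was found and removed (return value is an
    // assumption added for testability).
    boolean removeConnection(Channel connection) {
        int connectionHashCode = connection.hashCode();
        synchronized (connections) {
            Channel connectionInPool = connections.get(connectionHashCode);
            // The pool may hold no entry for this key, so guard against null
            // before calling equals() on the looked-up value.
            if (connectionInPool != null && connectionInPool.equals(connection)) {
                connections.remove(connectionHashCode);
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        RemoveConnectionSketch pool = new RemoveConnectionSketch();
        Channel channel = new Channel("rs1.example.org:16020");
        // Removing from an empty pool no longer throws; it just reports false.
        System.out.println(pool.removeConnection(channel));
        pool.addConnection(channel);
        System.out.println(pool.removeConnection(channel));
    }
}
```

With the guard in place, a close() that races with an earlier removal becomes a no-op instead of throwing from the completion listener.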

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was playing with the master branch in distributed mode (3 rs + master + 
> backup_master) and noticed strange behavior while testing this sequence 
> of events on a single rs: /kill/start/run_balancer, while a client was writing 
> data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: BindException: 1 time, 
>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the 
> following condition in {code}AsyncRpcClient.removeConnection(AsyncRpcChannel connection){code} 
> is never true:
> {code}
> if (connectionInPool == connection) {
> {code} 
> so the {code}AsyncRpcChannel{code} connection is never removed from the 
> {code}connections{code} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the 
> connections pool.
> I will attach a patch after running some more tests.
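The difference between the two conditions quoted above can be sketched as follows. The report does not say why the pooled object and the closing connection differ, so this example simply assumes two distinct objects created for the same server address. The Channel class and the sameInstance/sameAddress helpers are hypothetical and are not part of HBase.

```java
import java.net.InetSocketAddress;

// Illustrative sketch: reference comparison versus address comparison.
class IdentityVsEqualsSketch {

    // Hypothetical stand-in for AsyncRpcChannel; only the address field matters here.
    static final class Channel {
        final InetSocketAddress address;

        Channel(String host, int port) {
            // createUnresolved avoids a DNS lookup, so the example runs offline.
            this.address = InetSocketAddress.createUnresolved(host, port);
        }
    }

    // Mirrors "connectionInPool == connection": true only for the very same object.
    static boolean sameInstance(Channel a, Channel b) {
        return a == b;
    }

    // Mirrors "connectionInPool.address.equals(connection.address)".
    static boolean sameAddress(Channel a, Channel b) {
        return a.address.equals(b.address);
    }

    public static void main(String[] args) {
        Channel pooled = new Channel("rs1.example.org", 16020);
        Channel closing = new Channel("rs1.example.org", 16020);
        System.out.println("same instance: " + sameInstance(pooled, closing));
        System.out.println("same address:  " + sameAddress(pooled, closing));
    }
}
```

Under that assumption the identity check fails even though both objects refer to the same server, which would match the behavior described in the report.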



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900932#comment-14900932
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.1 #669 (See 
[https://builds.apache.org/job/HBase-1.1/669/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
9ae6cead335c5afc298bd192820ecb7af928ab2c)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900972#comment-14900972
 ] 

Hudson commented on HBASE-14431:


SUCCESS: Integrated in HBase-1.2-IT #158 (See 
[https://builds.apache.org/job/HBase-1.2-IT/158/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
78c8c772db84d88dcecaafd6ba9c7f7e611cc091)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900995#comment-14900995
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.2 #187 (See 
[https://builds.apache.org/job/HBase-1.2/187/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
78c8c772db84d88dcecaafd6ba9c7f7e611cc091)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900977#comment-14900977
 ] 

Hudson commented on HBASE-14431:


SUCCESS: Integrated in HBase-1.3-IT #169 (See 
[https://builds.apache.org/job/HBase-1.3-IT/169/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
ca6c7f0a6857a5ac16be6a13c461e2aae0b51821)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14900742#comment-14900742
 ] 

Ted Yu commented on HBASE-14431:


Integrated addendum to related branches.

Thanks, Yu.

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901342#comment-14901342
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-TRUNK #6824 (See 
[https://builds.apache.org/job/HBase-TRUNK/6824/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
86cf14889462b6947f921c41401a8f925fe2b3b6)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901221#comment-14901221
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.3 #188 (See 
[https://builds.apache.org/job/HBase-1.3/188/])
HBASE-14431 Addendum checks for null connectionInPool (Yu Li) (tedyu: rev 
ca6c7f0a6857a5ac16be6a13c461e2aae0b51821)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-addendum.patch, HBASE-14431-v2.patch, 
> HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877053#comment-14877053
 ] 

Ted Yu commented on HBASE-14431:


+1

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877141#comment-14877141
 ] 

Hadoop QA commented on HBASE-14431:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12761281/HBASE-14431-v2.patch
  against master branch at commit b0f52332651ecbb8af11557df5af3189c7283212.
  ATTACHMENT ID: 12761281

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop versions{color}.  The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

{color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
  org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15641//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15641//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15641//artifact/patchprocess/checkstyle-aggregate.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15641//console

This message is automatically generated.

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch


[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877218#comment-14877218
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.1 #667 (See 
[https://builds.apache.org/job/HBase-1.1/667/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
911c4342ae66447d51ec05e25eeb3b6c4d348a22)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java


> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.0.2, 1.1.2
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>
> Attachments: HBASE-14431-v2.patch, HBASE-14431.patch
>
>
> I was testing the master branch in distributed mode (3 rs + master + 
> backup_master) and noticed strange behavior while running this sequence 
> of events on a single rs: /kill/start/run_balancer, while a client was 
> writing data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the 
> following condition in {{AsyncRpcClient.removeConnection(AsyncRpcChannel connection)}} 
> is never true:
> {code}
> if (connectionInPool == connection) {
> {code}
> so the {{AsyncRpcChannel}} connection is never removed from the 
> {{connections}} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the 
> connections pool.
> I will attach a patch after running some more tests.
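
The difference between the two conditions in the description can be illustrated with a small standalone sketch. It is not HBase code: {{PoolRemovalSketch}}, {{Channel}}, the host name and the port are hypothetical, and the pool is assumed to be a map keyed by a hash of the remote address. The thread does not establish why the identity check never matched in HBase; the sketch only shows one way it can happen, namely when the caller holds a different instance than the pooled one.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

final class PoolRemovalSketch {

  /** Hypothetical stand-in for AsyncRpcChannel; only the remote address matters here. */
  static final class Channel {
    final InetSocketAddress address;

    Channel(InetSocketAddress address) {
      this.address = address;
    }
  }

  // Assumption: the pool is keyed by a hash derived from the remote address.
  private final Map<Integer, Channel> connections = new HashMap<>();

  void add(Channel channel) {
    synchronized (connections) {
      connections.put(channel.address.hashCode(), channel);
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }

  /** Identity check: matches only if the caller holds the pooled instance itself. */
  boolean removeByIdentity(Channel channel) {
    int key = channel.address.hashCode();
    synchronized (connections) {
      Channel inPool = connections.get(key);
      if (inPool == channel) {
        connections.remove(key);
        return true;
      }
      return false;
    }
  }

  /** Address check, with a null guard for keys that are no longer in the pool. */
  boolean removeByAddress(Channel channel) {
    int key = channel.address.hashCode();
    synchronized (connections) {
      Channel inPool = connections.get(key);
      if (inPool != null && inPool.address.equals(channel.address)) {
        connections.remove(key);
        return true;
      }
      return false;
    }
  }

  public static void main(String[] args) {
    PoolRemovalSketch pool = new PoolRemovalSketch();
    // createUnresolved avoids a DNS lookup; equality is by host name and port.
    pool.add(new Channel(InetSocketAddress.createUnresolved("rs1.example.com", 16020)));

    // A second channel object for the same server, e.g. created after a reconnect.
    Channel failed = new Channel(InetSocketAddress.createUnresolved("rs1.example.com", 16020));

    System.out.println(pool.removeByIdentity(failed)); // false: different instance
    System.out.println(pool.removeByAddress(failed));  // true: same address
  }
}
```

The null guard in {{removeByAddress}} matters once the check dereferences the pooled object: a lookup for a key that was already removed returns null.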





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877222#comment-14877222
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.3 #185 (See 
[https://builds.apache.org/job/HBase-1.3/185/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
88adccd553e4f70a0e5362d5ab5158f45d57d201)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877211#comment-14877211
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-TRUNK #6821 (See 
[https://builds.apache.org/job/HBase-TRUNK/6821/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
1545e1ed8d68b780dca49084cf5d8173481f72c0)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877206#comment-14877206
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.2-IT #156 (See 
[https://builds.apache.org/job/HBase-1.2-IT/156/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
388e948dfedab59cfe8fe8cf42001fec0eb32cd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877204#comment-14877204
 ] 

Hudson commented on HBASE-14431:


SUCCESS: Integrated in HBase-1.3-IT #167 (See 
[https://builds.apache.org/job/HBase-1.3-IT/167/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
88adccd553e4f70a0e5362d5ab5158f45d57d201)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14877214#comment-14877214
 ] 

Hudson commented on HBASE-14431:


FAILURE: Integrated in HBase-1.2 #185 (See 
[https://builds.apache.org/job/HBase-1.2/185/])
HBASE-14431 AsyncRpcClient#removeConnection() never removes connection from 
connections pool if server fails (Samir Ahmic) (tedyu: rev 
388e948dfedab59cfe8fe8cf42001fec0eb32cd3)
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcClient.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AsyncRpcChannel.java




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790514#comment-14790514
 ] 

Ted Yu commented on HBASE-14431:


lgtm

nit: connection.hashCode() is computed twice. You can save the return value in 
a local variable.
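
A minimal sketch of the nit, under stated assumptions: the patch under review is not shown in this message, so {{HashOnceSketch}} and {{CountingChannel}} are hypothetical, and the two computations are assumed to be the pool lookup and the pool removal. The call counter exists only to make the saving observable.

```java
import java.util.HashMap;
import java.util.Map;

final class HashOnceSketch {

  /** Hypothetical channel whose hashCode() counts its own invocations. */
  static final class CountingChannel {
    private final String address;
    int hashCalls = 0;

    CountingChannel(String address) {
      this.address = address;
    }

    @Override
    public int hashCode() {
      hashCalls++;
      return address.hashCode();
    }

    @Override
    public boolean equals(Object other) {
      return other instanceof CountingChannel
          && address.equals(((CountingChannel) other).address);
    }
  }

  private final Map<Integer, CountingChannel> connections = new HashMap<>();

  void add(CountingChannel channel) {
    synchronized (connections) {
      connections.put(channel.hashCode(), channel);
    }
  }

  boolean removeConnection(CountingChannel connection) {
    // Computed once and reused for both the lookup and the removal.
    int connectionHashCode = connection.hashCode();
    synchronized (connections) {
      CountingChannel inPool = connections.get(connectionHashCode);
      if (inPool != null && inPool.equals(connection)) {
        connections.remove(connectionHashCode);
        return true;
      }
      return false;
    }
  }

  public static void main(String[] args) {
    HashOnceSketch pool = new HashOnceSketch();
    pool.add(new CountingChannel("rs1.example.com:16020"));

    CountingChannel failed = new CountingChannel("rs1.example.com:16020");
    pool.removeConnection(failed);
    System.out.println("hashCode() calls: " + failed.hashCalls); // hashCode() calls: 1
  }
}
```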



[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790813#comment-14790813
 ] 

Hadoop QA commented on HBASE-14431:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12756275/HBASE-14431.patch
  against master branch at commit d2e338181800ae3cef55ddca491901b65259dc7f.
  ATTACHMENT ID: 12756275

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0 2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFastFail

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15624//testReport/
Release Findbugs (version 2.0.3) warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15624//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15624//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15624//console

This message is automatically generated.



[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-16 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791053#comment-14791053
 ] 

Samir Ahmic commented on HBASE-14431:
-

This is interesting. I have run TestFastFail several times on two different 
machines and the test never failed. I was using Java 1.7.0_80 and 1.7.0_71.







[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-15 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745146#comment-14745146
 ] 

Heng Chen commented on HBASE-14431:
---

Would it be better to override the {{hashCode}} method in {{AsyncRpcChannel}}?




[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-15 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14746000#comment-14746000
 ] 

Samir Ahmic commented on HBASE-14431:
-

That looks like a good idea, [~chenheng]. Thanks for the review. I will include 
it in the patch after some more testing.
[~stack], thanks for pushing 
[HBASE-13337|https://issues.apache.org/jira/browse/HBASE-13337] to the master branch.



[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-15 Thread Heng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745518#comment-14745518
 ] 

Heng Chen commented on HBASE-14431:
---

{quote}
Is it a better choice to override the hashCode method in AsyncRpcChannel?
I don't see that method. Can you elaborate?
{quote}

In class {{AsyncRpcChannel}}, we could override the {{hashCode}} method like this:

{code}
  @Override
  public int hashCode() {
    return getConnectionHashCode();
  }
{code}
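
A minimal sketch of how this could fit with the pool lookup, assuming the pool stays keyed by the connection hash code. {{ConnectionPoolSketch}}, {{Channel}}, {{Channel.of}} and {{addConnection}} are hypothetical stand-ins for illustration, not the real HBase classes, and the null check is an extra precaution added in the sketch rather than something taken from the patch:

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the client's connection pool; not the HBase code.
final class ConnectionPoolSketch {

  // Hypothetical stand-in for AsyncRpcChannel, identified only by its address.
  static final class Channel {
    final InetSocketAddress address;

    private Channel(InetSocketAddress address) {
      this.address = address;
    }

    // Uses an unresolved address so the sketch performs no DNS lookup.
    static Channel of(String host, int port) {
      return new Channel(InetSocketAddress.createUnresolved(host, port));
    }

    // Assumption: the connection hash code depends on the address only.
    int getConnectionHashCode() {
      return address.hashCode();
    }

    @Override
    public int hashCode() {
      return getConnectionHashCode();
    }

    // equals() is kept consistent with hashCode(): same address, same channel.
    @Override
    public boolean equals(Object other) {
      return other instanceof Channel && address.equals(((Channel) other).address);
    }
  }

  private final Map<Integer, Channel> connections = new HashMap<>();

  void addConnection(Channel connection) {
    synchronized (connections) {
      connections.put(connection.hashCode(), connection);
    }
  }

  // Returns true if the pooled entry for this connection was removed.
  boolean removeConnection(Channel connection) {
    int key = connection.hashCode();
    synchronized (connections) {
      Channel connectionInPool = connections.get(key);
      // equals() instead of ==, and a null check in case the key is absent.
      if (connectionInPool != null && connectionInPool.equals(connection)) {
        connections.remove(key);
        return true;
      }
      return false;
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }
}
```

With something like this, a second {{Channel}} object created for the same address still removes the pooled entry, which a {{==}} comparison would not do.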

Any concerns?

> AsyncRpcClient#removeConnection() never removes connection from connections 
> pool if server fails
> 
>
> Key: HBASE-14431
> URL: https://issues.apache.org/jira/browse/HBASE-14431
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 2.0.0, 1.0.2, 1.1.2
> Reporter: Samir Ahmic
> Assignee: Samir Ahmic
> Priority: Critical
>
> I was testing the master branch in distributed mode (3 rs + master + 
> backup_master) and noticed strange behavior while running this sequence 
> of events on a single rs: /kill/start/run_balancer while a client was writing 
> data to the cluster (LoadTestTool).
> I noticed that LTT fails with the following:
> {code}
> 2015-09-09 11:05:58,364 INFO  [main] client.AsyncProcess: #2, waiting for 
> some tasks to finish. Expected max=0, tasksInProgress=35
> Exception in thread "main" 
> org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 
> action: BindException: 1 time, 
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
>   at 
> org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1697)
>   at 
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:211)
> {code}
> After some digging and adding more logging to the code, I noticed that the 
> following condition in {{AsyncRpcClient.removeConnection(AsyncRpcChannel connection)}} 
> is never true:
> {code}
> if (connectionInPool == connection) {
> {code}
> As a result, the {{AsyncRpcChannel}} connection is never removed from the 
> {{connections}} pool when an rs fails.
> After changing this condition to:
> {code}
> if (connectionInPool.address.equals(connection.address)) {
> {code}
> the issue was resolved and the client removed the failed server from the 
> connections pool.
> I will attach a patch after running some more tests.





[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-15 Thread Samir Ahmic (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745379#comment-14745379
 ] 

Samir Ahmic commented on HBASE-14431:
-

bq. Is it a better choice to override the hashCode method in AsyncRpcChannel?

I don't see that method. Can you elaborate?

BTW, is there a reason why 
[HBASE-13337|https://issues.apache.org/jira/browse/HBASE-13337] is not 
committed to the master branch? Without it, any testing that includes restarting 
servers will run into issues, and only in the master branch is AsyncRpcClient the 
default client implementation.



[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745457#comment-14745457
 ] 

stack commented on HBASE-14431:
---

bq. BTW is there a reason why HBASE-13337 is not committed to master branch? 

None. Mistake on my part. Fixed. Thanks for noticing [~asamir]



[jira] [Commented] (HBASE-14431) AsyncRpcClient#removeConnection() never removes connection from connections pool if server fails

2015-09-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14743966#comment-14743966
 ] 

stack commented on HBASE-14431:
---

[~asamir] Nice debugging.
