[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711540#comment-13711540
 ] 

Hudson commented on HBASE-8535:
---

FAILURE: Integrated in HBase-0.94-security #213 (See 
[https://builds.apache.org/job/HBase-0.94-security/213/])
HBASE-8969 Backport HBASE-8535+HBASE-8586 TestHCM#testDeleteForZKConnLeak 
enhancement to 0.94 (jxiang: rev 1504215)
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java


 Test for zk leak does not account for unsynchronized access to zk watcher
 -

 Key: HBASE-8535
 URL: https://issues.apache.org/jira/browse/HBASE-8535
 Project: HBase
  Issue Type: Test
  Components: Client
Affects Versions: 0.98.0, 0.95.1
Reporter: Eric Yu
Assignee: stack
 Fix For: 0.98.0, 0.95.1

 Attachments: 8535.addendum.txt, 8535.addendumv2.txt, HBASE-8535.patch


 Test can detect a live zk connection in a closed hconnection because it does 
 not access the zk watcher in a synchronized manner. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-07-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711602#comment-13711602
 ] 

Hudson commented on HBASE-8535:
---

SUCCESS: Integrated in HBase-0.94 #1059 (See 
[https://builds.apache.org/job/HBase-0.94/1059/])
HBASE-8969 Backport HBASE-8535+HBASE-8586 TestHCM#testDeleteForZKConnLeak 
enhancement to 0.94 (jxiang: rev 1504215)
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659266#comment-13659266
 ] 

Hudson commented on HBASE-8535:
---

Integrated in hbase-0.95-on-hadoop2 #101 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/101/])
HBASE-8535 Test for zk leak does not account for unsynchronized access to 
zk watcher (Revision 1482904)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658324#comment-13658324
 ] 

Eric Yu commented on HBASE-8535:


@stack Yes, this change is meant to correct a flaw in 
testDeleteForZKConnLeak that makes it fail occasionally on versions 0.95 and 
0.98. On internalClose, the HConnection implementation marks itself as closed 
and then closes the watcher, so the test fails if it checks after the 
connection is marked as closed but before the watcher is closed. The test 
doesn't have access to the object HConnection uses internally to synchronize, 
so I added a delay followed by a second check of whether the watcher is 
closed, applied only if the watcher was still open on the first check. I 
didn't delay every time because I didn't want to slow down the test too much.
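
A minimal sketch of the "check, and only if still open, wait and check again" 
idea described in this comment. The class and method names are hypothetical 
and are not taken from the attached patch or from HBase; the watcher state is 
modelled as a plain BooleanSupplier.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of HBase or the attached patch. */
final class DelayedRecheck {
    private DelayedRecheck() {}

    /**
     * Returns true if the resource reports closed on the first check, or on a
     * single re-check after delayMillis. The delay is only paid when the first
     * check fails, so the common case stays fast.
     */
    static boolean isClosedEventually(BooleanSupplier isClosed, long delayMillis) {
        if (isClosed.getAsBoolean()) {
            return true; // already closed, no delay needed
        }
        try {
            // Give the thread that is still closing the resource time to finish.
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return isClosed.getAsBoolean();
    }
}
```

A fixed delay narrows the race window but cannot remove it; it is a pragmatic 
choice when the test cannot reach the lock the connection uses internally.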



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658565#comment-13658565
 ] 

Hudson commented on HBASE-8535:
---

Integrated in HBase-TRUNK #4121 (See 
[https://builds.apache.org/job/HBase-TRUNK/4121/])
HBASE-8535 Test for zk leak does not account for unsynchronized access to 
zk watcher (Revision 1482905)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658578#comment-13658578
 ] 

stack commented on HBASE-8535:
--

[~ericyu] I see that TestHCM just hung on the trunk build 
https://builds.apache.org/job/HBase-TRUNK/4121/

Do you think it is related at all, going by the output:

https://builds.apache.org/job/HBase-TRUNK/4121/artifact/trunk/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.client.TestHCM-output.txt

I think I'll dig in anyways but asking just in case you have insight...




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658593#comment-13658593
 ] 

Eric Yu commented on HBASE-8535:


[~saint@gmail.com] Does appear to be related as that test is the one that 
uses the method that generates this log 
2013-05-15 16:08:55,270 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster

I will take a look as well.



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658612#comment-13658612
 ] 

stack commented on HBASE-8535:
--

Hmm... the hang seems to originate in the test this patch fixes.  Search for 
'before: client.TestHCM#testDeleteForZKConnLeak' in the output above.

The test falls into a loop about here:

{code}
2013-05-15 16:07:44,351 WARN  [pool-1-thread-1] 
zookeeper.RecoverableZooKeeper(237): Possibly transient ZooKeeper, 
quorum=localhost:54737, 
exception=org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
2013-05-15 16:07:44,351 INFO  [pool-1-thread-1] util.RetryCounter(54): Sleeping 
2000ms before retry #1...
2013-05-15 16:07:44,351 DEBUG [pool-1-thread-1-EventThread] 
zookeeper.ZooKeeperWatcher(377): hconnection-0x10ea988-0x13ea8f23825000e 
connected
2013-05-15 16:07:44,352 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,354 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,355 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,356 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,359 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,360 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,362 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,364 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,365 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,366 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,366 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,367 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,368 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,369 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster
2013-05-15 16:07:44,370 INFO  [test-hcm-delete-pool-267-thread-1] 
client.HConnectionManager$HConnectionImplementation(609): ClusterId is 
default-cluster

{code}

If you want to punt to me that is fine Eric, just say, and I'll open a new 
issue to dig in on it (minimally we should never spew the same INFO log every 
millisecond as we do above).




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658636#comment-13658636
 ] 

Eric Yu commented on HBASE-8535:


testDeleteForZKConnLeak uses an hconnection to create an HTable while another 
thread calls deleteStaleConnection, because previously, if an hconnection was 
closed at the wrong time while the HTable constructor was using it, the 
constructor could restart the zk connection on a closed hconnection. I 
couldn't figure out how to reproduce the problem precisely, so the test just 
calls both many times in two separate threads and hopes that if a leak exists, 
it will happen during the test. However, when the conn gets closed while being 
used by the HTable constructor, there is an exception followed by a sleep and 
a retry, three times (2s, 4s, 8s?), somewhere in the constructor, which makes 
the test quite slow. I will see if I can set a connection property to avoid 
this retry.
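
To make the cost of those retries concrete, here is a small sketch that adds 
up a doubling retry schedule. The doubling (2s, 4s, 8s) is only the guess 
voiced in this comment, not confirmed HBase behavior, and the class and method 
names are hypothetical.

```java
/** Hypothetical illustration of a doubling retry schedule; not HBase code. */
final class BackoffSketch {
    private BackoffSketch() {}

    /** Delay before retry i (0-based) is baseMillis * 2^i. */
    static long[] delaysMillis(long baseMillis, int retries) {
        long[] delays = new long[retries];
        for (int i = 0; i < retries; i++) {
            delays[i] = baseMillis << i; // doubles on every retry
        }
        return delays;
    }

    /** Total time spent sleeping if every retry is used up. */
    static long totalMillis(long baseMillis, int retries) {
        long total = 0;
        for (long delay : delaysMillis(baseMillis, retries)) {
            total += delay;
        }
        return total;
    }
}
```

Under that assumption, a base of 2000 ms and three retries adds 14000 ms of 
sleeping to every constructor call that hits a closed connection, which would 
add up quickly in a test that loops many times.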



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658656#comment-13658656
 ] 

Eric Yu commented on HBASE-8535:


I'm not sure why, but the exception doesn't occur as frequently when I run the 
test locally. For some reason, it occurs much more frequently on CI 
(util.RetryCounter logs much more often). Anyway, I will see if modifying the 
RecoverableZooKeeper retry count and retry wait (zookeeper.recovery.retry, 
zookeeper.recovery.retry.intervalmill) in the config used by the test helps.



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658660#comment-13658660
 ] 

stack commented on HBASE-8535:
--

[~ericyu] Does the test make sense?  Deleting a 'stale' connection though it is 
in use?  And why no protection against a stale delete coming in on a connection 
in use?  I was going to look at this...



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658662#comment-13658662
 ] 

stack commented on HBASE-8535:
--

[~ericyu] nvm  The test is yours IIRC.  If you want, I can take this one np.  I 
should fix that spew at least.



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658682#comment-13658682
 ] 

stack commented on HBASE-8535:
--

[~ericyu] It fails for me the odd time locally; taking a look see too... 



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658693#comment-13658693
 ] 

stack commented on HBASE-8535:
--

[~ericyu] hmm... so it is just slow -- especially if connection gets trounced 
at a particularly bad time.  Maybe we don't have to cycle 50 times.  And we 
could remove the dumb log line:

{code}
diff --git a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
index 763b79f..6455c07 100644
--- a/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
+++ b/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
@@ -604,9 +604,8 @@ public class HConnectionManager {
       this.clusterId = this.registry.getClusterId();
       if (clusterId == null) {
         clusterId = HConstants.CLUSTER_ID_DEFAULT;
+        LOG.debug("clusterid came back null, using default " + clusterId);
       }
-
-      LOG.info("ClusterId is " + clusterId);
     }
{code} 

... to get rid of the spew.





[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658709#comment-13658709
 ] 

Eric Yu commented on HBASE-8535:


Also need to set hbase.rpc.timeout for this test to get it to finish in a 
decent amount of time. [~saint@gmail.com] I assumed a stale connection was 
one that had old values and anything using it would expect to see exceptions 
anyway. If not, it probably wouldn't be too big a change to just remove the 
stale hconnection from the hcm map and wait until every client has released 
it before closing the hconnection.
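
The alternative floated at the end of this comment (stop handing out the stale 
connection, but only close it once the last client lets go) could look roughly 
like the sketch below. It is an illustration under stated assumptions, not 
HBase code: the class, its methods, and the boolean "closed" flag standing in 
for real resource cleanup are all hypothetical.

```java
/** Hypothetical reference-counted connection with deferred close; not HBase code. */
final class DeferredCloseConnection {
    private int refCount;    // guarded by "this"
    private boolean stale;   // guarded by "this"
    private boolean closed;  // guarded by "this"; stands in for freeing zk, sockets, etc.

    /** Hands the connection to a new client unless it is stale or closed. */
    synchronized boolean acquire() {
        if (stale || closed) {
            return false; // behaves as if removed from the manager's map
        }
        refCount++;
        return true;
    }

    /** Called by a client when it is done; the last release of a stale connection closes it. */
    synchronized void release() {
        if (refCount == 0) {
            throw new IllegalStateException("release without matching acquire");
        }
        refCount--;
        if (stale && refCount == 0) {
            closed = true;
        }
    }

    /** Marks the connection stale; closes immediately only if nobody holds it. */
    synchronized void markStale() {
        stale = true;
        if (refCount == 0) {
            closed = true;
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

The trade-off is that a misbehaving client that never releases keeps the 
resources alive, which is presumably why a forced cleanup path exists at all.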



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658745#comment-13658745
 ] 

stack commented on HBASE-8535:
--

[~ericyu] 'stale' is a judgement made on a Connection by an HConnectionManager 
when close is called: the connection is being aborted/abandoned, so all bets 
are off. Crash it down and clean up all resources, overriding all the usual 
cleanup mechanisms, e.g. force cleanup of the zk while ignoring the reference 
counting the connection does so that connections are reused rather than 
established every time.  Would NOT calling deleteStaleConnection be a better 
test, instead just calling close on the connection in your background thread?  
Would it repro what you were seeing from your app; i.e. if all connections 
are closed, then there should be no outstanding zks?  Do you need an api to 
see if zk is still up?



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658780#comment-13658780
 ] 

Eric Yu commented on HBASE-8535:


[~saint@gmail.com] Calling just close doesn't cause a zk leak in the previous 
version, as hcm would only wipe out the connection if it isn't allocated to 
any clients. The problem only happened when the connection was wiped out while 
other clients were still using it (i.e. the connection user would start 
creating an HTable and verify the connection is open, the closeStale caller 
would mark the connection as closed and shut down zk, and then the connection 
user thread would call getZKWatcher, which would restart zk). In the 
overwhelming majority of cases our application calls close, but sometimes, if 
the app thinks the connection is no longer usable, it will call closeStale.

I don't think it is necessary to add to the api just for a test case. Ideally, 
client of connection doesn't need to know anything about zk.
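
The interleaving described above is a classic check-then-act race: "is the 
connection open?" and "give me the watcher" happen as two separate steps, and 
a close can slip in between them. One common way to rule that out, sketched 
here with hypothetical names and a plain Object standing in for the ZooKeeper 
watcher, is to do the closed check and the lazy creation under the same lock 
that close takes. This is not a description of how HBase actually fixed it.

```java
/** Hypothetical holder showing an atomic "check closed, then create" step; not HBase code. */
final class WatcherHolder {
    private final Object lock = new Object();
    private boolean closed;   // guarded by lock
    private Object watcher;   // guarded by lock; stands in for the zk watcher

    /** Returns the watcher, creating it lazily, but never after close(). */
    Object getWatcher() {
        synchronized (lock) {
            if (closed) {
                throw new IllegalStateException("connection is closed");
            }
            if (watcher == null) {
                watcher = new Object(); // real code would open the zk connection here
            }
            return watcher;
        }
    }

    /** Marks the holder closed and drops the watcher in one atomic step. */
    void close() {
        synchronized (lock) {
            closed = true;
            watcher = null; // real code would close the zk connection here
        }
    }

    boolean hasLiveWatcher() {
        synchronized (lock) {
            return watcher != null;
        }
    }
}
```

Because both methods share one lock, a caller either gets a watcher before the 
close or an exception after it; a watcher can no longer be recreated on a 
closed holder.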






[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Eric Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658791#comment-13658791
 ] 

Eric Yu commented on HBASE-8535:


[~saint@gmail.com] I have values for config properties that I believe will 
make the test run faster when exceptions occur. Should I submit a new version 
of the patch for this jira, or did you want a separate jira that also includes 
the change of the log line from info to debug level?

{noformat}
Index: hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
===================================================================
--- hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java  (revision 1483058)
+++ hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java  (working copy)
@@ -809,7 +814,10 @@
   @Test
   public void testDeleteForZKConnLeak() throws Exception {
     TEST_UTIL.createTable(TABLE_NAME4, FAM_NAM);
-    final Configuration config = TEST_UTIL.getConfiguration();
+    final Configuration config = HBaseConfiguration.create(TEST_UTIL.getConfiguration());
+    config.setInt("zookeeper.recovery.retry", 1);
+    config.setInt("zookeeper.recovery.retry.intervalmill", 1000);
+    config.setInt("hbase.rpc.timeout", 2000);
 
     ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 10,
       5, TimeUnit.SECONDS,
{noformat}



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658958#comment-13658958
 ] 

stack commented on HBASE-8535:
--

I committed the addendum.  Let's see if it cures the hanging.  Will do another 
issue if I see it hang again.  Thanks for the help [~ericyu]



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658962#comment-13658962
 ] 

Hudson commented on HBASE-8535:
---

Integrated in hbase-0.95 #196 (See 
[https://builds.apache.org/job/hbase-0.95/196/])
HBASE-8535 Test for zk leak does not account for unsynchronized access to 
zk watcher (Revision 1482904)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659232#comment-13659232
 ] 

Hudson commented on HBASE-8535:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #532 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/532/])
HBASE-8535 Test for zk leak does not account for unsynchronized access to 
zk watcher (Revision 1482905)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java




[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657714#comment-13657714
 ] 

stack commented on HBASE-8535:
--

[~ericyu] Patch looks good but I am not too clear on what it does -- seems like 
it moves the close out a paren...  Can you say what changed?  Thanks.

 Test for zk leak does not account for unsynchronized access to zk watcher
 -

 Key: HBASE-8535
 URL: https://issues.apache.org/jira/browse/HBASE-8535
 Project: HBase
  Issue Type: Test
  Components: Client
Affects Versions: 0.98.0, 0.95.1
Reporter: Eric Yu
 Fix For: 0.98.0, 0.95.1

 Attachments: HBASE-8535.patch


 Test can detect a live zk connection in a closed hconnection because it does 
 not access the zk watcher in a synchronized manner. 



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658074#comment-13658074
 ] 

stack commented on HBASE-8535:
--

Is the patch here meant to fix this kind of failure, [~ericyu]? 
https://builds.apache.org/job/HBase-TRUNK/4118/testReport/org.apache.hadoop.hbase.client/TestHCM/testDeleteForZKConnLeak/

Thanks.



[jira] [Commented] (HBASE-8535) Test for zk leak does not account for unsynchronized access to zk watcher

2013-05-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13656082#comment-13656082
 ] 

Hadoop QA commented on HBASE-8535:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12582938/HBASE-8535.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5652//console

This message is automatically generated.
