[jira] [Commented] (HBASE-4835) ConcurrentModificationException out of ZKConfig.makeZKProps

2011-11-21 Thread Andrew Purtell (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154360#comment-13154360 ]

Andrew Purtell commented on HBASE-4835:
---

Or synchronize access to the Configuration object in makeZKProps
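
A minimal sketch of the synchronization idea, not the attached patch. It uses a plain `java.util.Hashtable` as a stand-in for the Hadoop `Configuration` (an assumption, chosen because the trace below fails inside `Hashtable$Enumerator`), and the class and method names are hypothetical. Holding the table's monitor during iteration only helps if every writer takes the same monitor, which `Hashtable.put` does:

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for ZKConfig.makeZKProps; the Hashtable plays the
// role of the shared Configuration object.
final class SynchronizedCopySketch {

    // Copies every entry whose key starts with `prefix` into a new Properties
    // object, stripping the prefix from the key.
    static Properties copyWithPrefix(Hashtable<String, String> conf, String prefix) {
        Properties out = new Properties();
        // Hashtable's mutators are synchronized on the table itself, so holding
        // the same monitor here keeps writers out for the whole iteration.
        synchronized (conf) {
            for (Map.Entry<String, String> e : conf.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    out.setProperty(e.getKey().substring(prefix.length()), e.getValue());
                }
            }
        }
        return out;
    }
}
```

The cost is that writers block for as long as the iteration runs, which is presumably acceptable for a configuration-sized map.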

 ConcurrentModificationException out of ZKConfig.makeZKProps
 ---

 Key: HBASE-4835
 URL: https://issues.apache.org/jira/browse/HBASE-4835
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Attachments: HBASE-4835.patch


 Mikhail reported this from a five-node, three-RS cluster test:
 {code}
 2011-11-21 01:30:15,188 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server machine_name,60020,1321867814890: Initialization of RS failed. Hence aborting RS.
 java.util.ConcurrentModificationException
     at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
     at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1042)
     at org.apache.hadoop.hbase.zookeeper.ZKConfig.makeZKProps(ZKConfig.java:75)
     at org.apache.hadoop.hbase.zookeeper.ZKConfig.getZKQuorumServersString(ZKConfig.java:245)
     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:144)
     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.init(ZooKeeperWatcher.java:124)
     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1262)
     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:568)
     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.init(HConnectionManager.java:559)
     at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:183)
     at org.apache.hadoop.hbase.catalog.CatalogTracker.init(CatalogTracker.java:177)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:575)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:534)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:642)
     at java.lang.Thread.run(Thread.java:619)
 {code}
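
The top frame of the trace is `Hashtable$Enumerator.next`, whose iterators are fail-fast. A minimal sketch of that failure mode follows, using a plain `java.util.Hashtable` rather than the Hadoop `Configuration` (an assumption made to keep it self-contained) and a hypothetical class name. It mutates from the iterating thread so the result is deterministic; in the reported case the mutation presumably came from another thread:

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class; not part of HBase or Hadoop.
final class FailFastSketch {

    // Returns true if adding a key while iterating the entry set makes the
    // iterator throw ConcurrentModificationException.
    static boolean mutationDuringIterationFails() {
        Hashtable<String, String> table = new Hashtable<>();
        table.put("a", "1");
        table.put("b", "2");
        table.put("c", "3");
        try {
            for (Map.Entry<String, String> e : table.entrySet()) {
                // A structural change bumps the table's modification count,
                // which the iterator checks on its next call to next().
                table.put("added-" + e.getKey(), "x");
            }
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }
}
```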

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4835) ConcurrentModificationException out of ZKConfig.makeZKProps

2011-11-21 Thread Mikhail Bautin (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154366#comment-13154366 ]

Mikhail Bautin commented on HBASE-4835:
---

@Andrew: thanks for the fix!
The simple approach with copying the configuration sounds good -- I presume we 
don't create too many unique ZooKeeperWatchers in a single JVM.

Alternatively, we could somehow get an immutable snapshot of the 
configuration's key set and iterate that instead of the configuration itself.
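
A rough sketch of the key-set snapshot alternative, again with a plain `java.util.Hashtable` standing in for the `Configuration` (an assumption) and hypothetical names throughout. The snapshot has to be taken atomically, since copying the key set is itself an iteration over the shared table:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Hashtable;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; the Hashtable plays the role of the shared Configuration.
final class KeySnapshotSketch {

    // Takes an unmodifiable copy of the current keys while holding the
    // table's monitor, so no writer can interleave with the copy.
    static List<String> snapshotKeys(Hashtable<String, String> conf) {
        synchronized (conf) {
            return Collections.unmodifiableList(new ArrayList<>(conf.keySet()));
        }
    }

    // Iterates the snapshot instead of the live table. A key may have been
    // removed since the snapshot was taken, so null values are skipped.
    static Properties copyWithPrefix(Hashtable<String, String> conf, String prefix) {
        Properties out = new Properties();
        for (String key : snapshotKeys(conf)) {
            if (!key.startsWith(prefix)) {
                continue;
            }
            String value = conf.get(key);
            if (value != null) {
                out.setProperty(key.substring(prefix.length()), value);
            }
        }
        return out;
    }
}
```

Compared with locking the whole loop, this holds the lock only for the key copy, at the price of a result that may mix values written before and after the snapshot.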





[jira] [Commented] (HBASE-4835) ConcurrentModificationException out of ZKConfig.makeZKProps

2011-11-21 Thread stack (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154790#comment-13154790 ]

stack commented on HBASE-4835:
--

There are a few per JVM:

{code}
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:this.zooKeeper = new ZooKeeperWatcher(conf, MASTER + ":" + isa.getPort(), this, true);
src/main/java/org/apache/hadoop/hbase/master/HMaster.java:this.zooKeeper = new ZooKeeperWatcher(conf, MASTER + ":"
src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:this.zooKeeper = new ZooKeeperWatcher(conf, REGIONSERVER + ":" +
src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java:new ZooKeeperWatcher(this.conf, "replicationLogCleaner", null);
src/main/java/org/apache/hadoop/hbase/replication/ReplicationPeer.java:zkw = new ZooKeeperWatcher(conf,
src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java:new ZooKeeperWatcher(conf, "catalogtracker-on-" + connection.toString(),
src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java:this.zooKeeper = new ZooKeeperWatcher(conf, "hconnection", this);
{code}

When you say:

bq. The concern I have about cloning in makeZKProps is the same hashmap 
iteration will happen for that.

You are afraid that the clone is not a deep clone, so we'll just be iterating 
the same map of configuration items?

That seems valid enough, though why won't we have the same issue if we make a 
clone in ZKW?

I'm wary of cloning Configuration. I did that when 'fixing' our issues with 
tests where we were sharing an HConnection, and I manufactured the issue where 
we had too many connections against the ZK ensemble.

Odd we see this now.






[jira] [Commented] (HBASE-4835) ConcurrentModificationException out of ZKConfig.makeZKProps

2011-11-21 Thread Ted Yu (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154879#comment-13154879 ]

Ted Yu commented on HBASE-4835:
---

How about cloning the Configuration in the ZKW ctor and assigning it to a local variable?
We would then pass this local variable to ZKConfig.getZKQuorumServersString().
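
A minimal sketch of the copy-in-constructor idea, with a plain `java.util.Hashtable` standing in for the `Configuration` and a hypothetical `WatcherSketch` class standing in for ZKW (both assumptions). The copy is made under the source table's monitor because `new Hashtable<>(map)` iterates the source, so an unguarded copy could hit the same exception:

```java
import java.util.Hashtable;

// Hypothetical stand-in for ZooKeeperWatcher; not the real class.
final class WatcherSketch {

    // Private copy taken once at construction; later changes to the shared
    // table are not visible here.
    private final Hashtable<String, String> localConf;

    WatcherSketch(Hashtable<String, String> sharedConf) {
        // The copy constructor walks sharedConf's entries, so writers are
        // locked out for the duration of the copy.
        synchronized (sharedConf) {
            this.localConf = new Hashtable<>(sharedConf);
        }
    }

    // Reads from the private copy, so no lock on the shared table is needed.
    String quorum() {
        return localConf.get("hbase.zookeeper.quorum");
    }
}
```

Everything after the constructor works on the private copy, which also means the watcher never sees later updates to the shared settings.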
