Turn off iptables and see if it resolves errors. You need to open all ports that hbase and zookeeper need to connect each node.
Alex. -----Original Message----- From: kaveh minooie <[email protected]> To: user <[email protected]> Sent: Fri, Feb 22, 2013 12:56 pm Subject: Re: issue with nutch-gora+hbase+zookeeper no here are the logs, master: 2013-02-22 12:03:21,136 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=10 regions=3 average=0.3 mostloaded=1 leastloaded=0 2013-02-22 12:03:21,141 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s) 2013-02-22 12:08:21,135 DEBUG org.apache.hadoop.hbase.master.LoadBalancer: Server information: d1r1n12.prod.plutoz.com,60020,1361563095874=1, d1r1n16.prod.plutoz.com,60020,1361563095878 =0, d1r1n17.prod.plutoz.com,60020,1361563095900=0, d1r1n13.prod.plutoz.com,60020,1361563095880=0, d1r1n14.prod.plutoz.com,60020,1361563095886=0, d1r1n10.prod.plutoz.com,60020,1361563095853=1, d1r1n20.prod.plutoz.com,60020,1361563095890=1, d1r1n15.prod.plutoz.com,60020,1361563095870=0, d1r1n21.prod.plutoz.com,60020,1361563095888=0, d1r1n19.prod.plutoz.com,60020,1361563095879=0 2013-02-22 12:08:21,135 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing. servers=10 regions=3 average=0.3 mostloaded=1 leastloaded=0 2013-02-22 12:08:21,142 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s) one of the regionservers: 2013-02-22 12:03:17,199 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=?%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=?%, evictions=0, evicted=0, evictedPerRun=NaN 2013-02-22 12:08:17,198 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=?%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=?%, evictions=0, evicted=0, evictedPerRun=NaN 2013-02-22 12:13:17,198 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=?%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=?%, evictions=0, evicted=0, evictedPerRun=NaN another regionserver: 2013-02-22 12:03:17,279 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=�%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN 2013-02-22 12:08:17,278 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=�%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN 2013-02-22 12:13:17,278 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=6.51 MB, free=786.84 MB, max=793.35 MB, blocks=0, accesses=0, hits=0, hitRatio=�%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN On 02/22/2013 12:49 PM, [email protected] wrote: > Hi, > > Do you see any errors in HMaster and Regionserver logs? > > Alex. > > > > > > > > -----Original Message----- > From: kaveh minooie <[email protected]> > To: user <[email protected]>; user <[email protected]>; user <[email protected]> > Sent: Fri, Feb 22, 2013 12:31 pm > Subject: issue with nutch-gora+hbase+zookeeper > > > Hi everyone > > I am having this problem for couple of days now and would appreciate any > idea or suggestion that any one might have. I am using nutch 2.x with > hbase. due to nutch requirement I need to use older version of hbase (I > am using 0.90.6 over hadoop 1.1.1 with 10 nodes with zookeepr 3.5.0 > [trunk] ) > > hbase seems to be running fine, thou I appreciate if someone can show me > how I can actually test it systematically. but I seem to be able to > create and read data from hbase, but when I run any nutch command > something very similar to this happens as soon as the job starts > running. this is for example, the output of nutch inject > > 13/02/22 12:07:30 INFO mapred.JobClient: map 0% reduce 0% > 13/02/22 12:07:52 INFO mapred.JobClient: Task Id : > attempt_201302191325_0013_m_000000_0, Status : FAILED > org.apache.gora.util.GoraException: > org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to > connect to ZooKeeper but the connection closes immediately. This could > be a sign that the server has too many connections (30 is the default). > Consider inspecting your ZK server logs for that error and then make > sure you are reusing HBaseConfiguration as often as you can. See > HTable's javadoc for more information. > at > org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167) > at > org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:118) > at > org.apache.gora.mapreduce.GoraOutputFormat.getRecordWriter(GoraOutputFormat.java:88) > at > org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:628) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:753) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Unknown Source) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase > is able to connect to ZooKeeper but the connection closes immediately. > This could be a sign that the server has too many connections (30 is the > default). Consider inspecting your ZK server logs for that error and > then make sure you are reusing HBaseConfiguration as often as you can. > See HTable's javadoc for more information. > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:156) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1265) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:526) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:516) > at > org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:173) > at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:93) > at > org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:108) > at > org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102) > at > org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161) > ... 10 more > Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: > KeeperErrorCode = ConnectionLoss for /hbase > at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1237) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1265) > at > org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:931) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:134) > ... 18 more > > > > Now I know that I am not running out of connection. for one thing I have > increased the number of connection to 200 in zoo.cfg, and also here is > what is in the zookeeper log file around that time: > > 2013-02-22 12:07:27,704 [myid:] - INFO > [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@289] > > - Accepted socket connection from /127.0.0.1:55073 > 2013-02-22 12:07:27,707 [myid:] - INFO > [NIOWorkerThread-3:ZooKeeperServer@810] - Client attempting to establish > new session at /127.0.0.1:55073 > 2013-02-22 12:07:27,720 [myid:] - INFO > [SyncThread:0:ZooKeeperServer@566] - Established session > 0x13d037b8e6b0016 with negotiated timeout 40000 for client /127.0.0.1:55073 > 2013-02-22 12:07:27,945 [myid:] - INFO > [NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@289] > > - Accepted socket connection from /127.0.0.1:55075 > 2013-02-22 12:07:27,946 [myid:] - INFO > [NIOWorkerThread-2:ZooKeeperServer@810] - Client attempting to establish > new session at /127.0.0.1:55075 > 2013-02-22 12:07:27,953 [myid:] - INFO > [SyncThread:0:ZooKeeperServer@566] - Established session > 0x13d037b8e6b0017 with negotiated timeout 40000 for client /127.0.0.1:55075 > 2013-02-22 12:07:28,010 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@533] - Processed session termination for > sessionid: 0x13d037b8e6b0017 > 2013-02-22 12:07:28,011 [myid:] - INFO > [NIOWorkerThread-6:NIOServerCnxn@1000] - Closed socket connection for > client /127.0.0.1:55075 which had sessionid 0x13d037b8e6b0017 > 2013-02-22 12:08:14,005 [myid:] - WARN > [NIOWorkerThread-7:NIOServerCnxn@362] - Unable to read additional data > from client sessionid 0x13d037b8e6b0016, likely client has closed socket > 2013-02-22 12:08:14,005 [myid:] - INFO > [NIOWorkerThread-7:NIOServerCnxn@1000] - Closed socket connection for > client /127.0.0.1:55073 which had sessionid 0x13d037b8e6b0016 > 2013-02-22 12:08:48,000 [myid:] - INFO > [SessionTracker:ZooKeeperServer@304] - Expiring session > 0x13d037b8e6b0016, timeout of 40000ms exceeded > 2013-02-22 12:08:48,001 [myid:] - INFO [ProcessThread(sid:0 > cport:-1)::PrepRequestProcessor@533] - Processed session termination for > sessionid: 0x13d037b8e6b0016 > > > I also don't think that it is a heartbeat or GC related issue since > there is really no load at all on these servers right now. I know this > is a hybrid problem involving three separate product (nutch, hbase, > zookeeper) but I am asking this in all the mailing list. Also I am gonna > say it agian in order to avoid confusion with the similar problems in > older versions that are supposedly solved. I am using hbase 0.90.6 and > zookeeper 3.5.0 (commit 46b565e6) with nutch 2.x (commit f02dcf625 ) > both are either the latest or very recent updates. > > if anyone has any idea what is happening here I very much like to hear > that. > > thanks, > > -- Kaveh Minooie
