What's the value of 'zookeeper.session.timeout'? Maybe you can tune it higher.
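[For reference, a sketch of what raising that timeout looks like, assuming a CDH3-era HBase where the client and servers read it from hbase-site.xml; the 120000 below is only an example value, not a recommendation:]

```xml
<!-- hbase-site.xml, on clients and servers; the logs below show the
     then-current value of 60000 ms (sessionTimeout=60000) -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
```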
On Sat, Jan 22, 2011 at 3:13 AM, Wojciech Langiewicz <[email protected]> wrote:
> Hi,
> I have re-run the test with 2 or more mappers and looked into the logs
> more closely: the MapReduce job finished correctly; some map attempts are
> killed, but eventually all mappers finish. There is no pattern to which
> mappers on which servers fail (I re-ran the tests multiple times).
>
> So I suspect that this is not a classpath problem; maybe there are
> settings that limit the number of connections, because before
> MasterNotRunning I get this in the logs:
>
> 2011-01-22 12:06:43,411 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=hd-master:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper@3b5e234c
> 2011-01-22 12:06:43,483 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hd-master/10.6.75.212:2181
> 2011-01-22 12:06:43,484 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to hd-master/10.6.75.212:2181, initiating session
> 2011-01-22 12:06:43,488 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
> 2011-01-22 12:06:43,605 INFO org.apache.hadoop.hbase.client.HConnectionManager$TableServers: getMaster attempt 0 of 10 failed; retrying after sleep of 1000
> java.io.IOException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:481)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readMasterAddressOrThrow(ZooKeeperWrapper.java:377)
>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getMaster(HConnectionManager.java:381)
>         at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:78)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testSetup(PerformanceEvaluation.java:745)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:764)
>         at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1097)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:446)
>         at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:399)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>         at org.apache.hadoop.mapred.Child.main(Child.java:211)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>         at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.readAddressOrThrow(ZooKeeperWrapper.java:477)
>         ... 16 more
>
> On 22.01.2011 01:36, Stack wrote:
>> Then it's odd that PE fails. Can you figure out the difference between
>> the two environments? Perhaps your MR jobs are fat jars that include the
>> conf and all dependencies, whereas PE is dumb and expects the
>> dependencies and conf on the CLASSPATH?
>>
>> St.Ack
>>
>> On Fri, Jan 21, 2011 at 11:56 AM, Wojciech Langiewicz <[email protected]> wrote:
>>> Hi,
>>> I have other MapReduce jobs running on this cluster that use HBase, and
>>> they work correctly. All my servers have the same configuration.
>>>
>>> --
>>> Wojciech Langiewicz
>>>
>>> On 21.01.2011 19:20, Stack wrote:
>>>> When clients are > 1, PE runs a MapReduce job to host the loading
>>>> clients.
>>>>
>>>> Is it possible that the client out in the MR task is trying to connect
>>>> to the wrong location? Perhaps the HBase conf dir is not available to
>>>> the running task? Have you seen
>>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#package_description
>>>> ? Perhaps this will help?
>>>>
>>>> St.Ack
>>>>
>>>> On Fri, Jan 21, 2011 at 8:12 AM, Wojciech Langiewicz <[email protected]> wrote:
>>>>> Hello,
>>>>> I have a problem running the HBase performance tests from the
>>>>> org.apache.hadoop.hbase.PerformanceEvaluation class. I'm using the
>>>>> version from CDH3. The tests are fine when the nclients argument is 1,
>>>>> but with a greater number, after the mappers reach 100% I get the
>>>>> exception below (I didn't try every test, but all of the ones I tried
>>>>> failed; 'scan' and 'randomRead' fail for sure):
>>>>>
>>>>> 11/01/21 17:05:19 INFO mapred.JobClient: Task Id : attempt_201101211442_0005_m_000000_0, Status : FAILED
>>>>> org.apache.hadoop.hbase.MasterNotRunningException
>>>>>         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getMaster(HConnectionManager.java:416)
>>>>>         at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:78)
>>>>>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.testSetup(PerformanceEvaluation.java:745)
>>>>>         at org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:764)
>>>>>         at org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1097)
>>>>>         at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:446)
>>>>>         at org.apache.hadoop.hbase.PerformanceEvaluation$EvaluationMapTask.map(PerformanceEvaluation.java:399)
>>>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
>>>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>>>>>         at org.apache.hadoop.mapred.Child.main(Child.java:211)
>>>>>
>>>>> Do you have any ideas how to solve this?
>>>>>
>>>>> --
>>>>> Wojciech Langiewicz
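[On the "settings that limit number of connections" suspicion: one setting that can produce exactly this symptom (socket accepted, then immediately closed by the server, KeeperErrorCode = ConnectionLoss) is ZooKeeper's per-host client connection cap, which had a low default in this era. Many map tasks on one node, each opening its own ZooKeeper session, can exceed it. A sketch, assuming HBase manages the ZooKeeper quorum so the property can be set via the hbase.zookeeper.property.* passthrough; for a standalone ZooKeeper the equivalent is maxClientCnxns in zoo.cfg:]

```xml
<!-- hbase-site.xml on the nodes running the HBase-managed ZooKeeper;
     raises the per-client-host connection cap. 300 is an example value. -->
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>300</value>
</property>
```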

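[On Stack's CLASSPATH point: for the PE map tasks to find the cluster's hbase-site.xml, the HBase conf directory has to be on the task classpath (the package description linked above covers this). A minimal sketch of one way to launch, using hypothetical install paths; the jar name and locations are assumptions to adjust for your installation:]

```shell
# Hypothetical paths -- adjust to your installation.
HBASE_HOME=${HBASE_HOME:-/usr/lib/hbase}

# Prepend the HBase conf dir (and client jar) so tasks pick up
# hbase-site.xml instead of falling back to a localhost quorum,
# which surfaces as MasterNotRunningException on the workers.
export HADOOP_CLASSPATH="$HBASE_HOME/conf:$HBASE_HOME/hbase.jar${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
echo "$HADOOP_CLASSPATH"
```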