Awesome! That works. Thank you, Avery.

Vishal
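For anyone who finds this thread with the same error, the first option from Avery's quoted reply below can be sketched roughly as follows. Only the -Dgiraph.useSuperstepCounters=false flag comes from this thread. The jar path, main class, and remaining arguments are placeholders, and the sketch assumes the job is launched through a Tool-based runner so that Hadoop parses generic -D options.

```shell
# Hypothetical placeholders: point these at your own build and job.
GIRAPH_JAR=/path/to/your-giraph-job.jar   # assumption, not from this thread
MAIN_CLASS=your.main.ClassName            # assumption, not from this thread

# Generic -D options go right after the class name, before the job's own arguments.
hadoop jar "$GIRAPH_JAR" "$MAIN_CLASS" \
    -Dgiraph.useSuperstepCounters=false \
    "$@"
```

The second option, raising the counter limit, is a Hadoop-side setting rather than a Giraph one. The property name depends on the Hadoop version (commonly mapreduce.job.counters.limit on 1.x, typically set in mapred-site.xml), so check the documentation for your distribution before relying on it.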
On Tue, Aug 7, 2012 at 6:39 PM, Avery Ching wrote:
> Hi Vishal,
>
> This issue is due to Hadoop limiting the job counters.
>
> org.apache.hadoop.mapred.Counters$CountersExceededException: Error:
> Exceeded limits on number of counters - Counters=120 Limit=120
>
> You can either disable the Giraph counters per superstep
> (-Dgiraph.useSuperstepCounters=false), or you can try to increase the
> counter limit.
>
> Hope that helps,
>
> Avery
>
>
> On 8/7/12 6:05 PM, Vishal Patel wrote:
>
> Hi,
>
> Using a pseudo-distributed Hadoop install, I am able to run the connected
> components example perfectly. However, I keep getting the following error
> on a real cluster. After several attempts on the in-house Hadoop install, I
> decided to move to Amazon EC2's MapReduce.
>
> I created a small 2-node cluster with:
>
> Namenode: m1.small
> Slaves (1): m1.2xlarge (I tried m1.large, but it complained about the
> heap size)
>
> I downloaded Giraph, compiled it with Maven 3.0+, and ran the Connected
> Components example. I keep getting the same error I saw previously. Am I
> missing something? Do I have to configure ZooKeeper?
>
> *Mapper 0 returns the following:*
> java.lang.Throwable: Child Error
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> *Mapper 1 and 2 return the following error:*
>
> java.lang.IllegalStateException: startSuperstep: KeeperException getting assignments
>     at org.apache.giraph.graph.BspServiceWorker.startSuperstep(BspServiceWorker.java:929)
>     at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:550)
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:681)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:375)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201208080037_0001/_applicationAttemptsDir/0/_superstepDir/99/_partitionAssignments
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
>     at org.apache.giraph.graph.BspServiceWorker.startSuperstep(BspServiceWorker.java:910)
>     ... 9 more
>
> I also checked the job configuration to make sure that
> mapred.tasktracker.map.tasks = 4.
>
> *Also, the last couple of lines from attempt_201208080037_0002_m_000000_0:*
>
> 2012-08-08 00:59:03,262 INFO org.apache.giraph.graph.MasterThread (org.apache.giraph.graph.MasterThread): masterThread: Coordination of superstep 95 took 0.116 seconds ended
> 2012-08-08 00:59:03,265 INFO org.apache.giraph.graph.partition.PartitionBalancer (org.apache.giraph.graph.MasterThread): balancePartitionsAcrossWorkers: Using algorithm stati
> 2012-08-08 00:59:03,265 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Vertices - Mean: 10000, Min: Work
> 2012-08-08 00:59:03,265 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Edges - Mean: 19996, Min: Worker(
> 2012-08-08 00:59:03,269 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): barrierOnWorkerList: 0 out of 1 workers finished on superstep 96
> 2012-08-08 00:59:03,279 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): aggregateWorkerStats: Aggregation found (vtx=10000,finVtx=10000,
> 2012-08-08 00:59:03,280 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_2
> 2012-08-08 00:59:03,321 INFO org.apache.giraph.graph.MasterThread (org.apache.giraph.graph.MasterThread): masterThread: Coordination of superstep 96 took 0.059 seconds ended
> 2012-08-08 00:59:03,387 INFO org.apache.giraph.graph.partition.PartitionBalancer (org.apache.giraph.graph.MasterThread): balancePartitionsAcrossWorkers: Using algorithm stati
> 2012-08-08 00:59:03,387 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Vertices - Mean: 10000, Min: Work
> 2012-08-08 00:59:03,387 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Edges - Mean: 19996, Min: Worker(
> 2012-08-08 00:59:03,398 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): barrierOnWorkerList: 0 out of 1 workers finished on superstep 97
> 2012-08-08 00:59:03,411 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): aggregateWorkerStats: Aggregation found (vtx=10000,finVtx=10000,
> 2012-08-08 00:59:03,426 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_2
> 2012-08-08 00:59:03,451 INFO org.apache.giraph.graph.MasterThread (org.apache.giraph.graph.MasterThread): masterThread: Coordination of superstep 97 took 0.13 seconds ended w
> 2012-08-08 00:59:03,454 INFO org.apache.giraph.graph.partition.PartitionBalancer (org.apache.giraph.graph.MasterThread): balancePartitionsAcrossWorkers: Using algorithm stati
> 2012-08-08 00:59:03,454 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Vertices - Mean: 10000, Min: Work
> 2012-08-08 00:59:03,454 INFO org.apache.giraph.graph.partition.PartitionUtils (org.apache.giraph.graph.MasterThread): analyzePartitionStats: Edges - Mean: 19996, Min: Worker(
> 2012-08-08 00:59:03,458 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): barrierOnWorkerList: 0 out of 1 workers finished on superstep 98
> 2012-08-08 00:59:03,536 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): aggregateWorkerStats: Aggregation found (vtx=10000,finVtx=10000,
> 2012-08-08 00:59:03,539 INFO org.apache.giraph.graph.BspServiceMaster (org.apache.giraph.graph.MasterThread): coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_2
> 2012-08-08 00:59:03,579 INFO org.apache.giraph.graph.MasterThread (org.apache.giraph.graph.MasterThread): masterThread: Coordination of superstep 98 took 0.128 seconds ended
> 2012-08-08 00:59:03,579 FATAL org.apache.giraph.graph.GraphMapper (org.apache.giraph.graph.MasterThread): uncaughtException: OverrideExceptionHandler on thread org.apache.gir
> org.apache.hadoop.mapred.Counters$CountersExceededException: Error: Exceeded limits on number of counters - Counters=120 Limit=120
>     at org.apache.hadoop.mapred.Counters$Group.getCounterForName(Counters.java:316)
>     at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:450)
>     at org.apache.hadoop.mapred.Task$TaskReporter.getCounter(Task.java:601)
>     at org.apache.hadoop.mapred.Task$TaskReporter.getCounter(Task.java:541)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.getCounter(TaskInputOutputContext.java:88)
>     at org.apache.giraph.graph.MasterThread.run(MasterThread.java:131)
> 2012-08-08 00:59:03,580 WARN org.apache.giraph.zk.ZooKeeperManager (Thread-14): onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process
>
> Vishal
