[ https://issues.apache.org/jira/browse/GIRAPH-169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240216#comment-13240216 ]
Jianfeng Qian commented on GIRAPH-169: -------------------------------------- The master mapper's log 2012-03-28 10:27:16,286 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2012-03-28 10:27:16,551 WARN org.apache.giraph.bsp.BspOutputFormat: getOutputCommitter: Returning ImmutableOutputCommiter (does nothing). 2012-03-28 10:27:16,562 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0 2012-03-28 10:27:16,570 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4bf54c5f 2012-03-28 10:27:16,662 INFO org.apache.giraph.graph.GraphMapper: Distributed cache is empty. Assuming fatjar. 2012-03-28 10:27:16,662 INFO org.apache.giraph.graph.GraphMapper: setup: classpath @ /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar 2012-03-28 10:27:16,678 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201203281017_0001 2012-03-28 10:27:16,681 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/tmm-e10 0 2012-03-28 10:27:17,491 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [tmm-e10] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0 2012-03-28 10:27:17,491 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201203281017_0001/zkServerList_tmm-e10 0 ' 2012-03-28 10:27:17,509 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_tmm-e10 0 ' (polling period is 3000) 2012-03-28 10:27:17,509 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [tmm-e10, 0] 2 hosts in filename 'zkServerList_tmm-e10 0 ' 2012-03-28 10:27:17,510 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper 2012-03-28 10:27:17,512 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper/zoo.cfg in /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper with base port 22181 2012-03-28 10:27:17,513 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true 2012-03-28 10:27:17,513 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false 2012-03-28 10:27:17,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/local/java/jdk1.6.0_22/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper/zoo.cfg] in directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper 2012-03-28 10:27:17,522 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to tmm-e10:22181 with poll msecs = 3000 2012-03-28 10:27:17,532 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:529) at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:661) at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:425) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:646) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to tmm-e10:22181 with poll msecs = 3000 2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to tmm-e10/2.1.1.130:22181! 2012-03-28 10:27:20,533 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_zkServer/tmm-e10 0 2012-03-28 10:27:20,550 INFO org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceMaster (master thread)... 2012-03-28 10:27:20,562 INFO org.apache.giraph.graph.BspService: BspService: Connecting to ZooKeeper with job job_201203281017_0001, 0 on tmm-e10:22181 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=tmm-e10 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_22 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/local/java/jdk1.6.0_22/jre 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/classes:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../conf:/usr/local/java/jdk1.6.0_22/lib/tools.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/hadoop-core-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/asm-3.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjrt-1.6.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/aspectjtools-1.6.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-1.7.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-beanutils-core-1.8.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-cli-1.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-codec-1.4.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-collections-3.2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-configuration-1.6.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-daemon-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-digester-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-el-1.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-lang-2.4.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-1.1.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-math-2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/commons-net-1.4.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/core-3.1.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-capacity-scheduler-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-fairscheduler-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hadoop-thriftfs-0.20.205.0.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-core-asl-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jackson-mapper-asl-1.0.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jdeb-0.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-core-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-json-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jersey-server-1.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jets3t-0.6.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-6.1.26.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jetty-util-6.1.26.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsch-0.1.42.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/junit-4.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/kfs-0.2.2.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/log4j-1.2.15.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/mockito-all-1.8.5.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/oro-2.0.8.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-api-1.4.3.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/xmlenc-0.52.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/local/test-0302/hadoop-0.20.205.0/libexec/../share/hadoop/lib/jsp-2.1/jsp-api-2.1.jar2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/local/test-0302/hadoop-0.20.205.0/libexec/../lib:/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work/tmp 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA> 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32.12-0.7-default 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=root 2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/root2012-03-28 10:27:20,571 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/attempt_201203281017_0001_m_000000_0/work 2012-03-28 10:27:20,572 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=tmm-e10:22181 sessionTimeout=60000 watcher=org.apache.giraph.graph.BspServiceMaster@68a53de4 2012-03-28 10:27:20,584 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server tmm-e10/2.1.1.130:22181 2012-03-28 10:27:20,586 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to tmm-e10/2.1.1.130:22181, initiating session 2012-03-28 10:27:21,415 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server tmm-e10/2.1.1.130:22181, sessionid = 0x1365720ee4e0000, negotiated timeout = 300000 2012-03-28 10:27:21,417 INFO org.apache.giraph.graph.BspService: process: Asynchronous connection complete. 2012-03-28 10:27:21,420 INFO org.apache.giraph.graph.GraphMapper: map: No need to do anything when not a worker 2012-03-28 10:27:21,420 INFO org.apache.giraph.graph.GraphMapper: cleanup: Starting for MASTER_ZOOKEEPER_ONLY2012-03-28 10:27:21,482 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201203281017_0001/_masterElectionDir/tmm-e10_00000000000' and my bid is '/_hadoopBsp/job_201203281017_0001/_masterElectionDir/tmm-e10_00000000000' 2012-03-28 10:27:21,489 INFO org.apache.giraph.graph.BspServiceMaster: becomeMaster: I am now the master! 2012-03-28 10:27:21,526 INFO org.apache.giraph.graph.BspService: process: applicationAttemptChanged signaled2012-03-28 10:27:21,579 WARN org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected) 2012-03-28 10:27:41,254 INFO org.apache.giraph.graph.BspServiceMaster: generateInputSplits: Got 64 input splits for 64 workers2012-03-28 10:27:42,265 INFO org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners: Creating 4096, default would have been 4096 partitions.2012-03-28 10:27:42,265 WARN org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners: Reducing the partitionCount to 2995 from 40962012-03-28 10:27:42,372 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep -1 on path /_hadoopBsp/job_201203281017_0001/_inputSplitDoneDir2012-03-28 10:27:52,440 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep -1 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/-1/_workerFinishedDir2012-03-28 10:27:52,827 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=0) on superstep = -12012-03-28 10:27:52,859 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep -1 took 10.669 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 02012-03-28 10:27:53,171 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static2012-03-28 10:27:53,178 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:27:53,178 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:27:53,252 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 0 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/0/_workerFinishedDir2012-03-28 10:27:58,718 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 02012-03-28 10:27:59,032 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 0 took 6.173 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 12012-03-28 10:27:59,326 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static2012-03-28 10:27:59,330 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:27:59,330 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:27:59,467 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 1 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/1/_workerFinishedDir2012-03-28 10:28:02,994 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 1 2012-03-28 10:28:03,004 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/02012-03-28 10:28:04,749 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 1 took 5.717 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 2 2012-03-28 10:28:04,787 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static2012-03-28 10:28:04,790 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:28:04,790 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:28:04,863 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 2 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir 2012-03-28 10:28:08,324 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 2 2012-03-28 10:28:08,335 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/12012-03-28 10:28:10,015 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 2 took 5.265 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 3 2012-03-28 10:28:10,049 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static2012-03-28 10:28:10,049 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:28:10,050 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:28:10,129 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 3 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/3/_workerFinishedDir 2012-03-28 10:28:13,320 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 32012-03-28 10:28:13,331 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/2 2012-03-28 10:28:15,083 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 3 took 5.068 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 4 2012-03-28 10:28:15,120 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static 2012-03-28 10:28:15,121 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:28:15,121 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:28:15,183 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 4 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4/_workerFinishedDir 2012-03-28 10:28:18,365 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=0,edges=16000000,msgCount=748805328) on superstep = 42012-03-28 10:28:18,374 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/3 2012-03-28 10:28:20,139 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 4 took 5.056 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 5 2012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static2012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 15625, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 15359, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 156932012-03-28 10:28:20,172 INFO org.apache.giraph.graph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 250000, Min: Worker(hostname=tmm-e4, MRpartition=59, port=30059) - 245744, Max: Worker(hostname=tmm-e4, MRpartition=32, port=30032) - 2510882012-03-28 10:28:20,224 INFO org.apache.giraph.graph.BspServiceMaster: barrierOnWorkerList: 0 out of 64 workers finished on superstep 5 on path /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/5/_workerFinishedDir2012-03-28 10:28:20,639 INFO org.apache.giraph.graph.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=1000000,finVtx=1000000,edges=16000000,msgCount=0) on superstep = 5 2012-03-28 10:28:20,651 INFO org.apache.giraph.graph.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4 2012-03-28 10:28:20,945 WARN org.apache.giraph.graph.BspServiceMaster: coordinateBarrier: Already cleaned up /_hadoopBsp/job_201203281017_0001/_applicationAttemptsDir/0/_superstepDir/4 2012-03-28 10:28:20,945 INFO org.apache.giraph.graph.MasterThread: masterThread: Coordination of superstep 5 took 0.806 seconds ended with state ALL_SUPERSTEPS_DONE and is now on superstep 6 2012-03-28 10:28:20,946 INFO org.apache.giraph.graph.BspServiceMaster: setJobState: {"_stateKey":"FINISHED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep 62012-03-28 10:28:20,987 INFO org.apache.giraph.graph.BspServiceMaster: cleanup: Notifying master its okay to cleanup with /_hadoopBsp/job_201203281017_0001/_cleanedUpDir/0_master 2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanUpZooKeeper: Node /_hadoopBsp/job_201203281017_0001/_cleanedUpDir already exists, no need to create.2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanUpZooKeeper: Got 65 of 65 desired children from /_hadoopBsp/job_201203281017_0001/_cleanedUpDir 2012-03-28 10:28:20,993 INFO org.apache.giraph.graph.BspServiceMaster: cleanupZooKeeper: Removing the following path and all children - /_hadoopBsp/job_201203281017_00012012-03-28 10:28:24,840 INFO org.apache.giraph.graph.BspService: process: masterElectionChildrenChanged signaled 2012-03-28 10:28:25,321 INFO org.apache.giraph.graph.BspService: process: cleanedUpChildrenChanged signaled 2012-03-28 10:28:25,757 INFO org.apache.giraph.graph.BspServiceMaster: cleanup: Removed HDFS checkpoint directory (_bsp/_checkpoints//job_201203281017_0001) with return = true since this job succeeded 2012-03-28 10:28:25,771 INFO org.apache.zookeeper.ZooKeeper: Session: 0x1365720ee4e0000 closed 2012-03-28 10:28:25,771 INFO org.apache.giraph.graph.MasterThread: setup: Took 20.768 seconds. 2012-03-28 10:28:25,771 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down2012-03-28 10:28:25,771 INFO org.apache.giraph.graph.MasterThread: vertex input superstep: Took 10.669 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 0: Took 6.173 seconds.2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 1: Took 5.717 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 2: Took 5.265 seconds.2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 3: Took 5.068 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 4: Took 5.056 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: superstep 5: Took 0.806 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: shutdown: Took 4.827 seconds. 2012-03-28 10:28:25,772 INFO org.apache.giraph.graph.MasterThread: total: Took 1.3329016850040002E9 seconds. 2012-03-28 10:28:25,773 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperClosedStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/0.COMPUTATION_DONE2012-03-28 10:28:25,813 INFO org.apache.giraph.zk.ZooKeeperManager: waitUntilAllTasksDone: Got 65 and 65 desired (polling period is 3000) on attempt 02012-03-28 10:28:26,139 INFO org.apache.giraph.zk.ZooKeeperManager: offlineZooKeeperServers: waitFor returned 143 and deleted directory /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/work/_bspZooKeeper 2012-03-28 10:28:26,140 INFO org.apache.hadoop.mapred.Task: Task:attempt_201203281017_0001_m_000000_0 is done. And is in the process of commiting2012-03-28 10:28:26,455 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201203281017_0001_m_000000_0' done. 2012-03-28 10:28:26,461 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 ------------------------------------------------------------------------------------------------------------------------------ one of the workers 2012-03-28 10:18:00,122 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2012-03-28 10:18:00,387 WARN org.apache.giraph.bsp.BspOutputFormat: getOutputCommitter: Returning ImmutableOutputCommiter (does nothing). 2012-03-28 10:18:00,397 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0 2012-03-28 10:18:00,405 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@18330bf 2012-03-28 10:18:00,489 INFO org.apache.giraph.graph.GraphMapper: Distributed cache is empty. Assuming fatjar. 2012-03-28 10:18:00,489 INFO org.apache.giraph.graph.GraphMapper: setup: classpath @ /usr/local/test-0302/hadoop-data/h-0.20.205/mapred/local/taskTracker/root/jobcache/job_201203281017_0001/jars/job.jar 2012-03-28 10:18:00,498 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201203281017_0001 2012-03-28 10:18:00,500 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201203281017_0001/_task/tmm-e6 1 2012-03-28 10:18:00,521 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_tmm-e10 0 ' (polling period is 3000) 2012-03-28 10:18:00,521 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [tmm-e10, 0] 2 hosts in filename 'zkServerList_tmm-e10 0 ' 2012-03-28 10:18:00,524 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [tmm-e10] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0 2012-03-28 10:18:00,524 INFO org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceWorker... 2012-03-28 10:18:00,534 INFO org.apache.giraph.graph.BspService: BspService: Connecting to ZooKeeper with job job_201203281017_0001, 1 on tmm-e10:22181 2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT 2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=tmm-e6 2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_22 2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc. 2012-03-28 10:18:00,540 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/local/java/jdk1.6.0_22/jre > How to close all child when a job finished? > ------------------------------------------- > > Key: GIRAPH-169 > URL: https://issues.apache.org/jira/browse/GIRAPH-169 > Project: Giraph > Issue Type: Improvement > Components: mapreduce > Affects Versions: 0.2.0 > Environment: sles 11 x64,jdk 1.6,hadoop 0.20.205.0,1 Master and 8 > slaves, > Reporter: Jianfeng Qian > Priority: Minor > > I ran pagerank at hadoop 0.20.205.0. When the job finished,the child in > slaves didn't quit immediately and sometimes they never quit and I have to > kill them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira