Hi Eli,

Yes, I am running this on Hadoop 0.23. I was using Giraph from trunk (last updated 10th October). Is it incompatible with YARN? I ask because I see a Hadoop 0.23 profile, and that's what I built.
Thanks,
Tripti.

From: Eli Reisman <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Saturday, December 1, 2012 1:34 AM
To: "[email protected]" <[email protected]>
Subject: Re: Issue running Giraph on more mappers

You're running on a YARN-based cluster?

On Wed, Nov 28, 2012 at 10:09 PM, Tripti Singh <[email protected]> wrote:

Hi,

I am trying to run this workflow, which uses Giraph. I am able to successfully run the Giraph job when I use fewer mappers and less data, but it fails with more mappers. This is what the logs say for the master and worker nodes:

Master Node:

2012-11-29 00:01:10,235 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681!
2012-11-29 00:01:10,235 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/_zkServer/gsta31113.tan.ygrid.yahoo.com 3
2012-11-29 00:01:10,241 INFO [main] org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceMaster (master thread)...
2012-11-29 00:01:10,257 INFO [main] org.apache.giraph.graph.BspService: BspService: Connecting to ZooKeeper with job job_1353148790244_114419, 3 on gsta31113.tan.ygrid.yahoo.com:24681
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:host.name=gsta31113.tan.ygrid.yahoo.com
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.home=/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.class.path= {really long class path}
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/lib/i386/server:/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/lib/i386:/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/../lib/i386:/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000009:/home/gs/hadoop/current/lib/native/Linux-i386-32:/usr/java/packages/lib/i386:/lib:/usr/lib
2012-11-29 00:01:10,278 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000009/tmp
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.arch=i386
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.18-238.19.1.el5.YAHOO.20111028
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.name=nova_sln
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/nova_sln
2012-11-29 00:01:10,279 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000009
2012-11-29 00:01:10,280 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=gsta31113.tan.ygrid.yahoo.com:24681 sessionTimeout=60000 watcher=org.apache.giraph.graph.BspServiceMaster@16f70a4
2012-11-29 00:01:10,304 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2012-11-29 00:01:10,305 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Socket connection established to gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, initiating session
2012-11-29 00:01:10,331 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, sessionid = 0x13b497783e40000, negotiated timeout = 600000
2012-11-29 00:01:10,333 INFO [main-EventThread] org.apache.giraph.graph.BspService: process: Asynchronous connection complete.
2012-11-29 00:01:10,335 INFO [main] org.apache.giraph.graph.GraphMapper: map: No need to do anything when not a worker
2012-11-29 00:01:10,335 INFO [main] org.apache.giraph.graph.GraphMapper: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2012-11-29 00:01:10,396 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_1353148790244_114419/_masterElectionDir/gsta31113.tan.ygrid.yahoo.com_30000000000' and my bid is '/_hadoopBsp/job_1353148790244_114419/_masterElectionDir/gsta31113.tan.ygrid.yahoo.com_30000000000'
2012-11-29 00:01:10,403 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: becomeMaster: I am now the master!
2012-11-29 00:01:10,423 INFO [main-EventThread] org.apache.giraph.graph.BspService: process: applicationAttemptChanged signaled
2012-11-29 00:01:10,440 WARN [main-EventThread] org.apache.giraph.graph.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_1353148790244_114419/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2012-11-29 00:01:17,475 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 1 responses of 60 needed to start superstep -1. Sleeping for 5000 msecs and used 0 of 60 attempts.
2012-11-29 00:01:28,742 INFO [org.apache.giraph.graph.MasterThread] org.apache.hadoop.mapreduce.lib.input.FileInputFormat: Total input paths to process : 60
2012-11-29 00:01:28,760 WARN [org.apache.giraph.graph.MasterThread] org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library not loaded
2012-11-29 00:01:28,887 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: generateInputSplits: Got 240 input splits for 60 workers
2012-11-29 00:01:28,887 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: createInputSplits: Starting to write input split data to zookeeper with 1 threads
2012-11-29 00:01:29,228 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.BspServiceMaster: createInputSplits: Done writing input split data to zookeeper
2012-11-29 00:01:29,348 INFO [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners: Creating 3600, default would have been 3600 partitions.
2012-11-29 00:01:29,348 WARN [org.apache.giraph.graph.MasterThread] org.apache.giraph.graph.partition.HashMasterPartitioner: createInitialPartitionOwners: Reducing the partitionCount to 2995 from 3600
2012-11-29 00:08:09,352 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 400000ms for sessionid 0x13b497783e40000, closing socket connection and attempting reconnect
2012-11-29 00:08:09,454 WARN [main-EventThread] org.apache.giraph.graph.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
2012-11-29 00:08:10,645 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2012-11-29 00:08:10,645 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Socket connection established to gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, initiating session
2012-11-29 00:08:10,648 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, sessionid = 0x13b497783e40000, negotiated timeout = 600000
2012-11-29 00:08:10,649 INFO [main-EventThread] org.apache.giraph.graph.BspService: process: Asynchronous connection complete.
2012-11-29 00:31:51,715 INFO [Thread-11] org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2012-11-29 00:31:51,715 WARN [Thread-11] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2012-11-29 00:31:52,094 INFO [Thread-11] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
2012-11-29 00:31:52,093 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x13b497783e40000, likely server has closed socket, closing socket connection and attempting reconnect

Failed Worker Node:

2012-11-29 00:01:21,666 INFO [main] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [gsta31113.tan.ygrid.yahoo.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
2012-11-29 00:01:21,666 INFO [main] org.apache.giraph.graph.GraphMapper: setup: Starting up BspServiceWorker...
2012-11-29 00:01:21,679 INFO [main] org.apache.giraph.graph.BspService: BspService: Connecting to ZooKeeper with job job_1353148790244_114419, 4 on gsta31113.tan.ygrid.yahoo.com:24681
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.4-1386507, built on 09/17/2012 08:33 GMT
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:host.name=gsta31090.tan.ygrid.yahoo.com
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.6.0_21
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.home=/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.class.path={really long class path}
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/lib/i386/server:/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/lib/i386:/home/Releases/gridjdk-1.6.0_21.1011192346-20110120-000/share/gridjdk-1.6.0_21/jre/../lib/i386:/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000120:/home/gs/hadoop/current/lib/native/Linux-i386-32:/usr/java/packages/lib/i386:/lib:/usr/lib
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000120/tmp
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.arch=i386
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.18-238.19.1.el5.YAHOO.20111028
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.name=nova_sln
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.home=/homes/nova_sln
2012-11-29 00:01:21,694 INFO [main] org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/grid/2/tmp/yarn-local/usercache/nova_sln/appcache/application_1353148790244_114419/container_1353148790244_114419_01_000120
2012-11-29 00:01:21,695 INFO [main] org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=gsta31113.tan.ygrid.yahoo.com:24681 sessionTimeout=60000 watcher=org.apache.giraph.graph.BspServiceWorker@1c8fb4b
2012-11-29 00:01:21,737 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2012-11-29 00:01:21,737 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Socket connection established to gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, initiating session
2012-11-29 00:01:21,744 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Session establishment complete on server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681, sessionid = 0x13b497783e40017, negotiated timeout = 600000
2012-11-29 00:01:21,747 INFO [main-EventThread] org.apache.giraph.graph.BspService: process: Asynchronous connection complete.
2012-11-29 00:01:21,754 WARN [main] org.apache.hadoop.conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2012-11-29 00:01:22,027 INFO [main] org.apache.giraph.comm.SecureRPCCommunications: getRPCServer: Added jobToken Ident: 18 6a 6f 62 5f 31 33 35 33 31 34 38 37 39 30 32 34 34 5f 31 31 34 34 31 39, Kind: mapreduce.job, Service: job_1353148790244_114419
2012-11-29 00:01:22,608 INFO [Socket Reader #1 for port 32504] org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 32504
2012-11-29 00:01:22,609 INFO [Socket Reader #2 for port 32504] org.apache.hadoop.ipc.Server: Starting Socket Reader #2 for port 32504
2012-11-29 00:01:22,609 INFO [Socket Reader #3 for port 32504] org.apache.hadoop.ipc.Server: Starting Socket Reader #3 for port 32504
2012-11-29 00:01:22,609 INFO [Socket Reader #4 for port 32504] org.apache.hadoop.ipc.Server: Starting Socket Reader #4 for port 32504
2012-11-29 00:01:22,610 INFO [Socket Reader #5 for port 32504] org.apache.hadoop.ipc.Server: Starting Socket Reader #5 for port 32504
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: dfs.namenode.name.dir; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.security.token.service.use_ip; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.admin.map.child.java.opts; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: yarn.app.mapreduce.am.job.client.port-range; Ignoring.
2012-11-29 00:01:22,662 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: mapreduce.admin.reduce.child.java.opts; Ignoring.
2012-11-29 00:01:22,663 WARN [main] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.tmp.dir; Ignoring.
2012-11-29 00:01:22,691 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2012-11-29 00:01:22,691 INFO [IPC Server listener on 32504] org.apache.hadoop.ipc.Server: IPC Server listener on 32504: starting
2012-11-29 00:01:22,707 INFO [main] org.apache.giraph.comm.BasicRPCCommunications: BasicRPCCommunications: Started RPC communication server: gsta31090.tan.ygrid.yahoo.com/10.216.123.42:32504 with 61 handlers and 59 flush threads on bind attempt 0
2012-11-29 00:01:22,707 INFO [main] org.apache.giraph.graph.BspServiceWorker: BspServiceWorker: maxVerticesPerTransfer = 10000
2012-11-29 00:01:22,707 INFO [main] org.apache.giraph.graph.BspServiceWorker: BspServiceWorker: maxEdgesPerTransfer = 80000 useNetty = false
2012-11-29 00:01:22,716 INFO [main] org.apache.giraph.graph.GraphMapper: setup: Registering health of this worker...
2012-11-29 00:01:22,733 INFO [main] org.apache.giraph.graph.BspService: getJobState: Job state already exists (/_hadoopBsp/job_1353148790244_114419/_masterJobState)
2012-11-29 00:01:22,738 INFO [main] org.apache.giraph.graph.BspService: getApplicationAttempt: Node /_hadoopBsp/job_1353148790244_114419/_applicationAttemptsDir already exists!
2012-11-29 00:01:22,741 INFO [main] org.apache.giraph.graph.BspService: getApplicationAttempt: Node /_hadoopBsp/job_1353148790244_114419/_applicationAttemptsDir already exists!
2012-11-29 00:01:22,747 INFO [main] org.apache.giraph.graph.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_1353148790244_114419/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/gsta31090.tan.ygrid.yahoo.com_4 and workerInfo= Worker(hostname=gsta31090.tan.ygrid.yahoo.com, MRtaskID=4, port=32504)
2012-11-29 00:19:17,005 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 24917 may have finished in the interim.
2012-11-29 00:19:17,005 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 24921 may have finished in the interim.
2012-11-29 00:19:17,006 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 24922 may have finished in the interim.
2012-11-29 00:27:37,081 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 25739 may have finished in the interim.
2012-11-29 00:27:37,081 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 25743 may have finished in the interim.
2012-11-29 00:27:37,081 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 25744 may have finished in the interim.
2012-11-29 00:28:07,200 INFO [communication thread] org.apache.hadoop.yarn.util.ProcfsBasedProcessTree: The process 25752 may have finished in the interim.
2012-11-29 00:31:52,091 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x13b497783e40017, likely server has closed socket, closing socket connection and attempting reconnect
2012-11-29 00:31:52,193 WARN [main-EventThread] org.apache.giraph.graph.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
2012-11-29 00:31:53,478 INFO [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Opening socket connection to server gsta31113.tan.ygrid.yahoo.com/10.216.124.59:24681. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2012-11-29 00:31:53,480 WARN [main-SendThread(gsta31113.tan.ygrid.yahoo.com:24681)] org.apache.zookeeper.ClientCnxn: Session 0x13b497783e40017 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:348)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2012-11-29 00:31:53,584 ERROR [main] org.apache.giraph.graph.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_1353148790244_114419/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/gsta31090.tan.ygrid.yahoo.com_4 on superstep -1

Please let me know if I am missing some configuration.

Thanks,
Tripti.
