It appears that you have a problem with the serialization/deserialization of your vertex and/or its types (I, V, E, M). You might want to test that separately.
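One way to test that separately is a plain round-trip check that needs no cluster. The following is only a sketch under assumptions: `Writable` here is a local stand-in that mirrors the two methods of Hadoop's `Writable` interface so the example compiles without the Hadoop jar, and `WeightedEdge` is a hypothetical type, not a class from the job in question. It writes a value to a byte array, reads it back, and reports whether any bytes were left unread. A `readFields` that reads more or less than `write` produced is one possible source of stream errors, though nothing in the trace confirms that this is the cause here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

class RoundTripCheck {
    /** Stand-in (assumption) with the same two methods as Hadoop's Writable. */
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical edge value: a target vertex id plus a weight. */
    static class WeightedEdge implements Writable {
        long targetId;
        float weight;

        WeightedEdge() { }

        WeightedEdge(long targetId, float weight) {
            this.targetId = targetId;
            this.weight = weight;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(targetId);
            out.writeFloat(weight);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Must read the same fields, in the same order and width, as write().
            targetId = in.readLong();
            weight = in.readFloat();
        }
    }

    /** Serializes a value into a fresh byte array. */
    static byte[] serialize(Writable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            value.write(out);
        }
        return buffer.toByteArray();
    }

    /** Fills target from bytes and returns how many bytes were left unread. */
    static int deserialize(Writable target, byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        target.readFields(in);
        return in.available();
    }

    public static void main(String[] args) throws IOException {
        WeightedEdge original = new WeightedEdge(1276522248L, 0.9829082f);
        WeightedEdge copy = new WeightedEdge();
        int leftover = deserialize(copy, serialize(original));
        System.out.println("leftover bytes: " + leftover);
        System.out.println("fields match: "
            + (copy.targetId == original.targetId && copy.weight == original.weight));
    }
}
```

With the real classes, the same check would be run once per type (I, V, E, M) and once for the vertex itself; an `EOFException` thrown by `readFields`, or a nonzero leftover count, would point at a mismatch between the two methods.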

Avery

On 1/3/12 3:54 AM, "Christoph Böhm" wrote:
Thanks!
The next exception, which I cannot explain myself, is the following.
I have one input file of the form:
[2095029,[[1100046950,-1],[952771928,-1]],[[1276522248,0.9829082],[322609086,0.013525307]]]
[5146036,[[947366954,-1],[34019593,-1]],[[1199061143,0.573876],[1024309140,0.98412496]]]
[5270429,[[800028028,-1],[1362541830,-1]],[[164325925,0.92203426],[148512084,0.65505975]]]
... and I want to use, say, 5 workers.
Then worker tenem05 reports what is below.

Cheers.
Christoph

--------------
java.lang.RuntimeException: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
        at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
        at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
        at org.apache.hadoop.ipc.Client.call(Client.java:1033)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
        at $Proxy3.putVertexList(Unknown Source)
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
        ... 11 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,259 ERROR org.apache.giraph.graph.GraphMapper: setup: Caught exception just before end of setup
java.lang.IllegalStateException: setup: loadVertices failed
        at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:576)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.RuntimeException: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
        at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
        at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
        ... 9 more
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
        at org.apache.hadoop.ipc.Client.call(Client.java:1033)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
        at $Proxy3.putVertexList(Unknown Source)
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
        ... 11 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,260 ERROR org.apache.giraph.graph.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201112231316_4347/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/tenem05_1 on superstep -1
2012-01-03 12:35:46,270 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-01-03 12:35:46,320 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2012-01-03 12:35:46,320 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hadoop00 for UID 503 from the native implementation
2012-01-03 12:35:46,322 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception setup: Offlining servers due to exception...
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.RuntimeException: setup: Offlining servers due to exception...
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:466)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
        ... 7 more
Caused by: java.lang.IllegalStateException: setup: loadVertices failed
        at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:576)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
        ... 8 more
Caused by: java.lang.RuntimeException: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
        at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
        at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
        ... 9 more
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
        at org.apache.hadoop.ipc.Client.call(Client.java:1033)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
        at $Proxy3.putVertexList(Unknown Source)
        at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
        ... 11 more
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,337 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task





-------- Original Message --------
Date: Fri, 23 Dec 2011 09:25:24 -0800
From: Avery Ching<ach...@apache.org>
To: giraph-user@incubator.apache.org
Subject: Re: zookeeper connection issue
Yeah, those errors can seem a little scary.  But I think they are
mostly harmless.  Let's go over each one inline.

On 12/23/11 7:10 AM, "Christoph Böhm" wrote:
Hi List,

I'm about to get started with Giraph and have a few questions.
When I run the PageRank example with
     hadoop jar giraph-0.70-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 500000 -w 10
the job finishes, but I find the following in one worker's logs:

*** Worker:
2011-12-23 15:36:09,468 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
java.lang.RuntimeException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201112231316_0010/_masterJobState
        at org.apache.giraph.graph.BspService.getJobState(BspService.java:564)
        at org.apache.giraph.graph.BspServiceWorker.processEvent(BspServiceWorker.java:1414)
        at org.apache.giraph.graph.BspService.process(BspService.java:1017)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201112231316_0010/_masterJobState
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
        at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:99)
        at org.apache.giraph.graph.BspService.getJobState(BspService.java:555)
        ... 4 more
It depends on when this happens.  If it's after the worker has let the master
know that it has finished everything, this is fine.

*** The Master says:
2011-12-23 15:45:40,564 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:525)
        at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:624)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:408)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)



Also, when I try to run my own job, I see the following. All
firewalls etc. should be shut down.
*** Master (node09.de):
2011-12-23 15:57:47,140 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to node09.de:22181 with poll msecs = 3000
2011-12-23 15:57:47,143 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:525)
        at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:624)
        at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:409)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)



Thanks again.
Christoph
These two exceptions on the master are also fine.  It takes some time
for the master to start the ZooKeeper service (hence the multiple
connection attempts).
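To illustrate why a refused connection during startup is expected, here is a rough sketch of a poll-and-retry connect loop in the spirit of the "Connect attempt 0 of 10 max ... with poll msecs = 3000" log line. It is not Giraph's actual code: the `PortPoller` class and `waitForPort` method are hypothetical names, and only standard `java.net` calls are used. Each refused attempt is logged and followed by a sleep, so a few `ConnectException` warnings before the server is up are normal.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortPoller {
    /**
     * Tries to open a TCP connection up to maxAttempts times, sleeping
     * pollMsecs between failures. Returns the 0-based attempt number that
     * succeeded, or -1 if every attempt failed.
     */
    static int waitForPort(String host, int port, int maxAttempts, long pollMsecs)
            throws InterruptedException {
        // A connect timeout of 0 would mean "wait forever", so keep it positive.
        int connectTimeout = (int) Math.max(1L, pollMsecs);
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), connectTimeout);
                return attempt;
            } catch (IOException e) {
                // Covers ConnectException ("Connection refused") while the
                // server is still starting; log it and try again after a pause.
                System.err.println("attempt " + attempt + " of " + maxAttempts
                    + " failed: " + e);
                Thread.sleep(pollMsecs);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 22181;
        int attempt = waitForPort(host, port, 10, 3000L);
        System.out.println(attempt >= 0 ? "connected on attempt " + attempt : "gave up");
    }
}
```

Only a return value of -1, meaning all attempts were exhausted, would indicate a real problem in this sketch; warnings on the early attempts alone would not.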
