[ https://issues.apache.org/jira/browse/GIRAPH-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan resolved GIRAPH-72.
-------------------------------
Resolution: Duplicate
This has been (hopefully) fixed by GIRAPH-128. Closing as duplicate.
> Running multiple Giraph jobs on the same cluster can lead to port collisions
> ----------------------------------------------------------------------------
>
> Key: GIRAPH-72
> URL: https://issues.apache.org/jira/browse/GIRAPH-72
> Project: Giraph
> Issue Type: Bug
> Components: lib, zookeeper
> Affects Versions: 0.1.0
> Environment: production hadoop cluster, in-process ZK.
> Reporter: Jake Mannix
>
> Had a Giraph mini-hackathon at work today, and lots of us launched test jobs
> at the same time. We often ran into the following collision:
> ------
> startSuperstep: WORKER_ONLY - Attempt=0, Superstep=-1
> 2-Nov-2011 23:40:08
> java.net.BindException: Problem binding to <hostname>/<hostIP>:30000 : Address already in use
> at org.apache.hadoop.ipc.Server.bind(Server.java:196)
> at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:259)
> at org.apache.hadoop.ipc.Server.<init>(Server.java:1039)
> at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:492)
> at org.apache.hadoop.ipc.RPC.getServer(RPC.java:454)
> at org.apache.giraph.comm.RPCCommunications.getRPCServer(RPCCommunications.java:99)
> at org.apache.giraph.comm.BasicRPCCommunications.<init>(BasicRPCCommunications.java:362)
> at org.apache.giraph.comm.RPCCommunications.<init>(RPCCommunications.java:71)
> at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:570)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind(Native Method)
> at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
> at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
> at org.apache.hadoop.ipc.Server.bind(Server.java:194)
> ... 12 more
> ----
> The job then simply hung. At a bare minimum, I'd imagine it should catch this
> exception and let the task die quickly so it can get retried on another
> machine. Better yet, a command-line arg at startup (passed into the
> Configuration) could decide which ports to use. Best yet would be something
> automagic that allows multiple GraphMappers on the same machine without
> manually picking ports (pick one at random and store it in ZooKeeper? But
> then what about the in-process ZooKeeper...)
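For illustration only, here is a rough sketch of the "automagic" option from the report: start at a base port and walk upward until a bind succeeds, failing fast if the whole range is taken. It uses plain java.net.ServerSocket rather than the Hadoop RPC server, and the class name PortPicker, the method bindFirstFree, and the attempt limit are all hypothetical. This is not necessarily how GIRAPH-128 addressed the problem.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of Giraph or Hadoop.
public class PortPicker {

    /**
     * Tries basePort, basePort + 1, ... and returns a socket bound to the
     * first port that is free. Throws BindException if none of the
     * maxAttempts ports can be bound, so the caller can fail fast instead
     * of hanging.
     */
    public static ServerSocket bindFirstFree(int basePort, int maxAttempts)
            throws IOException {
        BindException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = basePort + i;
            ServerSocket socket = new ServerSocket(); // created unbound
            try {
                socket.bind(new InetSocketAddress(port));
                return socket;
            } catch (BindException e) {
                // Port is taken (for example by another job on this host).
                socket.close();
                last = e;
            }
        }
        BindException failure = new BindException(
            "No free port in range " + basePort + "-"
                + (basePort + maxAttempts - 1));
        if (last != null) {
            failure.initCause(last);
        }
        throw failure;
    }

    public static void main(String[] args) throws IOException {
        // 30000 is the port from the stack trace above; 50 is an arbitrary limit.
        try (ServerSocket first = bindFirstFree(30000, 50);
             ServerSocket second = bindFirstFree(30000, 50)) {
            System.out.println("first bound to " + first.getLocalPort());
            System.out.println("second bound to " + second.getLocalPort());
        }
    }
}
```

The port that was actually chosen would still have to be published somewhere the other workers can find it (the report suggests ZooKeeper), which this sketch does not cover.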