[jira] [Commented] (GIRAPH-154) Worker ports are not synched properly with its peers

2012-03-17 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232015#comment-13232015
 ] 

Hudson commented on GIRAPH-154:
---

Integrated in Giraph-trunk-Commit #86 (See 
[https://builds.apache.org/job/Giraph-trunk-Commit/86/])
GIRAPH-154: Worker ports are not synched properly with its peers
(Zhiwei Gu via aching). (Revision 1301962)

 Result = SUCCESS
aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301962
Files : 
* /incubator/giraph/trunk/CHANGELOG
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
* /incubator/giraph/trunk/src/test/java/org/apache/giraph/examples/TryMultiRpcBindingPortsTest.java


> Worker ports are not synched properly with its peers
> 
>
> Key: GIRAPH-154
> URL: https://issues.apache.org/jira/browse/GIRAPH-154
> Project: Giraph
> Issue Type: Bug
> Components: bsp
> Affects Versions: 0.2.0
> Reporter: Zhiwei Gu
> Assignee: Zhiwei Gu
> Attachments: GIRAPH-154.patch
>
>
> When a worker tries multiple ports to set up the RPC server, the final port
> is not synched with its peer workers properly. As a result, peer workers
> send messages to the default port.
> Here are some logs:
> 
> Base port: 34900
> 
> 
> log for worker 161:
> 
> IPC Server handler 98 on 36061: starting
> BasicRPCCommunications: Started RPC communication server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:36061 with 100 handlers and 199 flush threads on bind attempt 1
> IPC Server handler 99 on 36061: starting
> setup: Registering health of this worker...
> getJobState: Job state already exists (/_hadoopBsp/job_201203130609_14838/_masterJobState)
> getApplicationAttempt: Node /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
> getApplicationAttempt: Node /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir already exists!
> registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/gsta32085.tan.ygrid.yahoo.com_161 and workerInfo= Worker(hostname=gsta32085.tan.ygrid.yahoo.com, MRpartition=161, port=35061)
> process: partitionAssignmentsReadyChanged (partitions are assigned)
> startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201203130609_14838/_applicationAttemptsDir/0/_superstepDir/-1/_partitionAssignments
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 0 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 1 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 2 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 3 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 4 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 5 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 6 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 7 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 8 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 9 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 10 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 11 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 12 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 13 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 14 time(s).
> Retrying connect to server: gsta32085.tan.ygrid.yahoo.com/10.216.148.47:35061. Already tried 15 time(s).
> Retrying connect to server: gsta32
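
The port arithmetic visible in the logs above can be sketched in Java. This is a minimal illustration and not the GIRAPH-154 patch: the class and method names are hypothetical, and the step of 1000 between bind attempts is an assumption inferred from the logs (base port 34900, worker 161, port 35061 registered, port 36061 bound on bind attempt 1). The point it tries to make is that a worker should advertise the port it actually bound, not the one computed for attempt 0.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical sketch of retrying a bind and advertising the bound port. */
final class BindPortSketch {
    /** Assumed step between bind attempts, inferred from the logs (35061 -> 36061). */
    static final int PORT_STEP = 1000;

    /** Port tried on a given attempt: base port + task id + attempt * step. */
    static int portForAttempt(int basePort, int taskId, int attempt) {
        return basePort + taskId + attempt * PORT_STEP;
    }

    /**
     * Tries successive ports until one binds. The caller should publish
     * the returned socket's local port to its peers.
     */
    static ServerSocket bindWithRetries(int basePort, int taskId, int maxAttempts)
            throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = portForAttempt(basePort, taskId, attempt);
            try {
                return new ServerSocket(port);
            } catch (IOException e) {
                last = e; // port unavailable, move on to the next candidate
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws IOException {
        int basePort = 34900;
        int taskId = 161;
        try (ServerSocket server = bindWithRetries(basePort, taskId, 5)) {
            // Advertise the port that was actually bound, which may differ
            // from the attempt-0 port when an earlier bind failed.
            int advertised = server.getLocalPort();
            System.out.println("default port: " + portForAttempt(basePort, taskId, 0)
                    + ", port to advertise: " + advertised);
        }
    }
}
```

Under these assumptions, attempt 0 for worker 161 gives 35061 and attempt 1 gives 36061, matching the mismatch between the registered port and the port the RPC server reports in the logs.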

[jira] [Commented] (GIRAPH-154) Worker ports are not synched properly with its peers

2012-03-17 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232012#comment-13232012
 ] 

Avery Ching commented on GIRAPH-154:


Nice work Zhiwei (+1), I verified it as well and committed.  Will close once 
Hudson verifies as well.
