[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-27 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195286#comment-13195286
 ] 

Avery Ching commented on GIRAPH-128:


Thanks for taking a look.  I forgot to upload the original (rb only for that 
one), hence part 2. 

The main motivation for the obscure case is that it would make debugging 
simpler.  We often see errors like serverX:portY, and can use portY to figure 
out which mapper to look at.  For example, currently the default starts at 
30000.  If I see an error from 30001, then I know to go to mapper 1 to see its 
problem.  And so on and so forth.  If I am running a 900 mapper job and the 
port is 31001 or 32001, then I still know to look at mapper partition 1.  If 
instead I had 100 as the constant, then for 30101 I would have to check both 
mapper 1 and mapper 101.  With up to 20 retries per port, we can handle at 
least 20 simultaneous jobs running on a single machine that have the same 
mapper partition id.  First off, that is probably unlikely.  But even if it does 
happen, 20 is probably more than one machine would handle.  By the way, port 
retries are very fast (so I wouldn't worry too much about collisions).
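
The port arithmetic in this argument can be sketched as follows.  This is only 
an illustration, not code from the patch: the class and method names are 
hypothetical, and it assumes ports are assigned as base port + partition + 
attempt * increment, with a base port of 30000.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the names are illustrative, not from the Giraph patch. */
final class PortDecoder {
    /** Assumed base port, matching the 30001 example in the comment. */
    static final int BASE_PORT = 30000;

    /**
     * Lists every mapper partition that could own the given port, assuming
     * ports are assigned as BASE_PORT + partition + attempt * increment.
     */
    static List<Integer> candidatePartitions(int port, int increment, int numMappers) {
        List<Integer> candidates = new ArrayList<>();
        int offset = port - BASE_PORT;
        if (offset < 0) {
            return candidates; // below the base port, so not one of ours
        }
        // Each step of 'increment' corresponds to one fewer retry attempt.
        for (int partition = offset % increment;
             partition <= offset && partition < numMappers;
             partition += increment) {
            candidates.add(partition);
        }
        return candidates;
    }
}
```

For a 900 mapper job, candidatePartitions(31001, 1000, 900) yields only 
partition 1, while candidatePartitions(30101, 100, 900) yields both 1 and 101, 
which is the ambiguity described above.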

Let me resubmit without the whitespace changes and with MAX_BIND_ATTEMPTS 
made configurable.

 RPC port from BasicRPCCommunications should be only a starting port, and 
 retried
 

 Key: GIRAPH-128
 URL: https://issues.apache.org/jira/browse/GIRAPH-128
 Project: Giraph
  Issue Type: Improvement
Affects Versions: 0.1.0
Reporter: Avery Ching
Assignee: Avery Ching
 Attachments: GIRAPH-128.2.patch


 Currently Giraph uses a basic port + the task partition to get the RPC port.  
 This doesn't work well when there are multiple Giraph jobs running 
 simultaneously in the same Hadoop cluster (port conflict).  At the same time, 
 it is nice to use this simple algorithm because it makes it very easy to 
 debug problems (you can find the troublesome mapper from the RPC port number).  
 I will be proposing a simple scheme to retry with another port.  I will round 
 the total number of mappers up to the nearest power of 10 (let's call that 
 number Z).  Then I will increment the port number by Z, retrying up to 20 
 times.  If you have enough ports, this scheme would guarantee that up to 20 
 mappers / node would be supported.  It should be sufficient for most 
 clusters.  At the same time, we still maintain the easy debugging method 
 since it's still easy to figure out the mapper partition from the port 
 (port % Z = map partition). 
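
 A minimal sketch of the scheme described above, under stated assumptions: the 
 class and method names are hypothetical (not taken from the patch), and a 
 mapper count that is already a power of 10 is assumed to round to itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed scheme; not the code in the patch. */
final class RpcPortScheme {
    /** Smallest power of 10 that is >= numMappers (the Z of the description). */
    static int portIncrement(int numMappers) {
        int z = 1;
        while (z < numMappers) {
            z *= 10;
        }
        return z;
    }

    /** Ports to try, in order: basePort + partition first, then stepping by Z. */
    static List<Integer> candidatePorts(int basePort, int partition,
                                        int numMappers, int maxAttempts) {
        int z = portIncrement(numMappers);
        List<Integer> ports = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            ports.add(basePort + partition + attempt * z);
        }
        return ports;
    }
}
```

 With a base port of 30000, 900 mappers and partition 1, the first three 
 candidates are 30001, 31001 and 32001.  The relation port % Z = map partition 
 holds here because the assumed base port is a multiple of Z.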

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-27 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195330#comment-13195330
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-128:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3596/
---

(Updated 2012-01-28 01:15:26.114994)


Review request for giraph.


Changes
---

Removed whitespace changes for MinimumIntCombiner.java and 
SimpleSumCombiner.java and made GiraphJob.MAX_RPC_PORT_BIND_ATTEMPTS 
configurable, with a default of 20.


Summary
---

Simple handling of port collisions on the same machine while preserving 
debuggability from the port number alone.  Round up the max number of workers to 
the next power of 10 and use it as the constant by which the port number is 
increased on each retry.

Added a unit test to ensure it is working correctly.
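
A test of this kind might mark the first candidate ports as taken and check 
that the bind moves on by the expected increment.  The sketch below is 
hypothetical and is not the test in the patch: RetryingBinder, bind and isFree 
are illustrative names, and the bind step is passed in as a predicate so that 
it can be faked without opening real sockets.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.function.IntPredicate;

/** Hypothetical sketch; names and structure are not taken from the patch. */
final class RetryingBinder {
    /**
     * Tries basePort + partition, then steps by increment, for at most
     * maxAttempts tries. Returns the first port accepted by tryBind,
     * or -1 if every attempt failed.
     */
    static int bind(int basePort, int partition, int increment,
                    int maxAttempts, IntPredicate tryBind) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int port = basePort + partition + attempt * increment;
            if (port > 65535) {
                break; // ran out of valid TCP port numbers
            }
            if (tryBind.test(port)) {
                return port;
            }
        }
        return -1;
    }

    /**
     * Real probe: true if the port could be bound at this moment. The socket
     * is closed again immediately; a real server would keep it open.
     */
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

With ports 30001 and 31001 marked as taken, bind(30000, 1, 1000, 20, ...) 
settles on 32001; lowering maxAttempts to 2 makes the same call return -1.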

Fixed 2 minor warnings in
src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java
src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java

by removing 'import java.util.List'.


This addresses bug GIRAPH-128.
https://issues.apache.org/jira/browse/GIRAPH-128


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1236935 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java 1236935 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/test/java/org/apache/giraph/comm/RPCCommunicationsTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3596/diff


Testing
---

Passed local and MR unittests.


Thanks,

Avery







[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-27 Thread Jakob Homan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195347#comment-13195347
 ] 

Jakob Homan commented on GIRAPH-128:


Any reason the question about mocks/extending the class wasn't addressed?

 RPC port from BasicRPCCommunications should be only a starting port, and 
 retried
 

 Key: GIRAPH-128
 URL: https://issues.apache.org/jira/browse/GIRAPH-128
 Project: Giraph
  Issue Type: Improvement
Affects Versions: 0.1.0
Reporter: Avery Ching
Assignee: Avery Ching
 Attachments: GIRAPH-128.2.patch, GIRAPH-128.3.patch






[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-27 Thread Jakob Homan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195382#comment-13195382
 ] 

Jakob Homan commented on GIRAPH-128:


Great, thanks.  +1.

 RPC port from BasicRPCCommunications should be only a starting port, and 
 retried
 

 Key: GIRAPH-128
 URL: https://issues.apache.org/jira/browse/GIRAPH-128
 Project: Giraph
  Issue Type: Improvement
Affects Versions: 0.1.0
Reporter: Avery Ching
Assignee: Avery Ching
 Attachments: GIRAPH-128.2.patch, GIRAPH-128.3.patch, 
 GIRAPH-128.4.patch






[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-25 Thread Avery Ching (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193579#comment-13193579
 ] 

Avery Ching commented on GIRAPH-128:


Anyone want to review?  I think this will be very useful to get in before the 
release since it makes it a lot easier for users to run multiple Giraph jobs 
on the same cluster simultaneously...





[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-24 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192587#comment-13192587
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-128:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3596/
---

(Updated 2012-01-24 21:53:06.906563)


Review request for giraph.


Changes
---

Updated after GIRAPH-124 was committed.


Summary
---

Simple handling of port collisions on the same machine while preserving 
debuggability from the port number alone.  Round up the max number of workers to 
the next power of 10 and use it as the constant by which the port number is 
increased on each retry.

Added a unit test to ensure it is working correctly.

Fixed 2 minor warnings in
src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java
src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java

by removing 'import java.util.List'.


This addresses bug GIRAPH-128.
https://issues.apache.org/jira/browse/GIRAPH-128


Diffs (updated)
-

  
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1235026 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java 1235026 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java 1235026 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/test/java/org/apache/giraph/comm/RPCCommunicationsTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3596/diff


Testing
---

Passed local and MR unittests.


Thanks,

Avery







[jira] [Commented] (GIRAPH-128) RPC port from BasicRPCCommunications should be only a starting port, and retried

2012-01-23 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/GIRAPH-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191425#comment-13191425
 ] 

jirapos...@reviews.apache.org commented on GIRAPH-128:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3596/
---

Review request for giraph.


Summary
---

Simple handling of port collisions on the same machine while preserving 
debuggability from the port number alone.  Round up the max number of workers to 
the next power of 10 and use it as the constant by which the port number is 
increased on each retry.

Added a unit test to ensure it is working correctly.

Fixed 2 minor warnings in
src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java
src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java

by removing 'import java.util.List'.


This addresses bug GIRAPH-128.
https://issues.apache.org/jira/browse/GIRAPH-128


Diffs
-

  
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/comm/BasicRPCCommunications.java 1234970 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java 1234970 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java 1234970 
  http://svn.apache.org/repos/asf/incubator/giraph/trunk/src/test/java/org/apache/giraph/comm/RPCCommunicationsTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3596/diff


Testing
---

Passed local and MR unittests.


Thanks,

Avery


