[ https://issues.apache.org/jira/browse/GIRAPH-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435477#comment-13435477 ]
Hudson commented on GIRAPH-300:
-------------------------------
Integrated in Giraph-trunk-Commit #173 (See
[https://builds.apache.org/job/Giraph-trunk-Commit/173/])
GIRAPH-300: Improve netty reliability with retrying failed
connections, tracking requests, thread-safe hash partitioning (aching
via apresta). (Revision 1373609)
Result = SUCCESS
aching : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1373609
Files :
* /giraph/trunk/CHANGELOG
* /giraph/trunk/pom.xml
* /giraph/trunk/src/main/java/org/apache/giraph/comm/NettyClient.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/NettyServer.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/RequestInfo.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/RequestServerHandler.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/ResponseClientHandler.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/SendPartitionMessagesRequest.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/SendPartitionMutationsRequest.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java
* /giraph/trunk/src/main/java/org/apache/giraph/comm/WritableRequest.java
* /giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
* /giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java
* /giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
* /giraph/trunk/src/main/java/org/apache/giraph/utils/TimedLogger.java
* /giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java
> Improve netty reliability with retrying failed connections, tracking
> requests, thread-safe hash partitioning
> ------------------------------------------------------------------------------------------------------------
>
> Key: GIRAPH-300
> URL: https://issues.apache.org/jira/browse/GIRAPH-300
> Project: Giraph
> Issue Type: Improvement
> Reporter: Avery Ching
> Assignee: Avery Ching
> Attachments: GIRAPH-300.2.patch, GIRAPH-300.patch
>
>
> * Upgrade to the most recent stable version of Netty (3.5.3.Final)
> * Retry failed connection attempts, giving up after n failures
> * Track requests throughout the system by keeping track of the request id and
> then matching the request id to the response (minor refactoring of
> WritableRequest to make requests simpler and support the request id)
> * Improve handling of netty exceptions by dumping the exception stack trace
> to help debug failures
> * Fix a bug in HashWorkerPartitioner by making partitionList thread-safe
> (the unsynchronized access causes divide-by-zero exceptions in practice)
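
The divide-by-zero mentioned in the last bullet is what you would expect when one thread computes hash % partitionList.size() while another thread is rebuilding the list and it is momentarily empty. The sketch below shows one way to avoid that: publish an immutable snapshot through a volatile field and read the reference once per lookup. The class name SnapshotPartitioner, its methods, and the use of String owners are all hypothetical, and this is not necessarily how the patch itself fixes HashWorkerPartitioner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a hash partitioner; not Giraph's actual class. */
final class SnapshotPartitioner {
    // Readers always see either the old list or the new one, never a
    // half-built list, because the reference is swapped in a single write.
    private volatile List<String> owners = Collections.emptyList();

    /** Replaces the owner list with an immutable copy of newOwners. */
    void updateOwners(List<String> newOwners) {
        owners = Collections.unmodifiableList(new ArrayList<>(newOwners));
    }

    /** Picks an owner by hashing the vertex id against one stable snapshot. */
    String ownerFor(Object vertexId) {
        List<String> snapshot = owners; // read the volatile field exactly once
        if (snapshot.isEmpty()) {
            // Fail with a clear message instead of an ArithmeticException.
            throw new IllegalStateException("no partition owners assigned yet");
        }
        // floorMod keeps the index non-negative for negative hash codes.
        int index = Math.floorMod(vertexId.hashCode(), snapshot.size());
        return snapshot.get(index);
    }
}
```

Because size() and get() are both called on the same local snapshot, a concurrent updateOwners() cannot change the divisor between the two calls.
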
> Currently, netty connection failures cause issues with more than 75 workers
> in my setup. This change allows us to reach 200+ workers on a reasonably
> reliable network that doesn't kill connections.
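
For readers unfamiliar with the "retry up to n failures" idea, here is a minimal sketch of a bounded retry loop. It deliberately avoids the Netty API: the connection attempt is passed in as a plain Callable, and the names ConnectRetry, connectWithRetry, maxFailures, and waitMillis are hypothetical, not taken from NettyClient.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper; illustrates bounded retries, not Giraph's code. */
final class ConnectRetry {
    /**
     * Runs attempt until it succeeds or has failed maxFailures times,
     * sleeping waitMillis between failed attempts.
     */
    static <T> T connectWithRetry(Callable<T> attempt, int maxFailures,
                                  long waitMillis) throws Exception {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be >= 1");
        }
        Exception last = null;
        for (int failures = 0; failures < maxFailures; failures++) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e; // remember the cause for the final error
                if (failures + 1 < maxFailures) {
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw new IOException(
            "giving up after " + maxFailures + " failed attempts", last);
    }
}
```

Keeping the last exception as the cause preserves the stack trace of the final failure, which fits the issue's goal of making connection problems easier to debug.
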
> This code passes the local Hadoop regressions and the single node Hadoop
> instance regressions. It also succeeded on large runs (200+ workers) on a
> real Hadoop cluster.
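
The request tracking described in the issue (assign each request an id, then match the id on the response) can be sketched as a small registry of open requests. This is an illustration under assumptions only: RequestTracker and its methods are hypothetical names, and the String description stands in for whatever per-request state the real RequestInfo class keeps.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry of in-flight requests keyed by request id. */
final class RequestTracker {
    private final AtomicLong nextId = new AtomicLong();
    private final ConcurrentMap<Long, String> open = new ConcurrentHashMap<>();

    /** Records a new outgoing request and returns the id to send with it. */
    long register(String description) {
        long id = nextId.getAndIncrement();
        open.put(id, description);
        return id;
    }

    /**
     * Marks the request as answered. Returns false if the id is unknown,
     * for example a duplicate or unexpected response.
     */
    boolean complete(long requestId) {
        return open.remove(requestId) != null;
    }

    /** Number of requests still waiting for a response. */
    int openCount() {
        return open.size();
    }
}
```

A sender would call register() before writing a request and complete() when the matching response arrives; a nonzero openCount() after a timeout points at requests that were never acknowledged.
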