-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6600/
-----------------------------------------------------------

Review request for giraph.


Description
-------

* Upgrade to the most recent stable version of Netty (3.5.3.Final)
* Retry failed connection attempts, giving up after n failures
* Track requests throughout the system by recording each request id and
matching it to the corresponding response (minor refactoring of
WritableRequest to simplify requests and carry the request id)
* Improve handling of Netty exceptions by dumping the exception stack trace
to help debug failures
* Fix a bug in HashWorkerPartitioner by making partitionList thread-safe (the
bug causes divide-by-zero exceptions in practice)
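
As a rough illustration of the request id tracking described above, here is
a minimal sketch. It is not the code in this patch: the class name
PendingRequests and its methods are hypothetical, and the real RequestInfo
and WritableRequest classes may look quite different. The idea is only that
each outgoing request gets a unique id, and a response carrying that id
clears the matching entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for illustration; not a class from this patch. */
class PendingRequests {
  /** Source of unique request ids (assumed to be per client). */
  private final AtomicLong nextRequestId = new AtomicLong(0);
  /** Outstanding requests, keyed by request id. */
  private final ConcurrentMap<Long, String> pending =
      new ConcurrentHashMap<Long, String>();

  /** Record an outgoing request and return the id to send with it. */
  long register(String description) {
    long requestId = nextRequestId.getAndIncrement();
    pending.put(requestId, description);
    return requestId;
  }

  /**
   * Match a response to its request.
   * Returns false if the id is unknown or was already completed.
   */
  boolean complete(long requestId) {
    return pending.remove(requestId) != null;
  }

  /** Number of requests still waiting for a response. */
  int outstanding() {
    return pending.size();
  }
}
```

With a structure like this, a client can wait until outstanding() drops to
zero, and any ids left in the map identify requests that never received a
response.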


This addresses bug GIRAPH-300.
    https://issues.apache.org/jira/browse/GIRAPH-300


Diffs
-----

  http://svn.apache.org/repos/asf/giraph/trunk/pom.xml 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyClient.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyServer.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/NettyWorkerClient.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/RequestInfo.java PRE-CREATION
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/RequestServerHandler.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/ResponseClientHandler.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendPartitionMessagesRequest.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendPartitionMutationsRequest.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/SendVertexRequest.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/comm/WritableRequest.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceMaster.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/GiraphJob.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/utils/TimedLogger.java 1372575
  http://svn.apache.org/repos/asf/giraph/trunk/src/test/java/org/apache/giraph/comm/ConnectionTest.java 1372575

Diff: https://reviews.apache.org/r/6600/diff/


Testing
-------

Currently, Netty connection failures cause issues with more than 75 workers in
my setup. This change allows us to reach 200+ workers on a reasonably reliable
network that doesn't kill connections.
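
For readers who want a picture of the "retry up to n failures" behavior
that matters here, a minimal sketch follows. It is only an assumption
about the general shape, not the NettyClient code in this patch: the class
ConnectRetry, the method callWithRetries, and the maxFailures parameter are
hypothetical names, and the sketch retries immediately with no backoff.

```java
import java.util.concurrent.Callable;

/** Hypothetical retry helper for illustration; not part of this patch. */
class ConnectRetry {
  /**
   * Run the attempt until it succeeds, giving up after maxFailures
   * failed attempts. The last failure is attached as the cause.
   */
  static <T> T callWithRetries(Callable<T> attempt, int maxFailures)
      throws Exception {
    if (maxFailures < 1) {
      throw new IllegalArgumentException("maxFailures must be at least 1");
    }
    Exception lastFailure = null;
    for (int failures = 0; failures < maxFailures; failures++) {
      try {
        return attempt.call();
      } catch (Exception e) {
        // Remember the failure and try again.
        lastFailure = e;
      }
    }
    throw new IllegalStateException(
        "Giving up after " + maxFailures + " failed attempts", lastFailure);
  }
}
```

A caller would wrap the actual connect step in the Callable, so that a
transient refusal on one worker does not fail the whole job unless it
persists for all n attempts.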

This code passes the local Hadoop regressions and the single-node Hadoop
instance regressions. It also succeeded on large runs (200+ workers) on a real 
Hadoop cluster.


Thanks,

Avery Ching
