-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6676/
-----------------------------------------------------------

Review request for giraph.


Description
-------

We clear the vertex list right after calling sendPartitionRequest(). The 
request may still be using that list when it is cleared, so this is a race.

I noticed this when running with 300 workers: the number of edges was lower 
than I expected, because some requests end up empty.

After digging into the code I found the issue and have fixed it.
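
To illustrate the pattern described above, here is a minimal sketch. It is not the Giraph code or the actual patch. The class PartitionSendSketch, the DeferredSender stand-in, and both helper methods are hypothetical names, and the sender only queues the list and reads it later, which makes the bad outcome deterministic here instead of timing-dependent as on a real cluster.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSendSketch {
    /**
     * Hypothetical stand-in for an asynchronous sender: it only queues the
     * request and reads the vertex list later, in flush().
     */
    static final class DeferredSender {
        private final List<List<String>> pending = new ArrayList<>();
        private int verticesSent = 0;

        void sendPartitionRequest(List<String> vertices) {
            pending.add(vertices); // keeps a reference, does not copy
        }

        void flush() {
            for (List<String> request : pending) {
                verticesSent += request.size(); // list is read only now
            }
            pending.clear();
        }

        int verticesSent() {
            return verticesSent;
        }
    }

    private static List<String> makeVertices(int vertexCount) {
        List<String> vertices = new ArrayList<>();
        for (int i = 0; i < vertexCount; i++) {
            vertices.add("v" + i);
        }
        return vertices;
    }

    /** Problem pattern: clear the shared list right after handing it off. */
    static int sendThenClear(int vertexCount) {
        DeferredSender sender = new DeferredSender();
        List<String> buffer = makeVertices(vertexCount);
        sender.sendPartitionRequest(buffer);
        buffer.clear(); // the queued request is now empty as well
        sender.flush();
        return sender.verticesSent();
    }

    /** One possible remedy (an assumption): give the list away, start a new one. */
    static int sendThenReplace(int vertexCount) {
        DeferredSender sender = new DeferredSender();
        List<String> buffer = makeVertices(vertexCount);
        sender.sendPartitionRequest(buffer);
        buffer = new ArrayList<>(); // the old list now belongs to the request
        sender.flush();
        return sender.verticesSent();
    }

    public static void main(String[] args) {
        System.out.println("send then clear:   " + sendThenClear(3));
        System.out.println("send then replace: " + sendThenReplace(3));
    }
}
```

In this sketch the first variant sends an empty request and the second sends all three vertices. Copying the list before sending would work as well, at the cost of an extra allocation per request.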

Giraph Stats
Counter                          Map         Reduce  Total
Aggregate edges                  99,971,220  0       99,971,220
Superstep                        11          0       11
Current workers                  300         0       300
Last checkpointed superstep      0           0       0
Current master task partition    0           0       0
Sent messages                    0           0       0
Aggregate finished vertices      10,000,000  0       10,000,000
Aggregate vertices               10,000,000  0       10,000,000

This is wrong! The aggregate edge count is 28,780 short of the expected 
100,000,000.

Giraph Stats
Counter                          Map          Reduce  Total
Aggregate edges                  100,000,000  0       100,000,000
Superstep                        11           0       11
Last checkpointed superstep      0            0       0
Current workers                  300          0       300
Current master task partition    0            0       0
Sent messages                    0            0       0
Aggregate finished vertices      10,000,000   0       10,000,000
Aggregate vertices               10,000,000   0       10,000,000

Fixed!

Also added a few messages for better debugging.


This addresses bug GIRAPH-302.
    https://issues.apache.org/jira/browse/GIRAPH-302


Diffs
-----

  http://svn.apache.org/repos/asf/giraph/trunk/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1373682 

Diff: https://reviews.apache.org/r/6676/diff/


Testing
-------

Passed unit tests and verified on a real cluster using 300 machines.


Thanks,

Avery Ching
