Eli Reisman created GIRAPH-250:
----------------------------------
Summary: Let workers contending for InputSplits during
INPUT_SUPERSTEP guess better, choose quicker.
Key: GIRAPH-250
URL: https://issues.apache.org/jira/browse/GIRAPH-250
Project: Giraph
Issue Type: Improvement
Components: bsp, graph, zookeeper
Affects Versions: 0.2.0
Reporter: Eli Reisman
Assignee: Eli Reisman
Priority: Minor
Fix For: 0.2.0
Attachments: GIRAPH-250-1.patch
The job logs make it clear that workers scanning for the master-created
znodes that indicate an InputSplit is available to claim (and read) all start
with very similar lists of znode names, and all iterate from index 0 through
the list at the same time. What you see in the logs is lots of misses,
followed finally by a hit somewhere.

By still iterating over the list, but starting from a different spot for each
worker (see the patch; it's a simple change that takes the hash code of the
worker hostname + index, mod the size of the list of possible splits to
claim), we (mostly) start iterating from different parts of the input split
list each worker gets. This lowers contention dramatically and ensures
everyone will more quickly claim (at least their first) input split. This
seems to work very well so far.

Passes mvn verify etc.
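
For anyone reading without the patch open, the idea can be sketched roughly as below. This is an illustration only, not the code in GIRAPH-250-1.patch: the class name SplitScanOrder, the methods startOffset and scanOrder, and the example hostnames and znode paths are all hypothetical. It also assumes "hostname + index" means the two are concatenated before hashing, and it uses Math.floorMod so that a negative hash code still gives a valid list index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and structure are NOT taken from the patch.
final class SplitScanOrder {

    // Maps a worker to a starting index in [0, splitCount).
    // Assumption: hostname and worker index are concatenated, then hashed.
    static int startOffset(String hostname, int workerIndex, int splitCount) {
        int hash = (hostname + workerIndex).hashCode();
        // floorMod keeps the result non-negative even when hash is negative.
        return Math.floorMod(hash, splitCount);
    }

    // Returns the same split paths, rotated so that scanning begins at the
    // worker's own offset and wraps around to cover every entry once.
    static List<String> scanOrder(List<String> splitPaths,
                                  String hostname, int workerIndex) {
        int n = splitPaths.size();
        List<String> order = new ArrayList<>(n);
        if (n == 0) {
            return order; // nothing to claim
        }
        int start = startOffset(hostname, workerIndex, n);
        for (int i = 0; i < n; i++) {
            order.add(splitPaths.get((start + i) % n));
        }
        return order;
    }

    public static void main(String[] args) {
        // Made-up znode paths, for illustration only.
        List<String> splits = Arrays.asList(
            "/splits/0", "/splits/1", "/splits/2", "/splits/3");
        System.out.println(scanOrder(splits, "worker-a.example.org", 1));
        System.out.println(scanOrder(splits, "worker-b.example.org", 2));
    }
}
```

Two workers can still hash to the same offset, which is why the description says "(mostly)"; the rotation only spreads out the starting points, and each worker still visits every split in the list.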