[
https://issues.apache.org/jira/browse/GIRAPH-250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eli Reisman updated GIRAPH-250:
-------------------------------
Attachment: GIRAPH-250-1.patch
> Let workers contending for InputSplits during INPUT_SUPERSTEP guess better,
> choose quicker.
> -------------------------------------------------------------------------------------------
>
> Key: GIRAPH-250
> URL: https://issues.apache.org/jira/browse/GIRAPH-250
> Project: Giraph
> Issue Type: Improvement
> Components: bsp, graph, zookeeper
> Affects Versions: 0.2.0
> Reporter: Eli Reisman
> Assignee: Eli Reisman
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: GIRAPH-250-1.patch
>
>
> The job logs make it clear that workers scanning for the master-created
> znodes that indicate an InputSplit is available to claim (and read) all
> start with very similar lists of znode names, and all iterate from index 0
> through the list at the same time. What you see in the logs is lots of
> misses, followed finally by a hit somewhere.
>
> By still iterating the list, but starting from a different spot for each
> worker, we (mostly) begin from different parts of the input split list each
> worker gets, thereby lowering contention dramatically and ensuring everyone
> claims (at least their first) input split more quickly. See the patch: it's
> a simple change that takes the hash code of the worker hostname + index and
> mods that by the size of the list of possible splits to claim.
>
> This seems to work very well so far. Passes mvn verify etc.
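
The offset idea described above can be sketched roughly as follows. This is an illustration only, not the code in GIRAPH-250-1.patch: the class and method names (SplitScanOrder, startOffset, scanOrder) are hypothetical, and it assumes "hostname + index" means hashing the concatenated string. Math.floorMod is used because String.hashCode() can be negative.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a per-worker starting offset for scanning split znodes. */
public class SplitScanOrder {

    /**
     * Pick the index at which a worker begins its scan.
     * Assumption: the hash is taken over the hostname concatenated with the worker index.
     */
    public static int startOffset(String hostname, int workerIndex, int splitCount) {
        int hash = (hostname + workerIndex).hashCode();
        // floorMod keeps the result in [0, splitCount) even when the hash is negative.
        return Math.floorMod(hash, splitCount);
    }

    /**
     * Return the same split paths, rotated so the scan begins at this worker's
     * offset and wraps around to cover every entry exactly once.
     */
    public static <T> List<T> scanOrder(List<T> splitPaths, String hostname, int workerIndex) {
        int n = splitPaths.size();
        List<T> ordered = new ArrayList<>(n);
        if (n == 0) {
            return ordered; // nothing to scan, and avoids a modulo by zero
        }
        int start = startOffset(hostname, workerIndex, n);
        for (int i = 0; i < n; i++) {
            ordered.add(splitPaths.get((start + i) % n));
        }
        return ordered;
    }
}
```

Each worker would then try to claim splits in the order returned by scanOrder. Two workers can still hash to the same offset, which is why the description says "(mostly)"; the rotation only spreads the first attempts out, it does not replace the claim check itself.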