Eli Reisman created GIRAPH-250:
----------------------------------

             Summary: Let workers contending for InputSplits during 
INPUT_SUPERSTEP guess better, choose quicker.
                 Key: GIRAPH-250
                 URL: https://issues.apache.org/jira/browse/GIRAPH-250
             Project: Giraph
          Issue Type: Improvement
          Components: bsp, graph, zookeeper
    Affects Versions: 0.2.0
            Reporter: Eli Reisman
            Assignee: Eli Reisman
            Priority: Minor
             Fix For: 0.2.0
         Attachments: GIRAPH-250-1.patch

The job logs make it clear that workers scanning for the master-created 
znodes that indicate an InputSplit is available to claim (and read) all start 
with very similar lists of znode names, and all iterate through the list from 
index 0 at the same time.

What you see in the logs is lots of misses, followed finally by a hit 
somewhere. By still iterating the list, but starting from a different spot for 
each worker, we (mostly) begin at different parts of the input split list each 
worker gets. See the patch: it's a simple change that takes the hash code of 
the worker hostname + index, mod the size of the list of possible splits to 
claim. This lowers contention dramatically and ensures everyone will more 
quickly claim (at least their first) input split. This seems to work very well 
so far.
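
For readers without the patch at hand, here is a minimal sketch of the idea, 
not the patch itself. The class and method names (SplitScanOrder, startIndex, 
scanOrder) are hypothetical, and it assumes "hostname + index" means the 
hostname string concatenated with the worker index; the actual change in 
GIRAPH-250-1.patch may combine them differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not code from the Giraph source tree. */
final class SplitScanOrder {

    /**
     * Picks the list position where a worker begins scanning for an
     * unclaimed InputSplit. Assumption: hostname and worker index are
     * concatenated before hashing.
     */
    static int startIndex(String hostname, int workerIndex, int splitCount) {
        if (splitCount <= 0) {
            throw new IllegalArgumentException("splitCount must be positive");
        }
        int hash = (hostname + workerIndex).hashCode();
        // String.hashCode() can be negative; floorMod keeps the result
        // in the range [0, splitCount).
        return Math.floorMod(hash, splitCount);
    }

    /**
     * Returns every split index exactly once, beginning at the worker's
     * start position and wrapping around the end of the list.
     */
    static List<Integer> scanOrder(String hostname, int workerIndex,
                                   int splitCount) {
        int start = startIndex(hostname, workerIndex, splitCount);
        List<Integer> order = new ArrayList<>(splitCount);
        for (int i = 0; i < splitCount; i++) {
            order.add((start + i) % splitCount);
        }
        return order;
    }
}
```

Each worker still visits the whole list, so no split is skipped; only the 
starting point differs. Two workers can still hash to the same start, which is 
why the improvement is "mostly" rather than always.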

Passes mvn verify, etc.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
