If there is some sort of distributed cache, we can keep the list there.


On 30/9/2013 1:13 PM, Anastasis Andronidis wrote:
I quote from the JIRA issue:

Ilias Kapouranis added a comment
I don't think it would be much of an issue.
        • We have the List where we keep all the aggregators.
        • When executeAggregator(int aggrIndex) is called, we move the
          aggrIndex to a new List (say tempList) which keeps a pair
          (aggrIndex,aggrClass).
        • At the end of the superstep, if tempList is empty then all the
          aggregators will be executed, else only those which are in it.
        • When all aggregators have finished, we move the pairs from
          tempList to the main List and we put the aggregators to their
          previous indexes.
Hope this helps.
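
The steps in the quoted comment could be sketched roughly as follows. This is a single-process illustration only, not Hama code: the class AggregatorRegistry, the method endSuperstep, and the use of plain strings in place of aggregator classes are all assumptions made for the example. Only executeAggregator(int aggrIndex) and tempList come from the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the tempList idea; none of this is Hama API.
final class AggregatorRegistry {
    // Main list: slot i holds aggregator i, or null while it is parked in tempList.
    private final List<String> aggregators;
    // tempList: (aggrIndex, aggrClass) pairs selected during the current superstep.
    private final Map<Integer, String> tempList = new TreeMap<>();

    AggregatorRegistry(List<String> aggregatorClasses) {
        this.aggregators = new ArrayList<>(aggregatorClasses);
    }

    // Move the aggregator at aggrIndex from the main list to tempList.
    void executeAggregator(int aggrIndex) {
        String aggrClass = aggregators.get(aggrIndex);
        if (aggrClass != null) { // ignore repeated calls for the same index
            tempList.put(aggrIndex, aggrClass);
            aggregators.set(aggrIndex, null);
        }
    }

    // End of superstep: decide what runs, then restore the main list.
    List<String> endSuperstep() {
        List<String> toRun = new ArrayList<>();
        if (tempList.isEmpty()) {
            toRun.addAll(aggregators); // nothing selected: run them all
        } else {
            toRun.addAll(tempList.values()); // run only the selected ones
            for (Map.Entry<Integer, String> e : tempList.entrySet()) {
                aggregators.set(e.getKey(), e.getValue()); // back to previous index
            }
            tempList.clear();
        }
        return toRun;
    }
}
```

Keeping the index in each pair is what lets the aggregators return to their previous positions once the superstep is over.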

I totally agree that this is the case at a higher level. The problem is that 
the implementation is not that simple.

Every node (a machine let's say) that is running in the distributed environment 
has a BSP peer that runs as a local instance. In every BSP peer, vertices 
execute their code. This means that when you ask for an aggregator not to run 
in a specific vertex, the call happens on only one node. You need to sync all 
the other nodes so that they do not run the same aggregator, and in the end 
also skip the master aggregator. This is a little bit tricky, because it 
depends heavily on the implementation of the software you use (in this case 
Hama).

Of course, if your code is exactly the same in every vertex, every peer will 
skip its aggregators locally and no sync is needed. But since that is not 
always the case, we need to plan for the first scenario as well.
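
To make the sync problem concrete, here is a minimal sketch that simulates several peers inside one JVM. It assumes a simple policy: a skip requested on any peer applies to every peer and to the master. The class PeerSkipSync and its methods are hypothetical, and the actual exchange between peers (messaging plus a barrier in Hama) is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical single-JVM simulation; real peers would exchange these sets
// over the network before the end of the superstep.
final class PeerSkipSync {
    // Union of the skip requests recorded locally on each peer.
    static Set<Integer> mergeSkips(List<Set<Integer>> localSkipsPerPeer) {
        Set<Integer> globalSkips = new TreeSet<>();
        for (Set<Integer> local : localSkipsPerPeer) {
            globalSkips.addAll(local);
        }
        return globalSkips;
    }

    // Indexes that every peer, and the master, should still run.
    static List<Integer> aggregatorsToRun(int aggregatorCount, Set<Integer> globalSkips) {
        List<Integer> toRun = new ArrayList<>();
        for (int i = 0; i < aggregatorCount; i++) {
            if (!globalSkips.contains(i)) {
                toRun.add(i);
            }
        }
        return toRun;
    }
}
```

If every peer runs the same vertex code, the local sets are already identical and the merge changes nothing, which is the no-sync case above.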

If you have any questions, or something is not clear, please reply.

Cheers,
Anastasis
