A general way of collecting data from all the vertices is to use an
Aggregator. An aggregator collects values from all the vertices that
decide to write to it, and it can be read by all the vertices. You
could easily implement your statistics from there. Aggregators are
computed both on the workers and on the master, so it could be quite
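
To make the idea concrete, here is a rough sketch in plain Java. It does
not use the Giraph API: HistogramAggregator, selectSenders and the two
phases are hypothetical stand-ins for an aggregator and for two
supersteps, assuming that values aggregated in one superstep can be read
by every vertex in the next one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TopPercentSketch {

    /** Hypothetical stand-in for an aggregator: a histogram of vertex values. */
    static final class HistogramAggregator {
        private final TreeMap<Long, Long> counts = new TreeMap<>();
        private long total = 0;

        /** Called by every vertex that decides to write its value. */
        void aggregate(long value) {
            counts.merge(value, 1L, Long::sum);
            total++;
        }

        /**
         * Returns the smallest value that still belongs to the top
         * `percent` percent. Ties at the threshold are all included,
         * so slightly more than `percent` percent may qualify.
         */
        long thresholdForTopPercent(int percent) {
            if (total == 0) {
                throw new IllegalStateException("nothing was aggregated");
            }
            // Integer ceiling of total * percent / 100, at least 1.
            long target = Math.max(1, (total * percent + 99) / 100);
            long seen = 0;
            // Walk from the largest value downwards until enough are covered.
            for (Map.Entry<Long, Long> e : counts.descendingMap().entrySet()) {
                seen += e.getValue();
                if (seen >= target) {
                    return e.getKey();
                }
            }
            return counts.firstKey(); // not reached, since target <= total
        }
    }

    /** Simulates two supersteps over a list of vertex values. */
    static List<Long> selectSenders(List<Long> vertexValues, int percent) {
        // "Superstep 0": every vertex writes its value to the aggregator.
        HistogramAggregator aggregator = new HistogramAggregator();
        for (long value : vertexValues) {
            aggregator.aggregate(value);
        }
        // "Superstep 1": every vertex reads the aggregate and decides locally.
        long threshold = aggregator.thresholdForTopPercent(percent);
        List<Long> senders = new ArrayList<>();
        for (long value : vertexValues) {
            if (value >= threshold) {
                senders.add(value); // this vertex would send its messages
            }
        }
        return senders;
    }

    public static void main(String[] args) {
        List<Long> values = new ArrayList<>();
        for (long v = 1; v <= 1000; v++) {
            values.add(v);
        }
        System.out.println(selectSenders(values, 1));
    }
}
```

An exact histogram only scales if the number of distinct values is small;
with many distinct values you would probably aggregate into fixed buckets
instead and accept an approximate threshold.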

Hope it helps,

On Tue, Dec 20, 2011 at 4:29 PM,  <listenbru...@gmx.net> wrote:
> Hi all,
> I plan to use Giraph for a use case where nodes send messages depending on
> some global distribution of a value. For instance, nodes have a numeric 
> value. Thus there is a global distribution of that value. Now I want all
> nodes that have a value in, say, the top 1% of all values to take an
> action, i.e., send messages.
> How could I do this?
> Thinking in Hadoop MapReduce I'd use the distributed cache in order to 
> maintain a fingerprint of the global distribution.
> Would this work in Giraph too?
> Thanks and BR!
> christoph

   Claudio Martella
