Hi everyone,

I have two questions:

*Question 1)* I'm using release 1.1.0 and I'm seeing massive performance
differences in the following scenario, which really confuses me. I need to
send one message from each vertex to a subset of its neighbors (all that
satisfy a certain condition). For that, I see two basic options:

   a) Loop over all edges, calling sendMessage(target, message) whenever
      target satisfies the condition I want, reusing the same IntWritable
      for the target vertex by calling target.set(_)
   b) Loop over all edges, building up an ArrayList (or whatever) of
      targets that satisfy the condition, and calling
      sendMessageToMultipleEdges(targets.iterator(), message) at the end.
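
A minimal sketch of options (a) and (b), only to pin down what is being
compared. Everything here is a hypothetical stand-in: the Sender interface
is not a Giraph class, plain ints replace IntWritable (so the object reuse
in option (a) is not modelled), the method name sendMessageToMultipleEdges
is an assumption about the Giraph call intended above, and satisfies() is a
placeholder condition.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SendOptionsSketch {
    /** Hypothetical stand-in for the two send calls; not Giraph's real types. */
    interface Sender {
        void sendMessage(int targetId, String message);
        void sendMessageToMultipleEdges(Iterator<Integer> targetIds, String message);
    }

    /** Placeholder condition on the target id (assumption for illustration). */
    static boolean satisfies(int targetId) {
        return targetId % 2 == 0;
    }

    /** Option (a): one send call per qualifying neighbor. */
    static void optionA(int[] neighbors, String message, Sender sender) {
        for (int target : neighbors) {
            if (satisfies(target)) {
                sender.sendMessage(target, message);
            }
        }
    }

    /** Option (b): collect qualifying neighbors first, then one batched call. */
    static void optionB(int[] neighbors, String message, Sender sender) {
        List<Integer> targets = new ArrayList<>();
        for (int target : neighbors) {
            if (satisfies(target)) {
                targets.add(target);
            }
        }
        sender.sendMessageToMultipleEdges(targets.iterator(), message);
    }

    /** In-memory recorder, used only to show both options reach the same targets. */
    static class RecordingSender implements Sender {
        final List<Integer> delivered = new ArrayList<>();

        @Override
        public void sendMessage(int targetId, String message) {
            delivered.add(targetId);
        }

        @Override
        public void sendMessageToMultipleEdges(Iterator<Integer> targetIds, String message) {
            while (targetIds.hasNext()) {
                sendMessage(targetIds.next(), message);
            }
        }
    }
}
```

Both methods deliver to the same set of targets; the sketch says nothing
about where the real performance difference comes from.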

Surprisingly, I get much, much worse performance using option (a), which I
would have expected to be the faster one. So I looked in the code and
eventually found my way to SendMessageCache
<https://github.com/apache/giraph/blob/release-1.1/giraph-core/src/main/java/org/apache/giraph/comm/SendMessageCache.java>,
where it turns out that sendMessageToMultipleEdges ->
sendMessageToAllRequest(Iterator, Message) actually just loops over the
iterator, repeatedly calling sendMessageRequest (which is what I thought I
was doing in scenario (a)). I might have traced the code incorrectly, though.
Can anyone tell me what might be going on? I'm really puzzled by this.

*Question 2)* Is there a good way of sending a vertex's adjacency list to
its neighbors without building up my own copy of the adjacency list and
then sending that? I'm going through the Edge iterable and building an
ArrayPrimitiveWritable of ids, but it would be nice if I could somehow
access the underlying data structure behind the iterable, or just wrap the
iterable as a writable.
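
For reference, a rough sketch of the copy step described above. EdgeLike is
a hypothetical stand-in for an edge type (not a Giraph class), and numEdges
is assumed to come from the vertex's edge count. Wrapping the resulting
int[] in an ArrayPrimitiveWritable is left out so the sketch has no Hadoop
dependency.

```java
import java.util.Arrays;

public class AdjacencyCopySketch {
    /** Hypothetical stand-in for an edge; only the target id matters here. */
    interface EdgeLike {
        int targetId();
    }

    /**
     * Copies the target ids out of the edge iterable into a primitive array.
     * Assumes the iterable yields at most numEdges edges; more than that
     * would throw ArrayIndexOutOfBoundsException.
     */
    static int[] neighborIds(Iterable<? extends EdgeLike> edges, int numEdges) {
        int[] ids = new int[numEdges];
        int count = 0;
        for (EdgeLike edge : edges) {
            ids[count++] = edge.targetId();
        }
        // Trim in case fewer edges were seen than expected.
        return Arrays.copyOf(ids, count);
    }
}
```

This is exactly the extra copy the question hopes to avoid; it is shown
only to make the current approach concrete.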

Thanks so much for the help,
Matthew Saltz
