Re: on the semantics of the combiner

2012-01-10 Thread Claudio Martella
It doesn't have to expand: k, the number of elements returned by the combiner, can still be smaller than n, the size of the messages parameter. As a first example, you can imagine your vertex receiving semantically different classes/types of messages, and you can imagine wanting to be summarizin

Re: on the semantics of the combiner

2012-01-10 Thread Jakob Homan
> it doesn't have to be expand, k, the number of elements returned by > the combiner, can still be smaller than n, Right. Grouping would be the most common case. It would be possible for k to be greater than n, as well. For instance, consider two messages, both generated on the same worker (W) by two tw

Re: on the semantics of the combiner

2012-01-10 Thread Claudio Martella
I'm not sure I understand what you'd save here. If the two messages were going to be expanded to k messages on the destination worker D, but you expand them on W, you end up sending k messages instead of 2, right? On Tue, Jan 10, 2012 at 6:26 PM, Jakob Homan wrote: >> it doesn't have to be expand

Re: on the semantics of the combiner

2012-01-10 Thread Jakob Homan
Those two messages would have gone to D, been expanded to, say, 4, which would then have been sent to, say, M. This would save the sending of the two to D and send the 4 directly to M. I'm not saying it's a great example, but it is legal. This is of course assuming that combiners can genera

Re: on the semantics of the combiner

2012-01-10 Thread Claudio Martella
Ok, now I see where you're going. I guess that the thing here is that the combiner would "act" like D (on its behalf), and to do so concretely it would probably need some local data related to D (edge values? vertex value?). I also think that k > n is possible in principle and we could let the

some problems with testing

2012-01-10 Thread Claudio Martella
Hello, I'm having some issues debugging GIRAPH-45. The code passes local tests but currently fails testBspCheckpoint(org.apache.giraph.TestManualCheckpoint) and testPartitioners(org.apache.giraph.TestGraphPartitioner). The first one is particularly tricky as the autocheckpointing is passed an

Re: on the semantics of the combiner

2012-01-10 Thread Avery Ching
The general idea of combiners is to reduce the number of messages sent. Combiners are purely an optimization and the application should work correctly without them (since they're never guaranteed to actually be called). Combiners can only modify the messages sent to a single vertex, so they can't
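To illustrate the "purely an optimization" point, here is a minimal Java sketch. It does not use Giraph's actual API: the `MessageCombiner` interface, `SumCombiner`, and `vertexSum` are hypothetical names made up for this example. It only shows that a vertex summing its messages gets the same result whether or not the combiner ran.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical interface for illustration only; not Giraph's actual combiner API.
interface MessageCombiner<M> {
    // Combines the messages addressed to one vertex into (ideally) fewer messages.
    List<M> combine(List<M> messages);
}

// Sums all integer messages for a vertex into a single message.
class SumCombiner implements MessageCombiner<Integer> {
    @Override
    public List<Integer> combine(List<Integer> messages) {
        if (messages.isEmpty()) {
            return Collections.emptyList();
        }
        int sum = 0;
        for (int m : messages) {
            sum += m;
        }
        return Collections.singletonList(sum);
    }
}

class CombinerOptionalSketch {
    // Stand-in for a vertex compute step that sums whatever it receives.
    static int vertexSum(List<Integer> received) {
        int total = 0;
        for (int m : received) {
            total += m;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> sent = Arrays.asList(3, 5, 7);
        int withoutCombiner = vertexSum(sent);
        int withCombiner = vertexSum(new SumCombiner().combine(sent));
        // Both values are 15: the combiner changes message count, not the result.
        System.out.println(withoutCombiner + " " + withCombiner);
    }
}
```

The sum works here because it is associative and commutative; an operation without those properties would not be safe to combine in this way.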

Re: on the semantics of the combiner

2012-01-10 Thread Jakob Homan
> Combiners can only modify the messages sent to a single vertex, so they can't > send messages to other vertices. Yeah, the more I've thought about this, the more problematic it would be. These new messages may be generated upon arrival at the destination vertex (since combiners can be run on th

Re: on the semantics of the combiner

2012-01-10 Thread Claudio Martella
I believe the argument of not letting users shoot themselves in the foot doesn't hold :) Once you give them any API they have the power to do anything wrong, as they already can with Giraph (or anything else for that matter), by designing an algorithm wrongly (which would be what it would turn out to be a

Re: on the semantics of the combiner

2012-01-10 Thread Jakob Homan
A composite object would essentially be a wrapper around a list and introduce the need for all vertices to be ready to extract that list at all times. For instance, a combiner passed 10 messages may be able to combine 7 of them but do nothing with the other three, leaving four messages. If we all
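The 10-messages example can be sketched in Java as follows. This is only an illustration under assumptions: the `Msg` type and its `combinable` flag are hypothetical and not part of Giraph. The combiner merges the 7 messages it understands into one and passes the other 3 through, so a list of 10 becomes a list of 4 without any wrapper object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message type: only messages flagged as combinable can be merged.
final class Msg {
    final boolean combinable;
    final int value;

    Msg(boolean combinable, int value) {
        this.combinable = combinable;
        this.value = value;
    }
}

class PartialCombinerSketch {
    // Merges all combinable messages into one sum; passes the rest through unchanged.
    static List<Msg> combine(List<Msg> messages) {
        List<Msg> out = new ArrayList<>();
        int sum = 0;
        boolean sawCombinable = false;
        for (Msg m : messages) {
            if (m.combinable) {
                sum += m.value;
                sawCombinable = true;
            } else {
                out.add(m);
            }
        }
        if (sawCombinable) {
            out.add(new Msg(true, sum));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> in = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            in.add(new Msg(true, 1));   // 7 messages the combiner can merge
        }
        for (int i = 0; i < 3; i++) {
            in.add(new Msg(false, i));  // 3 messages it leaves alone
        }
        System.out.println(in.size() + " -> " + combine(in).size()); // prints "10 -> 4"
    }
}
```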

Re: on the semantics of the combiner

2012-01-10 Thread Sebastian Schelter
I think we should make the combiner return a list/iterable that can potentially be empty. However, we should assume that the number of elements returned is smaller than or equal to the number of input elements (what's the use of a combiner if this doesn't hold?). I also concur that the code should no
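One possible way to picture this proposal, as a hedged Java sketch rather than anything from Giraph itself: `combineBounded` and the three sample combiners below are hypothetical names. The wrapper accepts a combiner's result, including an empty one, only when it is no larger than the input; falling back to the original messages otherwise is an assumption of this sketch, not something decided in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

class BoundedCombinerSketch {
    // Runs a combiner and keeps its result only if it did not grow the message list.
    // Falling back to the originals is an assumption of this sketch, not a Giraph rule.
    static <M> List<M> combineBounded(Function<List<M>, List<M>> combiner, List<M> messages) {
        List<M> combined = combiner.apply(messages);
        return combined.size() <= messages.size() ? combined : messages;
    }

    // Keeps only the maximum: n messages become at most 1.
    static List<Integer> keepMax(List<Integer> ms) {
        if (ms.isEmpty()) {
            return Collections.emptyList();
        }
        return Collections.singletonList(Collections.max(ms));
    }

    // Returns nothing; only sensible if the vertex would ignore these messages anyway.
    static List<Integer> dropAll(List<Integer> ms) {
        return Collections.emptyList();
    }

    // Misbehaving combiner that duplicates every message (k > n).
    static List<Integer> expand(List<Integer> ms) {
        List<Integer> out = new ArrayList<>(ms);
        out.addAll(ms);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> in = Arrays.asList(4, 9, 2);
        System.out.println(combineBounded(BoundedCombinerSketch::keepMax, in)); // [9]
        System.out.println(combineBounded(BoundedCombinerSketch::dropAll, in)); // []
        System.out.println(combineBounded(BoundedCombinerSketch::expand, in));  // [4, 9, 2]
    }
}
```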

Re: some problems with testing

2012-01-10 Thread Claudio Martella
ok, please ignore this last email, i found the right log on the other tasks... :) On Tue, Jan 10, 2012 at 8:10 PM, Claudio Martella wrote: > Hello, > > I'm having some issues with debugging of GIRAPH-45. Code passes local > tests but currently fails > >  testBspCheckpoint(org.apache.giraph.TestMa

on the thread-safety of graph.partition

2012-01-10 Thread Claudio Martella
Hi, as my tests all fail around the same code, I'm starting to think there must be a problem there :) though the code is quite simple and the failures happen for different reasons (one is even a divide by zero). Basically, I've refactored BasicRPCCommunications.putMsg*(), putVertexIdMessagesList(