To clarify, I was not discussing the possibility for combine to return null. I see why it would be useful, given that combine returns M, there's no other way to let combiner ask not to send any message, although i agree with Jakob, I also believe returning null should be avoided but only used, roughly, as an init value for a reference/pointer. Perhaps, we could, but i'm just thinking out loud here, let combine() return Iterable<M>, basicallly letting it define what to combine to ({0, 1, k } messages). It would be a powerful extension to the model, but maybe it's too much.
As far as the size of the messages parameter, I agree with you that 0 messages gives nothing to combine and it would be somehow awkward, it was more a matter of synching it with the other methods getting the messages parameter. Probably, having a more clear javadoc will do the job here. What do you think? On Mon, Jan 9, 2012 at 8:42 PM, Jakob Homan <jgho...@gmail.com> wrote: > I'm not a big fan of returning null as it adds extra complexity to the > calling code (null checks, or not, since people usually will forget > them). Avery is correct that combiners are application specific. Is > it conceivable that one would want to write a combiner that returned > something for an input of no parameters, ie combining the empty list > doesn't return the empty list? I imagine for most combiners, > combining a single message would result in that message. > > On Mon, Jan 9, 2012 at 11:28 AM, Avery Ching <ach...@apache.org> wrote: >> The javadoc for VertexCombiner#combine() is >> >> /** >> * Combines message values for a particular vertex index. >> * >> * @param vertexIndex Index of the vertex getting these messages >> * @param msgList List of the messages to be combined >> * @return Message that is combined from {@link MsgList} or null if no >> * message it to be sent >> * @throws IOException >> */ >> >> I think we are somewhat vague on what a combiner can return to support >> various use cases. A combiner should be particular to a particular >> compute() algorithm. I think it should be legal to return null from a >> combiner, in that case, no message should be sent to that vertex. >> >> It seems like it would be an overhead to call a combiner when there are 0 >> messages. I can't see a case where that would be useful. Perhaps we should >> change the javadoc to insure that msgList must contain at least one message >> to have combine() being called. >> >> Avery >> >> >> On 1/9/12 5:37 AM, Claudio Martella wrote: >>> >>> Hi Sebastian, >>> >>> yes, that was my point, I agree completely with you. >>> Fixing my test was not the issue, my question was whether we want to >>> define explicitly the semantics of this scenario. >>> Personally, I believe the combiner should be ready to receive 0 >>> messages, as it's the case of BasicVertex::initialize(), putMessages() >>> and compute(), and act accordingly. >>> >>> In the particular example, I believe the SimpleSumCombiner is bugged. >>> It's true that the sum of no values is 0, but it's also true that the >>> null return semantics of combine() is more suitable for this exact >>> situation. >>> >>> >>> On Mon, Jan 9, 2012 at 2:21 PM, Sebastian Schelter<s...@apache.org> wrote: >>>> >>>> I think we currently implicitly assume that there is at least one >>>> element in the Iterable passed to the combiner. The messaging code only >>>> invokes the combiner only if at least one message for the target vertex >>>> has been sent. >>>> >>>> However, we should not rely on implicit implementation details but >>>> explicitly specify the semantics of combiners. >>>> >>>> --sebastian >>>> >>>> On 09.01.2012 13:29, Claudio Martella wrote: >>>>> >>>>> Hello list, >>>>> >>>>> for GIRAPH-45 I'm touching the incoming messages and hit an >>>>> interesting problem with the combiner semantics. >>>>> currently, my code fails testBspCombiner for the following reason: >>>>> >>>>> SimpleSumCombiner::compute() returns a value even if there are no >>>>> messages in the iterator (in this case it returns 0) and for this >>>>> reason the vertices get activated at each superstep. >>>>> >>>>> At each superstep, under-the-hood, I pass the combiner for each vertex >>>>> an Iterable, which can be empty: >>>>> >>>>> public Iterable<M> getMessages(I vertexId) { >>>>> Iterable<M> messages = inMessages.getMessages(vertexId); >>>>> if (combiner != null) { >>>>> M combinedMsg; >>>>> try { >>>>> combinedMsg = combiner.combine(vertexId, >>>>> messages); >>>>> } catch (IOException e) { >>>>> throw new RuntimeException("could not combine", >>>>> e); >>>>> } >>>>> if (combinedMsg != null) { >>>>> List<M> tmp = new ArrayList<M>(1); >>>>> tmp.add(combinedMsg); >>>>> messages = tmp; >>>>> } else { >>>>> messages = new ArrayList<M>(0); >>>>> } >>>>> } >>>>> return messages; >>>>> } >>>>> >>>>> the Iterable returned by this methods is passed to >>>>> basicVertex.putMessages() right before the compute(). >>>>> Now, the question is: who's wrong? The combiner code that returns a >>>>> sum of 0 over no values, or the framework that calls the combiner with >>>>> 0 messages? >>>>> >>>>> >>>>> >>> >>> >> -- Claudio Martella claudio.marte...@gmail.com