> On Nov. 9, 2012, 1:59 a.m., Alessandro Presta wrote: > > The results look good. How does the no-combiner version fare in these same > > benchmarks? > > > > Frankly, at this point I would just go for one combiner interface (the > > binary one). > > No need to use it in the disk-backed message store (if we have one message > > per vertex, there is no point in spilling to disk). > > > > The only use I can think of for the Iterable->Iterable form is when you're > > doing some sort of top-k computation, but I'm not sure if we would reap any > > benefits when the graph is sparse. > > > > Regarding the interface, I think two sensible options are: > > 1) M combine(M a, M b) > > 2) M combine(Iterable<M> messages) > > > > Why 2)? We may find that when processing a list of incoming messages, > > calling combine() for each of them has an overhead, whereas a loop inside > > combine() would be faster. > > However, if we also use the binary combiner on the sender (which may be a > > good idea if it's fast enough), all the requests will have at most one > > message per vertex, so then 1) would be the way to go (assuming we call the > > combiner each time we process a request). > > Maja Kabiljo wrote: > The way messaging is implemented right now, we don't have a list of > messages per vertex. Please take a look at SendWorkerMessagesRequest. So > there is no iterative combiner calls. > This was discussed before, we get the chance to use the combiner on the > client side very rarely, so benefit of having just lists instead of maps on > the client side is much bigger then the one we could get from combiner. > > Maja Kabiljo wrote: > Forgot to mention, with PageRank no-combiner is slower than > MultiCombiner. With our internal application for some reason no-combiner is > slightly faster than MultiCombiner, but slower than BinaryCombiner.
Oh ok, I forgot that we switched to a list of (id, message) pairs on the client. > On Nov. 9, 2012, 1:59 a.m., Alessandro Presta wrote: > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BinaryCombiner.java, > > line 56 > > <https://reviews.apache.org/r/7975/diff/1/?file=187405#file187405line56> > > > > We could get rid of this method by using the following logic when > > processing messages: > > - if the current message is null, just set it to the incoming one > > - otherwise, combine them > > Maja Kabiljo wrote: > The issue here is that we can't always modify the message which message > store receives, because of local requests and sendMessageToAllEdges which > keeps just one copy of message. Got it. We could make a copy of the first message then. - Alessandro ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7975/#review13268 ----------------------------------------------------------- On Nov. 9, 2012, 1:44 a.m., Maja Kabiljo wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/7975/ > ----------------------------------------------------------- > > (Updated Nov. 9, 2012, 1:44 a.m.) > > > Review request for giraph. > > > Description > ------- > > Current combiner interface is very general, but also doesn't provide the best > performance. All the combiners we currently have are binary combiners, i.e. > they can combine two messages into one. Having a lists around this simple > concept makes it slower and requires more object creations. > Adding BinaryCombiner, and a specialized message store which will be used > with it. This message store has only one message per vertex instead of having > a collection per vertex. > > > This addresses bug GIRAPH-414. > https://issues.apache.org/jira/browse/GIRAPH-414 > > > Diffs > ----- > > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphRunner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/ImmutableClassesGiraphConfiguration.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/CollectionOfMessagesPerVertexStore.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStore.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/OneMessagePerVertexStore.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/SimpleMessageStore.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/DoubleSumBinaryCombiner.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/DoubleSumCombiner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/MinimumDoubleCombiner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/MinimumIntCombiner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSumCombiner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BinaryCombiner.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspUtils.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Combiner.java > PRE-CREATION > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GiraphTypeValidator.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/VertexCombiner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestVertexTypes.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/ConnectionTest.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestFailureTest.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestTest.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/SaslConnectionTest.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/TestMessageStores.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/examples/MinimumIntCombinerTest.java > 1406748 > > http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/utils/MockUtils.java > 1406748 > > Diff: https://reviews.apache.org/r/7975/diff/ > > > Testing > ------- > > mvn verify > > PageRankBenchmark > 20m vertices, 100 edges per vertex, 20 workers > 1 compute thread, superstep time 55s->45s > 6 compute threads, superstep time 28s->15s > 12 compute threads, 1 netty server thread, superstep time 185s->112s > > Our internal application > Similar speedup as Page Rank > > > Thanks, > > Maja Kabiljo > >
