[
https://issues.apache.org/jira/browse/GIRAPH-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486468#comment-13486468
]
Eli Reisman commented on GIRAPH-388:
------------------------------------
I agree completely with Claudio on the client-side combiner; the MR comparison
frames the difference perfectly. And I agree about your great work: I love seeing
simplifications like this. It makes sense why it works so well.
My concern with the social graph is that it's lumpy, and I saw all sorts of
performance degradation and behavioral quirks running jobs on a social
graph that I never saw with the benchmarks. The kind of duplication
problems I'm talking about might happen when, for example, a supernode belongs
to the receiving partition for some messages. Either way, I'm glad the results
were positive. Nice job again!
The flushing issue bothered me while I was working on 328 and 322 as well. It
needs to happen for the in-memory use cases I'm most familiar with, but it works
at cross-purposes to deduplicating combinable messages. I'd love to see Giraph
get smarter about tuning the data-per-flush parameters in general. These are
tricky to tune per job and have a large effect on performance. Users I know have
lost hope trying to tune these sorts of params by hand.
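To make the flush-versus-combine tension concrete, here is a minimal Java
sketch. It is not Giraph's SendMessageCache or any real Giraph API: the
CombiningBuffer class, the flushThreshold parameter, and the simulate helper are
all hypothetical, and a simple sum stands in for the combiner. It only shows
that the more often the buffer flushes, the fewer messages for the same target
vertex meet in the buffer and get combined.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not Giraph code. */
public class FlushVsCombine {

    /** Buffers outgoing messages per target vertex and sums them (a stand-in combiner). */
    static final class CombiningBuffer {
        private final int flushThreshold;  // assumed knob: messages accepted between flushes
        private final Map<Long, Double> pending = new HashMap<>();
        private int acceptedSinceFlush = 0;
        private int messagesSent = 0;      // combined messages that would go over the wire

        CombiningBuffer(int flushThreshold) {
            this.flushThreshold = flushThreshold;
        }

        void send(long targetVertex, double value) {
            // Combine with whatever is already buffered for this target.
            pending.merge(targetVertex, value, Double::sum);
            if (++acceptedSinceFlush >= flushThreshold) {
                flush();
            }
        }

        void flush() {
            // One wire message per distinct target currently buffered.
            messagesSent += pending.size();
            pending.clear();
            acceptedSinceFlush = 0;
        }

        int messagesSent() {
            return messagesSent;
        }
    }

    /** Sends count messages round-robin over targets vertices; returns wire messages. */
    static int simulate(int flushThreshold, int count, int targets) {
        CombiningBuffer buffer = new CombiningBuffer(flushThreshold);
        for (int i = 0; i < count; i++) {
            buffer.send(i % targets, 1.0);
        }
        buffer.flush();  // send whatever is left
        return buffer.messagesSent();
    }

    public static void main(String[] args) {
        // Same 1000 messages to 10 targets, with increasingly eager flushing.
        System.out.println(simulate(1000, 1000, 10));  // 10
        System.out.println(simulate(100, 1000, 10));   // 100
        System.out.println(simulate(10, 1000, 10));    // 1000
    }
}
```

With a threshold of 10 the buffer flushes before any target repeats, so the
combiner never fires. A lumpy social graph would not be this regular, which is
part of why a hand-picked threshold is hard to get right.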
> Improve the way we keep outgoing messages
> -----------------------------------------
>
> Key: GIRAPH-388
> URL: https://issues.apache.org/jira/browse/GIRAPH-388
> Project: Giraph
> Issue Type: Improvement
> Reporter: Maja Kabiljo
> Assignee: Maja Kabiljo
> Attachments: GIRAPH-388.patch
>
>
> As per the discussion on GIRAPH-357, in a standard application the chances
> that we get to use the client-side combiner are very low. I experimented with
> the benefits we can get from not having the client-side combiner at all. It
> turns out that having a lot of maps in SendMessageCache, and then a collection
> inside each of them, really hurts performance.