[ 
https://issues.apache.org/jira/browse/GIRAPH-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486468#comment-13486468
 ] 

Eli Reisman commented on GIRAPH-388:
------------------------------------

I agree completely with Claudio on the client-side combiner; the MR comparison 
frames the difference perfectly. And I agree on your great work; I love seeing 
simplifications like this. It makes sense why it works so well.

My concern with the social graph is that it's lumpy, and I saw all sorts of 
performance degradation and general behavior quirks running jobs with a social 
graph that I just never saw with the benchmarks. The kind of duplication 
problems I'm talking about might happen when a supernode belongs to the 
receiving partition for some messages, etc. Either way, I'm glad the results 
were positive, nice job again!

The flushing issue bothered me while I was working on 328 and 322 as well. It 
needs to happen for the in-memory use cases I'm most familiar with, but it works 
at cross-purposes to deduplicating combinable messages. I'd love to see Giraph 
get smarter about tuning the data-per-flush parameters in general; these are 
tricky to tune per-job and have a large effect on performance. Users I know have 
lost hope trying to tune these sorts of params by hand.
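
To make the flush-versus-combine tension concrete, here is a minimal Java 
sketch. It is not Giraph's SendMessageCache: the class FlushSketch, the method 
wireMessages, and the flushEvery parameter are hypothetical names, and the 
combiner is assumed to be a plain sum. It only counts how many combined 
messages would leave the sender for a given flush interval.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not Giraph code. */
class FlushSketch {

    /**
     * Buffers one message per entry in destinations, combining messages
     * that share a destination vertex id (assumed sum combiner), and
     * flushes the buffer after every flushEvery buffered messages.
     * Returns the number of combined messages sent over the wire.
     */
    static int wireMessages(long[] destinations, int flushEvery) {
        Map<Long, Long> buffer = new HashMap<>();
        int buffered = 0;
        int sent = 0;
        for (long dest : destinations) {
            buffer.merge(dest, 1L, Long::sum); // combine per destination
            buffered++;
            if (buffered >= flushEvery) {
                sent += buffer.size(); // one wire message per distinct destination
                buffer.clear();
                buffered = 0;
            }
        }
        sent += buffer.size(); // final flush of whatever is left
        return sent;
    }

    public static void main(String[] args) {
        // Vertex 7 plays the role of a supernode receiving most messages.
        long[] dests = {7, 7, 7, 7, 3, 7, 7, 7};
        System.out.println(wireMessages(dests, 2)); // prints 5
        System.out.println(wireMessages(dests, 8)); // prints 2
    }
}
```

With the same eight messages, flushing every 2 sends 5 combined messages while 
flushing every 8 sends only 2, which is why a small data-per-flush setting 
undercuts the combiner when a supernode is on the receiving side.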

                
> Improve the way we keep outgoing messages
> -----------------------------------------
>
>                 Key: GIRAPH-388
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-388
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-388.patch
>
>
> As per the discussion on GIRAPH-357, in a standard application the chances 
> that we get to use the client-side combiner are very low. I experimented with 
> the benefits we can get from not having the client-side combiner at all. It 
> turns out that having a lot of maps in SendMessageCache, and then a collection 
> inside each of them, really hurts performance. 

