[ https://issues.apache.org/jira/browse/GIRAPH-273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440675#comment-13440675 ]
Maja Kabiljo commented on GIRAPH-273:
-------------------------------------

We actually ended up with something better than an aggregation tree. Say we have A aggregators and W workers. With the tree approach, the whole aggregation would take about:

    A * (aggregation_time + transfer_time) * log W

What we can do instead is perform the aggregations in a completely distributed way. Each aggregator would have a worker which owns it and which does the aggregation for it, so we would end up with about:

    A * (aggregation_time + transfer_time)

After performing the aggregations, all workers would send the final values to the master, and after master.compute the aggregators would go back the same way. For applications without master compute, we can even skip sending aggregated values to the master altogether.

Is having all the workers connect to the master an issue? The master will have the same number of connections as any other worker has, and in this approach we just send a smaller amount of data through each of the connections, instead of having that same amount sent through just two.

> Aggregators shouldn't use Zookeeper
> -----------------------------------
>
>                 Key: GIRAPH-273
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-273
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>
> We use Zookeeper znodes to transfer aggregated values from workers to master
> and back. Zookeeper is supposed to be used for coordination, and it also has
> a memory limit which prevents users from having aggregators with large value
> objects. These are the reasons why we should implement aggregators gathering
> and distribution in a different way.
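
The owner-based scheme proposed in the comment above can be sketched roughly as follows. This is a toy, single-process Python model and not Giraph code: the function names (`owner_of`, `aggregate_distributed`), the CRC32-based owner assignment, the aggregator names, and the use of summation as the aggregation are all assumptions made purely for illustration.

```python
import zlib
from collections import defaultdict


def owner_of(aggregator_name, num_workers):
    """Pick the worker that owns an aggregator (hypothetical assignment rule)."""
    # CRC32 is used only because it is deterministic across processes.
    return zlib.crc32(aggregator_name.encode("utf-8")) % num_workers


def aggregate_distributed(partials, num_workers):
    """partials[w] maps aggregator name -> worker w's partial value.

    Returns the final value per aggregator, as the master would receive it.
    """
    # Step 1: every worker sends each partial value to that aggregator's owner.
    inbox = defaultdict(lambda: defaultdict(list))
    for worker_id in range(num_workers):
        for name, value in partials[worker_id].items():
            inbox[owner_of(name, num_workers)][name].append(value)

    # Step 2: each owner reduces only the aggregators it owns (sum is assumed).
    # Step 3: owners send the final values on to the master.
    master_view = {}
    for owned in inbox.values():
        for name, values in owned.items():
            master_view[name] = sum(values)
    return master_view


# Three workers, each holding a partial value for two aggregators.
partials = [
    {"edge_count": 3, "vertex_count": 1},
    {"edge_count": 5, "vertex_count": 4},
    {"edge_count": 2, "vertex_count": 0},
]
final_values = aggregate_distributed(partials, num_workers=3)
print(final_values["edge_count"], final_values["vertex_count"])
```

In this model each aggregator is reduced in exactly one place, so no value has to travel through log W tree levels, which is the intuition behind the second cost estimate in the comment.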