[ https://issues.apache.org/jira/browse/SINGA-8?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601208#comment-14601208 ]
ASF subversion and git services commented on SINGA-8:
-----------------------------------------------------
Commit 51d4c2aec4f9ddedeff588a916b25a0041dd8a88 in incubator-singa's branch
refs/heads/master from wang wei
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=51d4c2a ]
SINGA-8 Implement distributed Hogwild
Replaced hard-coded endpoints with RegistPocs() and GetProcHost(),
implemented with the help of ZooKeeper.
TODO: slice large Param objects in a separate branch.
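As a rough illustration of the change described in the commit message, the sketch below swaps a hard-coded endpoint table for a register/lookup pair. It is not SINGA code: the Registry class, its method names, and the in-memory dict standing in for ZooKeeper are all assumptions made for this example.

```python
class Registry:
    """In-memory stand-in for a coordination service such as ZooKeeper (assumption)."""

    def __init__(self):
        self._hosts = {}  # process id -> "host:port"

    def register_proc(self, proc_id, endpoint):
        # Hypothetical analogue of a process announcing its own endpoint at startup.
        if proc_id in self._hosts:
            raise ValueError(f"process {proc_id} is already registered")
        self._hosts[proc_id] = endpoint

    def get_proc_host(self, proc_id):
        # Hypothetical analogue of looking up a peer instead of reading a fixed table.
        try:
            return self._hosts[proc_id]
        except KeyError:
            raise LookupError(f"process {proc_id} has not registered") from None


registry = Registry()
registry.register_proc(0, "10.0.0.1:5000")
registry.register_proc(1, "10.0.0.2:5000")
peer = registry.get_proc_host(1)
```

With a real coordination service the dict would be replaced by nodes in the service's namespace, so processes on different machines see the same table.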
> Implement distributed Hogwild
> -----------------------------
>
> Key: SINGA-8
> URL: https://issues.apache.org/jira/browse/SINGA-8
> Project: Singa
> Issue Type: New Feature
> Reporter: wangwei
> Assignee: wangwei
> Labels: distributed, features, hogwild
>
> Generally, both the Downpour framework of Google Brain [1] and Caffe's
> distributed Hogwild implementation are extensions of shared-memory
> Hogwild training. In this ticket, we refer to the second one.
> Specifically, each server group masters a subset of parameters (i.e., Param
> objects) when synchronizing with other server groups. It aggregates all
> updates for its subset and sends (e.g., broadcasts) the updated
> parameters back to all other server groups. The synchronization is conducted
> asynchronously. The frequency can be fixed in the first implementation.
> Eventually, it should be tuned automatically to fully utilize the network
> bandwidth.
> [1] J. Dean, G. Corrado, R. Monga, K. Chen, M. Devin, Q. V. Le, M. Z. Mao, M.
> Ranzato, A. W. Senior, P. A. Tucker, K. Yang, and A. Y. Ng. Large scale
> distributed deep networks. In NIPS, pages 1232-1240, 2012.
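
The synchronization scheme described in the ticket can be sketched roughly as follows. This is a single-process toy under stated assumptions, not the SINGA implementation: the ServerGroup class, the round-robin owner_of() partitioning, and the synchronous synchronize() call are all hypothetical, and a real system would run the exchange asynchronously over the network.

```python
class ServerGroup:
    def __init__(self, group_id, params):
        self.group_id = group_id
        self.params = dict(params)               # local replica of every parameter
        self.pending = {k: 0.0 for k in params}  # updates accumulated since last sync

    def local_update(self, name, delta):
        # Hogwild-style local step: apply at once, remember it for the next sync.
        self.params[name] += delta
        self.pending[name] += delta


def owner_of(name, names, num_groups):
    # Assumed partitioning: round-robin over the sorted parameter names.
    return sorted(names).index(name) % num_groups


def synchronize(groups):
    names = list(groups[0].params)
    for name in names:
        master = groups[owner_of(name, names, len(groups))]
        # The master aggregates the updates the other groups made to its parameter.
        # Its own updates are already reflected in its local value.
        for g in groups:
            if g is not master:
                master.params[name] += g.pending[name]
            g.pending[name] = 0.0
        # The master then broadcasts the aggregated value to the other groups.
        for g in groups:
            if g is not master:
                g.params[name] = master.params[name]


groups = [ServerGroup(i, {"w": 1.0, "b": 0.0}) for i in range(2)]
groups[0].local_update("w", 0.5)
groups[1].local_update("w", -0.25)
groups[1].local_update("b", 1.0)
synchronize(groups)
```

After the call, both groups hold the same value for each parameter: the initial value plus the sum of every group's updates. How often synchronize() runs corresponds to the synchronization frequency that the ticket proposes to fix at first and tune later.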