wangwei created SINGA-57:
----------------------------
Summary: Improve Distributed Hogwild
Key: SINGA-57
URL: https://issues.apache.org/jira/browse/SINGA-57
Project: Singa
Issue Type: Improvement
Reporter: wangwei
The implementation of distributed Hogwild in SINGA-8 uses the stub thread to
monitor the network bandwidth. When the network has available (>0) bandwidth,
the stub sends a sync reminder message to a server, which triggers the server
to sync one parameter slice with the other server groups.
The code is messy because it has to monitor the network bandwidth and process
the sync reminder message. Another problem is that the reminder message may not
be generated frequently enough, because it is generated only when the router
times out. If the workers and servers run so fast that the router rarely times
out, the sync reminder message is rarely sent.
This ticket improves the implementation by using a fixed frequency of
synchronization between server groups. For each parameter slice it masters, a
server sends a sync message every sync_freq updates.
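
The fixed-frequency rule can be sketched as follows. This is only a minimal
illustration in Python, not the SINGA code: the SliceServer class, its
handle_update method, and the send_sync callback are hypothetical names
introduced here; only sync_freq comes from the ticket. The sketch assumes a
server counts the updates applied to a slice it masters and emits one sync
message each time the count reaches a multiple of sync_freq.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SliceServer:
    """Hypothetical server that masters a single parameter slice."""
    slice_id: int
    sync_freq: int                         # updates between two sync messages
    send_sync: Callable[[int, int], None]  # called with (slice_id, n_updates)
    n_updates: int = 0

    def __post_init__(self) -> None:
        if self.sync_freq <= 0:
            raise ValueError("sync_freq must be a positive integer")

    def handle_update(self) -> bool:
        """Count one update; return True if a sync message was sent."""
        self.n_updates += 1
        # Sync is driven by the update count alone, not by router timeouts.
        if self.n_updates % self.sync_freq == 0:
            self.send_sync(self.slice_id, self.n_updates)
            return True
        return False


def run_demo(num_updates: int, sync_freq: int) -> List[int]:
    """Return the update counts at which a sync message was sent."""
    sent: List[int] = []
    server = SliceServer(slice_id=0, sync_freq=sync_freq,
                         send_sync=lambda sid, n: sent.append(n))
    for _ in range(num_updates):
        server.handle_update()
    return sent
```

With this rule the number of sync messages depends only on how many updates
the server has processed, so a fast worker and server pair can no longer
starve synchronization the way the timeout-driven reminder could.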