wangwei created SINGA-40:
----------------------------
Summary: Support sparse Param update
Key: SINGA-40
URL: https://issues.apache.org/jira/browse/SINGA-40
Project: Singa
Issue Type: New Feature
Reporter: wangwei
For some models, e.g., [Word2Vec|https://code.google.com/p/word2vec/] and
[RNNLM|http://www.fit.vutbr.cz/~imikolov/rnnlm/], only a subset of the
parameters is updated in each iteration. For example, Word2Vec only updates the
rows (or columns) of the weight matrix corresponding to words that appear in the
sentence currently being processed.
Currently, when the worker calls the Update() function for a Param object, all
its gradients are sent to the server and all its values are updated. When
applied to the Word2Vec model, this would cause a big overhead because most
parameters in the Param object have no gradients (or have gradient 0), and
hence should not be updated.
To support sparse updates, we can create a SparseParam class which subclasses
the Param class. It provides APIs for users to set the updating area (e.g., a
range given by offset and length, or a set of columns or rows) after computing
the gradients in the Layer's ComputeGradient function. The Worker's Update
function is not changed, but SparseParam will override the message generating
functions to send only the gradients of the updating area.
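
A minimal sketch of this idea, for discussion only. The classes below are
simplified stand-ins, not SINGA's actual Param interface: GradMsg,
mutable_grad(), SetUpdateRange() and GenUpdateMsg() are hypothetical names, and
only the offset/length form of the updating area is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for SINGA's classes; all member names
// below are assumptions made for illustration only.
struct GradMsg {
  std::size_t offset;       // index of the first gradient value in the message
  std::vector<float> grad;  // gradients for [offset, offset + grad.size())
};

class Param {
 public:
  explicit Param(std::size_t size) : data_(size, 0.0f), grad_(size, 0.0f) {}
  virtual ~Param() = default;

  std::size_t size() const { return data_.size(); }
  std::vector<float>& mutable_grad() { return grad_; }

  // Dense behaviour: the message carries every gradient value.
  virtual GradMsg GenUpdateMsg() const { return GradMsg{0, grad_}; }

 protected:
  std::vector<float> data_;
  std::vector<float> grad_;
};

class SparseParam : public Param {
 public:
  explicit SparseParam(std::size_t size)
      : Param(size), offset_(0), length_(size) {}

  // Called by the layer after computing gradients to mark the touched range.
  void SetUpdateRange(std::size_t offset, std::size_t length) {
    assert(offset + length <= size());
    offset_ = offset;
    length_ = length;
  }

  // Sparse behaviour: only gradients inside the updating area are sent.
  GradMsg GenUpdateMsg() const override {
    return GradMsg{offset_,
                   std::vector<float>(grad_.begin() + offset_,
                                      grad_.begin() + offset_ + length_)};
  }

 private:
  std::size_t offset_;
  std::size_t length_;
};
```

Because GenUpdateMsg() is virtual, code that holds a Param reference (the
Worker's Update function in this sketch) does not need to change; it simply
gets a shorter message when the object is a SparseParam.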
Due to parameter sharing, the Stub may also need updates to consider this case
when doing local aggregation. The Updater should also work only on the update
area.
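
One possible shape for the aggregation and update steps, again only a sketch
under assumptions: SparseGrad, Aggregate() and SgdUpdate() are hypothetical
names, the updating area is a single contiguous range, and plain SGD stands in
for whatever rule the Updater actually applies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message: gradients for the half-open range
// [offset, offset + grad.size()) of a shared parameter.
struct SparseGrad {
  std::size_t offset;
  std::vector<float> grad;
};

// Local aggregation (the Stub's role in this sketch): sum the messages from
// workers sharing the parameter into one message covering the union of their
// ranges. Positions that no worker touched stay at zero.
SparseGrad Aggregate(const std::vector<SparseGrad>& msgs) {
  assert(!msgs.empty());
  std::size_t begin = msgs[0].offset;
  std::size_t end = msgs[0].offset + msgs[0].grad.size();
  for (const SparseGrad& m : msgs) {
    begin = std::min(begin, m.offset);
    end = std::max(end, m.offset + m.grad.size());
  }
  SparseGrad sum{begin, std::vector<float>(end - begin, 0.0f)};
  for (const SparseGrad& m : msgs) {
    for (std::size_t i = 0; i < m.grad.size(); ++i) {
      sum.grad[m.offset - begin + i] += m.grad[i];
    }
  }
  return sum;
}

// Plain SGD restricted to the update area; values outside it are untouched.
void SgdUpdate(std::vector<float>& values, const SparseGrad& g, float lr) {
  assert(g.offset + g.grad.size() <= values.size());
  for (std::size_t i = 0; i < g.grad.size(); ++i) {
    values[g.offset + i] -= lr * g.grad[i];
  }
}
```

With this layout the cost of both steps depends on the size of the update area
rather than on the size of the whole Param. If workers touch ranges that are
far apart, a list of ranges or row indices would avoid the zero-filled gap.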
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)