mboehm7 commented on pull request #1336:
URL: https://github.com/apache/systemds/pull/1336#issuecomment-886244883
Unfortunately, there are too many problems with this PR (I spent two hours
fixing the remaining issues but there is no end in sight). Please split this PR
into one new PR only for model averaging (and later, once the model averaging
is really working and cleaned up, one PR for nbatches because otherwise all
fixes have to be applied in even more locations). In this process, please
revert all changes of existing tests and formatting changes, only add targeted
new tests for your new features, and try to limit code changes to what is
really needed for integrating model averaging as an optional feature.
The best path forward is to keep this PR open, and cherry pick code changes
into new PRs and fix the issues. Note that Statement.PS_MODELS is unnecessary
and should be removed, the gradient accumulation should only be done if no
model averaging is used, the division of model averaging did not take affect
(because it was not applied in place and new blocks got discarded in the lambda
function), computeBatch always pushed the gradients, the federated control
thread (per worker) always pushed the gradients. Also try to keep the core code
path as simple as possible: e.g., you can do something like the following:
```
if (localUpdate | _modelAvg )
params = updateModel(params, gradients, i, j, batchIter);
...
// Push the gradients or model to ps
pushGradients(_modelAvg ? params : accGradients.get());
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]