[ https://issues.apache.org/jira/browse/SYSTEMML-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16499562#comment-16499562 ]
Matthias Boehm commented on SYSTEMML-2359:
------------------------------------------
I'm not sure I understand your question correctly. The slicing of batches
from the local worker's data partition should stay unchanged. In contrast to
per-batch updates, with per-epoch updates we would apply each batch's gradients
only to the worker's local model (without synchronizing/communicating with the
parameter server) and synchronize with the parameter server only once per
epoch. For that, it might be good to abstract the aggregation service a bit so
that it is accessible from both the workers and the parameter server. Since we
don't need to keep the gradients for all batches, the memory requirements
should be the same for per-batch and per-epoch updates, but per-epoch updates
require less synchronization and aggregation overhead, which will be important
especially in distributed settings.
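
To make the difference concrete, here is a minimal single-worker sketch in
Python of one possible reading of the above. It is not SystemML code: the names
aggregate, ParamServer and run_epoch are hypothetical, plain SGD stands in for
whatever the real aggregation service does, and pushing a running sum of the
gradients at the end of the epoch is an assumption about what gets sent.

```python
# Hypothetical sketch only; none of these names come from SystemML.
from typing import Callable, List

Vector = List[float]


def aggregate(model: Vector, grad: Vector, lr: float) -> Vector:
    """Shared aggregation step (plain SGD assumed), usable by worker and server."""
    return [w - lr * g for w, g in zip(model, grad)]


class ParamServer:
    """Toy in-process stand-in for the parameter server."""

    def __init__(self, model: Vector) -> None:
        self.model = list(model)
        self.pushes = 0  # number of synchronizations received

    def pull(self) -> Vector:
        return list(self.model)

    def push(self, grad: Vector, lr: float) -> None:
        self.model = aggregate(self.model, grad, lr)
        self.pushes += 1


def run_epoch(server: ParamServer, data: list, batch_size: int,
              grad_fn: Callable[[Vector, list], Vector],
              lr: float, per_epoch: bool) -> None:
    local = server.pull()
    acc = [0.0] * len(local)  # running sum, so memory stays constant
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]  # slicing is identical in both modes
        grad = grad_fn(local, batch)
        if per_epoch:
            local = aggregate(local, grad, lr)  # local-only update, no communication
            acc = [a + g for a, g in zip(acc, grad)]
        else:
            server.push(grad, lr)  # synchronize after every batch
            local = server.pull()
    if per_epoch:
        server.push(acc, lr)  # single synchronization for the whole epoch
```

With three batches per epoch, the per-batch mode calls push three times while
the per-epoch mode calls it once, and in both modes the worker holds only one
gradient-sized buffer besides its local model.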
> Extend update per EPOCH
> -----------------------
>
> Key: SYSTEMML-2359
> URL: https://issues.apache.org/jira/browse/SYSTEMML-2359
> Project: SystemML
> Issue Type: Sub-task
> Reporter: LI Guobao
> Assignee: LI Guobao
> Priority: Major
>