[ 
https://issues.apache.org/jira/browse/SYSTEMML-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LI Guobao updated SYSTEMML-2299:
--------------------------------
    Description: 
The objective of the “paramserv” built-in function is to update an initial or 
existing model according to a given configuration. An initial function 
signature would be: 
{code:java}
model'=paramserv(model, X, y, X_val, y_val, upd="fun1", agg="fun2", mode="BSP", 
freq="EPOCH", epochs=100, batchsize=64, k=7, scheme="disjoint_contiguous", 
hyperparam=params, checkpoint="NONE"){code}
The caller provides the model (a struct-like data structure consisting of the 
weights, the biases and the hyperparameters), the training features and labels, 
the validation features and labels, the batch update function (i.e., the 
gradient calculation function), the update strategy (e.g. sync, async, 
hogwild!, stale-synchronous), the update frequency (e.g. epoch or mini-batch), 
the gradient aggregation function, the number of epochs, the batch size, the 
degree of parallelism, the data partition scheme, a list of additional 
hyperparameters, and the checkpointing strategy. The function returns the 
trained model in struct format.
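
To make the interplay of upd, agg, mode and freq more concrete, here is a minimal single-process sketch in Python of what a BSP run with per-batch updates might look like. It is not SystemML code: the names paramserv_bsp, gradient and aggregate, the dict-based model, and the "lr" hyperparameter are all hypothetical, and the workers are simulated sequentially rather than run in parallel.

```python
# Hypothetical, single-process simulation of BSP updates; not SystemML code.

def gradient(model, xs, ys):
    """Stand-in for the `upd` function: gradient of the mean squared error
    for the one-parameter model y = w * x."""
    w = model["w"]
    n = len(xs)
    return {"w": sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n}


def aggregate(model, grads, hyperparams):
    """Stand-in for the `agg` function: average the workers' gradients
    and take one gradient-descent step."""
    lr = hyperparams["lr"]
    mean_grad = sum(g["w"] for g in grads) / len(grads)
    return {"w": model["w"] - lr * mean_grad}


def paramserv_bsp(model, partitions, upd, agg, epochs, batchsize, hyperparams):
    """Run `epochs` passes; after every batch, collect the gradients of all
    workers (the BSP barrier) and aggregate them into a new model."""
    for _ in range(epochs):
        n_batches = min(len(xs) for xs, _ in partitions) // batchsize
        for b in range(n_batches):
            lo, hi = b * batchsize, (b + 1) * batchsize
            # Each "worker" computes a gradient on its own batch against
            # the same model snapshot.
            grads = [upd(model, xs[lo:hi], ys[lo:hi]) for xs, ys in partitions]
            model = agg(model, grads, hyperparams)
    return model


# Toy data generated from y = 3x, split across k = 2 workers.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]
partitions = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]
trained = paramserv_bsp({"w": 0.0}, partitions, gradient, aggregate,
                        epochs=50, batchsize=2, hyperparams={"lr": 0.01})
```

In this sketch the learned weight approaches 3. An ASP or SSP variant would relax the barrier so that a worker's gradient can be applied without waiting for the others.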

*Inputs*:
 * model <list>: a list consisting of the weight and bias matrices
 * X <matrix>: training features matrix
 * y <matrix>: training label matrix
 * X_val <matrix>: validation features matrix
 * y_val <matrix>: validation label matrix
 * upd <string>: the name of the gradient calculation function
 * agg <string>: the name of the gradient aggregation function
 * mode <string> (options: BSP, ASP, SSP): the updating mode
 * freq <string> (options: EPOCH, BATCH): the frequency of updates
 * epochs <integer>: the number of epochs
 * batchsize <integer>: the batch size
 * k <integer>: the degree of parallelism
 * scheme <string> (options: disjoint_contiguous, disjoint_round_robin, 
disjoint_random, overlap_reshuffle): the data partition scheme, i.e., how 
the data is distributed across workers
 * hyperparam <list>: a list consisting of the additional hyperparameters, 
e.g., learning rate, momentum
 * checkpoint <string> (options: NONE, EPOCH, EPOCH10): the checkpoint 
strategy; a checkpoint could be taken after every epoch or after every 
10 epochs 
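
The four scheme options are only named above, so the following Python sketch shows one plausible reading of them on row indices. The interpretation of each scheme (in particular that overlap_reshuffle hands every worker all rows in a different order) is an assumption inferred from the option names, and the partition helper and its seed argument are hypothetical.

```python
import random


def partition(rows, k, scheme, seed=0):
    """Return a list of k row lists, one per worker (illustrative only)."""
    n = len(rows)
    rng = random.Random(seed)
    if scheme == "disjoint_contiguous":
        # Worker i gets the i-th consecutive block of rows.
        size = -(-n // k)  # ceil(n / k)
        return [rows[i * size:(i + 1) * size] for i in range(k)]
    if scheme == "disjoint_round_robin":
        # Row j goes to worker j mod k.
        return [rows[i::k] for i in range(k)]
    if scheme == "disjoint_random":
        # Shuffle once, then deal the rows out so that no row is shared.
        shuffled = rows[:]
        rng.shuffle(shuffled)
        return [shuffled[i::k] for i in range(k)]
    if scheme == "overlap_reshuffle":
        # Assumed reading: every worker sees all rows, each in its own order.
        return [rng.sample(rows, n) for _ in range(k)]
    raise ValueError(f"unknown scheme: {scheme}")


rows = list(range(7))
contiguous = partition(rows, 3, "disjoint_contiguous")
round_robin = partition(rows, 3, "disjoint_round_robin")
```

With 7 rows and k=3, the contiguous scheme yields blocks of sizes 3, 3 and 1, while round robin yields sizes 3, 2 and 2. The disjoint schemes keep each row on exactly one worker, whereas the overlapping scheme trades memory for letting every worker see the full data.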

  was:
The objective of “paramserv” built-in function is to update an initial or 
existing model with configuration. An initial function signature would be: 

 
{code:java}
model'=paramserv(model, X, y, X_val, y_val, upd=fun1, agg=fun2, mode=BSP, 
freq=EPOCH, epochs=100, batchsize=64, k=7, scheme=disjoint_contiguous, 
hyperparam=params, checkpoint=NONE){code}
 

We are interested in providing the model (which will be a struct-like data 
structure consisting of the weights, the biases and the hyperparameters), the 
training features and labels, the validation features and labels, the batch 
update function (i.e., gradient calculation func), the update strategy (e.g. 
sync, async, hogwild!, stale-synchronous), the update frequency (e.g. epoch or 
mini-batch), the gradient aggregation function, the number of epoch, the batch 
size, the degree of parallelism as well as the checkpointing strategy (e.g. 
rollback recovery). And the function will return a trained model in struct 
format.


> API design of the paramserv function
> ------------------------------------
>
>                 Key: SYSTEMML-2299
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2299
>             Project: SystemML
>          Issue Type: Sub-task
>            Reporter: LI Guobao
>            Assignee: LI Guobao
>            Priority: Major


