wangwei created SINGA-210:
-----------------------------
Summary: Enable checkpoint and resume for v1.0
Key: SINGA-210
URL: https://issues.apache.org/jira/browse/SINGA-210
Project: Singa
Issue Type: New Feature
Reporter: wangwei
This ticket is going to add code for dumping the model parameters as checkpoint
files, which could be used for fine-tuning and deployment.
The model parameters should be separated from the model definition, i.e., the
net construction. After creating the neural net, users either initialize the
layer parameters randomly or load them from checkpoint files. In other words,
we do not add a pair of serializing and parsing functions to the Layer class.
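To illustrate the separation, here is a minimal sketch of initializing parameters after the net has been constructed. Param, ParamMap and InitParams are hypothetical names used only for illustration, not SINGA types. Falling back to random values for parameters missing from the checkpoint is an assumption that fits the fine-tuning case.

```cpp
#include <cassert>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for a layer parameter; not a SINGA type.
struct Param {
  std::string name;
  std::vector<float> values;
};

// Hypothetical stand-in for a loaded checkpoint: parameter name -> values.
using ParamMap = std::map<std::string, std::vector<float>>;

// Fills every parameter of an already constructed net. A parameter whose name
// and size match an entry in `ckpt` is copied from it; any other parameter is
// drawn from a Gaussian. Returns the number of parameters restored.
int InitParams(std::vector<Param>* params, const ParamMap& ckpt,
               unsigned seed = 0) {
  assert(params != nullptr);
  std::mt19937 gen(seed);
  std::normal_distribution<float> gauss(0.0f, 0.01f);
  int restored = 0;
  for (Param& p : *params) {
    auto it = ckpt.find(p.name);
    if (it != ckpt.end() && it->second.size() == p.values.size()) {
      p.values = it->second;  // taken from the checkpoint
      ++restored;
    } else {
      for (float& v : p.values) v = gauss(gen);  // random initialization
    }
  }
  return restored;
}
```

Passing an empty ParamMap gives purely random initialization, so the same call covers training from scratch and resuming, and the layers need no serialization code of their own.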
We need to decide the format of the checkpoint file and how to write and read
it:
1. The checkpoint file consists of the model parameters, which could be
serialized as key-value pairs, where the key is the parameter name and the
value is a protobuf object including the shape and values. Optionally, there
could be a text file including the parameter meta info, e.g., name and shape,
which would let users inspect the model parameters without parsing the binary
checkpoint file.
2. The binary checkpoint file can be serialized using the Writer (SINGA-202)
and loaded into memory using the Reader (SINGA-202).
3. A checkpoint utility class should be implemented for 1 and 2. Compatibility
with Caffe checkpoint files may also be considered, so that models from the
Caffe model zoo (http://caffe.berkeleyvision.org/model_zoo.html) can be reused.
{code}
class Checkpoint {
  // <prefix>.model is the binary file of parameter key-value pairs;
  // <prefix>.meta is the text file, one line per parameter.
  Checkpoint(prefix, mode=[R|W]);
  Read();             // read .model
  ReadMeta();         // read .meta only
  Get(key);           // return the value protobuf object
  GetMeta(key);
  Read(key);
  Write(key, value);  // write to both .model and .meta files
};
{code}
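A rough, self-contained sketch of how such a class could behave is given below. It is only an illustration under several assumptions: ParamValue is a hypothetical stand-in for the value protobuf, the length-prefixed record layout (native byte order) is a placeholder for the Writer/Reader from SINGA-202, keys are assumed to contain no whitespace, and Read(key) is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the value protobuf (shape plus values).
struct ParamValue {
  std::vector<std::int32_t> shape;
  std::vector<float> data;
};

class Checkpoint {
 public:
  enum Mode { kRead, kWrite };

  Checkpoint(const std::string& prefix, Mode mode)
      : prefix_(prefix), mode_(mode) {
    if (mode_ == kWrite) {
      model_out_.open(prefix_ + ".model", std::ios::binary | std::ios::trunc);
      meta_out_.open(prefix_ + ".meta", std::ios::trunc);
      if (!model_out_ || !meta_out_)
        throw std::runtime_error("cannot open checkpoint files for writing");
    }
  }

  // Appends one record to <prefix>.model and one line to <prefix>.meta.
  void Write(const std::string& key, const ParamValue& value) {
    assert(mode_ == kWrite);
    PutU32(key.size());
    model_out_.write(key.data(), key.size());
    PutU32(value.shape.size());
    model_out_.write(reinterpret_cast<const char*>(value.shape.data()),
                     value.shape.size() * sizeof(std::int32_t));
    PutU32(value.data.size());
    model_out_.write(reinterpret_cast<const char*>(value.data.data()),
                     value.data.size() * sizeof(float));
    meta_out_ << key;
    for (std::int32_t d : value.shape) meta_out_ << ' ' << d;
    meta_out_ << '\n';
  }

  // Loads every record of <prefix>.model into memory.
  void Read() {
    assert(mode_ == kRead);
    std::ifstream in(prefix_ + ".model", std::ios::binary);
    if (!in) throw std::runtime_error("cannot open .model file");
    std::uint32_t n = 0;
    while (GetU32(in, &n)) {  // stops cleanly at end of file
      std::string key(n, '\0');
      in.read(&key[0], n);
      ParamValue v;
      GetU32(in, &n);
      v.shape.resize(n);
      in.read(reinterpret_cast<char*>(v.shape.data()),
              n * sizeof(std::int32_t));
      GetU32(in, &n);
      v.data.resize(n);
      in.read(reinterpret_cast<char*>(v.data.data()), n * sizeof(float));
      if (!in) throw std::runtime_error("truncated .model file");
      values_[key] = v;
    }
  }

  // Loads only <prefix>.meta: one "name dim0 dim1 ..." line per parameter.
  void ReadMeta() {
    std::ifstream in(prefix_ + ".meta");
    if (!in) throw std::runtime_error("cannot open .meta file");
    std::string line;
    while (std::getline(in, line)) {
      std::istringstream fields(line);
      std::string key;
      if (!(fields >> key)) continue;  // skip blank lines
      std::vector<std::int32_t> shape;
      std::int32_t d;
      while (fields >> d) shape.push_back(d);
      meta_[key] = shape;
    }
  }

  // Both getters throw std::out_of_range for an unknown key.
  const ParamValue& Get(const std::string& key) const {
    return values_.at(key);
  }
  const std::vector<std::int32_t>& GetMeta(const std::string& key) const {
    return meta_.at(key);
  }

 private:
  void PutU32(std::size_t n) {
    std::uint32_t v = static_cast<std::uint32_t>(n);
    model_out_.write(reinterpret_cast<const char*>(&v), sizeof(v));
  }
  static bool GetU32(std::istream& in, std::uint32_t* v) {
    in.read(reinterpret_cast<char*>(v), sizeof(*v));
    return static_cast<bool>(in);
  }

  std::string prefix_;
  Mode mode_;
  std::ofstream model_out_, meta_out_;
  std::map<std::string, ParamValue> values_;
  std::map<std::string, std::vector<std::int32_t>> meta_;
};
```

In this sketch a writer flushes both files when it goes out of scope. A reader constructed with the same prefix can then call ReadMeta() alone to list parameter names and shapes, or Read() followed by Get(key) to obtain the values.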