Repository: incubator-singa
Updated Branches:
  refs/heads/master e385d2a81 -> 1cfdac6c3
SINGA-125 Improve Python Helper

- Update README.md
- Update layer.py and model.py
  . deal with non-square values for kernel, stride, pad
  . users can specify the Accuracy layer by 'show_acc=True'
- Update cifar10 examples
  . set momentum to 0.9 to speed up convergence

Project: http://git-wip-us.apache.org/repos/asf/incubator-singa/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-singa/commit/4662dc3e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-singa/tree/4662dc3e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-singa/diff/4662dc3e

Branch: refs/heads/master
Commit: 4662dc3e363f30b352d06e962c98c5028ad2d688
Parents: e385d2a
Author: chonho <[email protected]>
Authored: Wed Jan 6 10:39:56 2016 +0800
Committer: chonho <[email protected]>
Committed: Thu Jan 14 11:17:59 2016 +0800

----------------------------------------------------------------------
 tool/python/README.md                     | 242 +++++++++++++------------
 tool/python/examples/cifar10_cnn.py       |   2 +-
 tool/python/examples/cifar10_cnn_cudnn.py |   2 +-
 tool/python/singa/layer.py                |  52 ++++--
 tool/python/singa/model.py                |  19 +-
 5 files changed, 189 insertions(+), 128 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/4662dc3e/tool/python/README.md
----------------------------------------------------------------------
diff --git a/tool/python/README.md b/tool/python/README.md
index 02e7fd1..e383cfb 100644
--- a/tool/python/README.md
+++ b/tool/python/README.md
@@ -1,6 +1,8 @@
-## SINGA-81 Add Python Helper, which enables users to construct a model (JobProto) and run Singa in Python
+# Python Helper
 
-SINGAROOT/tool/python
+Users can construct a model and run SINGA using Python. Specifically, the Python Helper enables users to generate a JobProto for the model and run Driver::Train or Driver::Test from Python. The Python Helper tool is located in `SINGA_ROOT/tool/python`, which consists of the following directories.
+
+    SINGAROOT/tool/python
 |-- pb2 (has job_pb2.py)
 |-- singa
 |-- model.py
@@ -11,79 +13,83 @@
 |-- utility.py
 |-- message.py
 |-- examples
-    |-- cifar10_cnn.py, mnist_mlp.py, , mnist_rbm1.py, mnist_ae.py, etc.
+    |-- cifar10_cnn.py, mnist_mlp.py, mnist_rbm1.py, mnist_ae.py, etc.
 |-- datasets
 |-- cifar10.py
 |-- mnist.py
 
-### How to Run
+## 1. Basic User Guide
+
+In order to use the Python Helper features, users need to build SINGA with the following option
+```
+./configure --enable-python --with-python=PYTHON_DIR
+```
+where `PYTHON_DIR` contains `Python.h`.
+
+### (a) How to Run
 ```
 bin/singa-run.sh -exec user_main.py
 ```
-The python code, e.g., user_main.py, would create the JobProto object and pass it to Driver::Train.
+The python code, e.g., `user_main.py`, would create the JobProto object and pass it to Driver::Train or Driver::Test.
 
-For example,
+To run the CIFAR10 example,
 ```
 cd SINGA_ROOT
 bin/singa-run.sh -exec tool/python/examples/cifar10_cnn.py
 ```
-
-Note that, in order to use the Python Helper feature, users need to add the following option
+To run the MNIST example,
 ```
-./configure --enable-python --with-python=PYTHON_DIR
+cd SINGA_ROOT
+bin/singa-run.sh -exec tool/python/examples/mnist_mlp.py
 ```
-where PYTHON_DIR has Python.h
 
-### Layer class (inherited)
+### (b) Class Description
+
+#### Layer class
+
+The following classes configure field values for a particular layer and generate its LayerProto.
+
+* `Data` for a data layer.
+* `Dense` for an innerproduct layer.
+* `Activation` for an activation layer.
+* `Convolution2D` for a convolution layer.
+* `MaxPooling2D` for a max pooling layer.
+* `AvgPooling2D` for an average pooling layer.
+* `LRN2D` for a normalization (or local response normalization) layer.
+* `Dropout` for a dropout layer.
 
-* Data
-* Dense
-* Activation
-* Convolution2D
-* MaxPooling2D
-* AvgPooling2D
-* LRN2D
-* Dropout
-* RBM
-* Autoencoder
+In addition, the following classes generate multiple layers for particular models.
 
-### Model class
+* `RBM` for constructing layers of an RBM.
+* `Autoencoder` for constructing layers of an Autoencoder.
 
-* Model class has `jobconf` (JobProto) and `layers` (layer list)
+
+#### Model class
+
+The Model class has `jobconf` (JobProto) and `layers` (a layer list).
 
 Methods in Model class
 
-* add
-  * add Layer into Model
-  * 2 subclasses: Sequential model and Energy model
+* `add` to add a Layer to the model
+  * 2 subclasses: `Sequential` model and `Energy` model
 
-* compile
-  * set Updater (i.e., optimizer) and Cluster (i.e., topology) components
+* `compile` to configure an optimizer and topology for training.
+  * set `Updater` (i.e., optimizer) and `Cluster` (i.e., topology) components
 
-* fit
+* `fit` to configure field values for training.
   * set Training data and parameter values for the training
   * (optional) set Validation data and parameter values
-  * set Train_one_batch component
-  * specify `with_test` field if a user wants to run singa with test data simultaneously.
-  * [TODO] recieve train/validation results, e.g., accuracy, loss, ppl, etc.
-
-* evaluate
-  * set Testing data and parameter values for the testing
-  * specify `checkpoint_path` field if a user want to run singa only for testing.
-  * [TODO] recieve test results, e.g., accuracy, loss, ppl, etc.
+  * set `Train_one_batch` component
+  * set `with_test` argument `True` if users want to run SINGA with test data simultaneously.
+  * return train/validation results, e.g., accuracy, loss, ppl, etc.
 
-#### Results
+* `evaluate` to configure field values for testing.
+  * set Testing data and parameter values for the test
+  * specify `checkpoint_path` field if users want to run SINGA only for testing.
+  * return test results, e.g., accuracy, loss, ppl, etc.
 
-fit() and evaluate() return train/test results, a dictionary containing
-
-* [key]: step number
-* [value]: a list of dictionay
-  * 'acc' for accuracy
-  * 'loss' for loss
-  * 'ppl' for ppl
-  * 'se' for squred error
-#### To run Singa on GPU
+### (c) To Run SINGA on GPU
 
 Users need to set a list of GPU ids in the `device` field of fit() or evaluate().
 
@@ -94,60 +100,53 @@
 m.fit(X_train, nb_epoch=100, with_test=True, device=gpu_id)
 ```
 
-### Parameter class
+### (d) How to set/update parameter values
 
-Users need to set parameter and initial values. For example,
+Users may need to set/update parameter field values.
 
-* Parameter (fields in Param proto)
-  * lr = (float) // learning rate multiplier, used to scale the learning rate when updating parameters.
-  * wd = (float) // weight decay multiplier, used to scale the weight decay when updating parameters.
+* Parameter fields for both Weight and Bias (i.e., fields of ParamProto)
+  * `lr` = (float) : learning rate multiplier, used to scale the learning rate when updating parameters.
+  * `wd` = (float) : weight decay multiplier, used to scale the weight decay when updating parameters.
-* Parameter initialization (fields in ParamGen proto)
-  * init = (string) // one of the types, 'uniform', 'constant', 'gaussian'
-  * high = (float) // for 'uniform'
-  * low = (float) // for 'uniform'
-  * value = (float) // for 'constant'
-  * mean = (float) // for 'gaussian'
-  * std = (float) // for 'gaussian'
+* Parameter initialization (fields of ParamGenProto)
+  * `init` = (string) : one of the types, 'uniform', 'constant', 'gaussian'
+  * `scale` = (float) : for 'uniform', it is used to set `low`=-scale and `high`=+scale
+  * `high` = (float) : for 'uniform'
+  * `low` = (float) : for 'uniform'
+  * `value` = (float) : for 'constant'
+  * `mean` = (float) : for 'gaussian'
+  * `std` = (float) : for 'gaussian'
 
-* Weight (`w_param`) is 'gaussian' with mean=0, std=0.01 at default
+* Weight (`w_param`) is set as 'gaussian' with `mean`=0 and `std`=0.01 by default.
 
-* Bias (`b_param`) is 'constant' with value=0 at default
+* Bias (`b_param`) is set as 'constant' with `value`=0 by default.
 
-* How to update the parameter fields
-  * for updating Weight, put `w_` in front of field name
-  * for updating Bias, put `b_` in front of field name
+* In order to set/update the parameter fields of either Weight or Bias
+  * for Weight, put `w_` in front of the field name
+  * for Bias, put `b_` in front of the field name
 
-Several ways to set Parameter values
-```
-parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1)
-parb = Parameter(lr=1, wd=0, init='constant', value=0)
-m.add(Convolution2D(10, w_param=parw, b_param=parb, ...)
-```
-```
-m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...)
-```
-```
-parw = Parameter(init='constant', mean=0)
-m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...)
-```
+  For example,
+  ```
+  m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...))
+  ```
 
+### (e) Results
 
-#### Other classes
+fit() and evaluate() return training/test results, i.e., a dictionary containing
 
-* Store
-* Algorithm
-* Updater
-* SGD
-* AdaGrad
-* Cluster
+* [key]: step number
+* [value]: a list of dictionaries
+  * 'acc' for accuracy
+  * 'loss' for loss
+  * 'ppl' for ppl
+  * 'se' for squared error
 
-## MLP Example
-An example (to generate job.conf for mnist)
+## 2. Examples
 
+### MLP example (to generate job.conf for MNIST)
 ```
 X_train, X_test, workspace = mnist.load_data()
@@ -167,10 +166,7 @@
 m.fit(X_train, nb_epoch=1000, with_test=True)
 result = m.evaluate(X_test, batch_size=100, test_steps=10, test_freq=60)
 ```
-## CNN Example
-
-An example (to generate job.conf for cifar10)
-
+### CNN example (to generate job.conf for cifar10)
 ```
 X_train, X_test, workspace = cifar10.load_data()
@@ -199,8 +195,7 @@
 m.fit(X_train, nb_epoch=1000, with_test=True)
 result = m.evaluate(X_test, 1000, test_steps=30, test_freq=300)
 ```
-
-## RBM Example
+### RBM Example
 ```
 rbmid = 3
 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid)
@@ -215,7 +210,7 @@
 m.compile(optimizer=sgd, cluster=topo)
 m.fit(X_train, alg='cd', nb_epoch=6000)
 ```
-## AutoEncoder Example
+### AutoEncoder Example
 ```
 rbmid = 4
 X_train, X_test, workspace = mnist.load_data(nb_rbm=rbmid+1)
@@ -230,7 +225,47 @@
 m.compile(loss='mean_squared_error', optimizer=agd, cluster=topo)
 m.fit(X_train, alg='bp', nb_epoch=12200)
 ```
-### TIPS
+
+## 3. Advanced User Guide
+
+### Parameter class
+
+Users can explicitly set/update parameters. There are several ways to set Parameter values:
+```
+parw = Parameter(lr=2, wd=10, init='gaussian', std=0.1)
+parb = Parameter(lr=1, wd=0, init='constant', value=0)
+m.add(Convolution2D(10, w_param=parw, b_param=parb, ...))
+```
+```
+m.add(Dense(10, w_mean=1, w_std=0.1, w_lr=2, w_wd=10, ...))
+```
+```
+parw = Parameter(init='constant', value=0)
+m.add(Dense(10, w_param=parw, w_lr=1, w_wd=1, b_value=1, ...))
+```
 
+### Data layer
+
+There are alternative ways to add a Data layer. In addition, users can write their own `load_data` method in `cifar10.py` and `mnist.py` under `examples/datasets`.
+```
+X_train, X_test = mnist.load_data()  // parameter values are set in load_data()
+m.fit(X_train, ...)                  // Data layer for training is added
+m.evaluate(X_test, ...)              // Data layer for testing is added
+```
+```
+X_train, X_test = mnist.load_data()  // parameter values are set in load_data()
+m.add(X_train)                       // explicitly add Data layer
+m.add(X_test)                        // explicitly add Data layer
+```
+```
+store = Store(path='train.bin', batch_size=64, ...)         // parameter values are set explicitly
+m.add(Data(load='recordinput', phase='train', conf=store))  // Data layer is added
+store = Store(path='test.bin', batch_size=100, ...)         // parameter values are set explicitly
+m.add(Data(load='recordinput', phase='test', conf=store))   // Data layer is added
+```
 
+
+### Other TIPS
 
 Hidden layers for MLP can be written as
 ```
@@ -281,26 +316,9 @@
 m.add(Dense(10, w_param=parw, w_wd=250, b_param=parb, b_lr=2, b_wd=0, activation='softmax'))
 ```
 
-Alternative ways to add Data layer
-```
-X_train, X_test = mnist.load_data() // parameter values are set in load_data()
-m.fit(X_train, ...) // Data layer for training is added
-m.evaluate(X_test, ...) // Data layer for testing is added
-```
-```
-X_train, X_test = mnist.load_data() // parameter values are set in load_data()
-m.add(X_train) // explicitly add Data layer
-m.add(X_test) // explicitly add Data layer
-```
-```
-store = Store(path='train.bin', batch_size=64, ...) // parameter values are set explicitly
-m.add(Data(load='recordinput', phase='train', conf=store)) // Data layer is added
-store = Store(path='test.bin', batch_size=100, ...) // parameter values are set explicitly
-m.add(Data(load='recordinput', phase='test', conf=store)) // Data layer is added
-```
 
-### Cases to run singa
+### Different Cases to Run SINGA
 
 (1) Run singa for training
 ```

http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/4662dc3e/tool/python/examples/cifar10_cnn.py
----------------------------------------------------------------------
diff --git a/tool/python/examples/cifar10_cnn.py b/tool/python/examples/cifar10_cnn.py
index f03b611..8d4e778 100755
--- a/tool/python/examples/cifar10_cnn.py
+++ b/tool/python/examples/cifar10_cnn.py
@@ -47,7 +47,7 @@ m.add(AvgPooling2D(pool_size=(3,3), stride=2))
 
 m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax'))
 
-sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
+sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
 topo = Cluster(workspace)
 m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo)
 m.fit(X_train, nb_epoch=1000, with_test=True)

http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/4662dc3e/tool/python/examples/cifar10_cnn_cudnn.py
----------------------------------------------------------------------
diff --git a/tool/python/examples/cifar10_cnn_cudnn.py b/tool/python/examples/cifar10_cnn_cudnn.py
index e87b5c4..e243834 100755
--- a/tool/python/examples/cifar10_cnn_cudnn.py
+++ b/tool/python/examples/cifar10_cnn_cudnn.py
@@ -47,7 +47,7 @@ m.add(AvgPooling2D(pool_size=(3,3), stride=2))
 
 m.add(Dense(10, w_wd=250, b_lr=2, b_wd=0, activation='softmax'))
 
-sgd = SGD(decay=0.004, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
+sgd = SGD(decay=0.004, momentum=0.9, lr_type='manual', step=(0,60000,65000), step_lr=(0.001,0.0001,0.00001))
 topo = Cluster(workspace)
 m.compile(loss='categorical_crossentropy', optimizer=sgd, cluster=topo)

http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/4662dc3e/tool/python/singa/layer.py
----------------------------------------------------------------------
diff --git a/tool/python/singa/layer.py b/tool/python/singa/layer.py
index b391d26..491e98b 100644
--- a/tool/python/singa/layer.py
+++ b/tool/python/singa/layer.py
@@ -95,12 +95,12 @@ class Convolution2D(Layer):
                  activation=None, **kwargs):
         '''
         required
-          nb_filter = (int) // the number of filters
-          kernel = (int) // the size of filter
+          nb_filter = (int) // the number of filters
+          kernel = (int/tuple) // the size of filter
         optional
-          stride = (int) // the size of stride
-          pad = (int) // the size of padding
-          init = (string) // 'unirom', 'gaussian', 'constant'
+          stride = (int/tuple) // the size of stride
+          pad = (int/tuple) // the size of padding
+          init = (string) // 'uniform', 'gaussian', 'constant'
           w_param = (Parameter) // Parameter object for weight
           b_param = (Parameter) // Parameter object for bias
           **kwargs (KEY=VALUE)
@@ -112,13 +112,29 @@
             b_wd = (float) // weight decay multiplier for bias
         '''
 
-        assert nb_filter > 0 and kernel > 0, 'should be set as positive int'
+        assert nb_filter > 0, 'nb_filter should be set as positive int'
         super(Convolution2D, self).__init__(name=generate_name('conv', 1), type=kCConvolution)
-        fields = {'num_filters' : nb_filter,
-                  'kernel' : kernel,
-                  'stride' : stride,
-                  'pad' : pad}
+        fields = {}
+        # for kernel
+        if type(kernel) == int:
+            fields['kernel'] = kernel
+        else:
+            fields['kernel_x'] = kernel[0]
+            fields['kernel_y'] = kernel[1]
+        # for stride
+        if type(stride) == int:
+            fields['stride'] = stride
+        else:
+            fields['stride_x'] = stride[0]
+            fields['stride_y'] = stride[1]
+        # for pad
+        if type(pad) == int:
+            fields['pad'] = pad
+        else:
+            fields['pad_x'] = pad[0]
+            fields['pad_y'] = pad[1]
+
         setval(self.layer.convolution_conf, **fields)
 
         # parameter w
@@ -158,7 +174,7 @@
         if type(pool_size) == int:
             pool_size = (pool_size, pool_size)
         assert type(pool_size) == tuple and pool_size[0] == pool_size[1], \
-               'pool size should be square in Singa'
+               'currently pool size should be square in Singa'
         super(MaxPooling2D, self).__init__(name=generate_name('pool'), type=kCPooling, **kwargs)
         fields = {'pool' : PoolingProto().MAX,
@@ -184,7 +200,7 @@
         if type(pool_size) == int:
             pool_size = (pool_size, pool_size)
         assert type(pool_size) == tuple and pool_size[0] == pool_size[1], \
-               'pool size should be square in Singa'
+               'currently pool size should be square in Singa'
         super(AvgPooling2D, self).__init__(name=generate_name('pool'), type=kCPooling, **kwargs)
         self.layer.pooling_conf.pool = PoolingProto().AVG
@@ -242,6 +258,16 @@
         self.layer.dropout_conf.dropout_ratio = ratio
 
+class Accuracy(Layer):
+
+    def __init__(self):
+        '''
+        Accuracy layer, appended for the test phase when show_acc=True
+        '''
+
+        self.name = 'accuracy'
+        self.layer_type = enumLayerType(self.name)
+        super(Accuracy, self).__init__(name=generate_name(self.name),
+                                       type=self.layer_type)
 
 class RGB(Layer):
@@ -268,7 +294,7 @@
         output_dim = (int)
       optional
         activation = (string)
-        init = (string) // 'unirom', 'gaussian', 'constant'
+        init = (string) // 'uniform', 'gaussian', 'constant'
         w_param = (Parameter) // Parameter object for weight
         b_param = (Parameter) // Parameter object for bias
         **kwargs

http://git-wip-us.apache.org/repos/asf/incubator-singa/blob/4662dc3e/tool/python/singa/model.py
----------------------------------------------------------------------
diff --git a/tool/python/singa/model.py b/tool/python/singa/model.py
index 6ad9422..f652f86 100644
--- a/tool/python/singa/model.py
+++ b/tool/python/singa/model.py
@@ -55,6 +55,7 @@
         self.result = None
         self.last_checkpoint_path = None
         self.cudnn = False
+        self.accuracy = False
 
     def add(self, layer):
         '''
@@ -151,6 +152,17 @@
         else:
             getattr(lastly, 'srclayers').append(self.layers[0].layer.name)
 
+        if self.accuracy == True:
+            smly = net.layer.add()
+            smly.CopyFrom(Layer(name='softmax', type=kSoftmax).layer)
+            setval(smly, include=kTest)
+            getattr(smly, 'srclayers').append(self.layers[-1].layer.name)
+            aly = net.layer.add()
+            aly.CopyFrom(Accuracy().layer)
+            setval(aly, include=kTest)
+            getattr(aly, 'srclayers').append('softmax')
+            getattr(aly, 'srclayers').append(self.layers[0].layer.name)
+
         # use of cudnn
         if self.cudnn == True:
             self.set_cudnn_layer_type(net)
@@ -230,7 +242,8 @@
         pass
 
     def evaluate(self, data=None, alg='bp',
-                 checkpoint_path=None, execpath='', device=None, **fields):
+                 checkpoint_path=None, execpath='',
+                 device=None, show_acc=False, **fields):
         '''
         required
           data = (Data) // Data class object for testing data
@@ -239,6 +252,7 @@
           checkpoint_path = (list) // checkpoint path
           execpaths = (string) // path to user's own executable
           device = (int/list) // a list of gpu ids
+          show_acc = (bool) // compute and show the accuracy
           **fields (KEY=VALUE)
             batch_size = (int) // batch size for testing data
             test_freq = (int) // frequency of testing
@@ -276,6 +290,9 @@
             setval(self.jobconf, gpu=device)
             self.cudnn = True
 
+        # set to True if showing the accuracy
+        self.accuracy = show_acc
+
         self.build()  # construct NeuralNet component
 
         #--- generate job.conf file for debug purpose
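
For illustration, a minimal usage sketch of the new `show_acc` option, assuming the MNIST MLP setup from the README above (`m`, `X_train`, and `X_test` come from that example; the batch/step values are illustrative only):

```
X_train, X_test, workspace = mnist.load_data()
...  // construct and compile the model m as in the MLP example, then train
m.fit(X_train, nb_epoch=1000, with_test=True)

// show_acc=True makes build() append a softmax layer and an Accuracy layer
// (both marked include=kTest), so the test run also reports accuracy
result = m.evaluate(X_test, batch_size=100, test_steps=10, show_acc=True)
```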

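Similarly, a sketch of the non-square kernel/stride/pad support added to Convolution2D (the tuple values are made up for illustration; per the code above, an int keeps the single square field, while a tuple is split into the `_x`/`_y` fields):

```
m.add(Convolution2D(32, kernel=5, stride=1, pad=2))              // square: sets kernel, stride, pad
m.add(Convolution2D(32, kernel=(5,3), stride=(1,2), pad=(2,1)))  // sets kernel_x/y, stride_x/y, pad_x/y
```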