Author: wangwei
Date: Mon Jul 20 05:50:40 2015
New Revision: 1691875
URL: http://svn.apache.org/r1691875
Log:
Restructure the layout of the website.
Add model configuration page.
TODO fill the content of the empty pages.
Added:
incubator/singa/site/trunk/content/markdown/docs/data.md
incubator/singa/site/trunk/content/markdown/docs/datashard.md
incubator/singa/site/trunk/content/markdown/docs/hdfs.md
incubator/singa/site/trunk/content/markdown/docs/layer.md
incubator/singa/site/trunk/content/markdown/docs/lmdb.md
incubator/singa/site/trunk/content/markdown/docs/model-config.md
- copied, changed from r1691866, incubator/singa/site/trunk/content/markdown/docs/programming-model.md
incubator/singa/site/trunk/content/markdown/docs/neuralnet.md
incubator/singa/site/trunk/content/markdown/docs/program-model.md
incubator/singa/site/trunk/content/markdown/docs/rbm.md
incubator/singa/site/trunk/content/markdown/docs/rnn.md
incubator/singa/site/trunk/content/resources/images/dcnn-cifar10.png (with props)
incubator/singa/site/trunk/content/resources/images/model-categorization.png (with props)
incubator/singa/site/trunk/content/resources/images/unroll-rbm.png (with props)
incubator/singa/site/trunk/content/resources/images/unroll-rnn.png (with props)
Removed:
incubator/singa/site/trunk/content/markdown/docs/programming-model.md
Modified:
incubator/singa/site/trunk/content/site.xml
Added: incubator/singa/site/trunk/content/markdown/docs/data.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/data.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/datashard.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/datashard.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/hdfs.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/hdfs.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/layer.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/layer.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/lmdb.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/lmdb.md?rev=1691875&view=auto
==============================================================================
(empty)
Copied: incubator/singa/site/trunk/content/markdown/docs/model-config.md (from r1691866, incubator/singa/site/trunk/content/markdown/docs/programming-model.md)
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/model-config.md?p2=incubator/singa/site/trunk/content/markdown/docs/model-config.md&p1=incubator/singa/site/trunk/content/markdown/docs/programming-model.md&r1=1691866&r2=1691875&rev=1691875&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/programming-model.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/model-config.md Mon Jul 20 05:50:40 2015
@@ -1,67 +1,96 @@
## Model Configuration
-SINGA uses the stochastic gradient descent (SGD) algorithm to train parameters of deep learning models.
-For each SGD iteration, there is a [Worker] computing gradients of parameters from the NeuralNet and
-a [Updater] updating parameter values based on gradients. Hence the model configuration mainly consists
-these three parts. We will introduce the NeuralNet, Worker and Updater in the following paragraphs
-and describe the configurations for them.
-
-
-## NeuralNet
-
-### Deep learning training
-
-Deep learning is labeled as a feature learning technique, which usually consists of multiple layers.
-Each layer is associated a feature transformation function. After going through all layers,
-the raw input feature (e.g., pixels of images) would be converted into a high-level feature that is
-easier for tasks like classification.
-
-Training a deep learning model is to find the optimal parameters involved in the transformation functions
-that generates good features for specific tasks. The goodness of a set of parameters is measured by
-a loss function, e.g., [Cross-Entropy Loss]. Since the loss functions are usually non-linear and non-convex,
-it is difficult to get a closed form solution. Normally, people uses the SGD algorithm which randomly
+SINGA uses the stochastic gradient descent (SGD) algorithm to train parameters
+of deep learning models. For each SGD iteration, there is a
+[Worker](docs/architecture.html) computing gradients of parameters from the
+NeuralNet and an [Updater]() updating parameter values based on the gradients.
+Hence the model configuration mainly consists of these three parts. We will
+introduce the NeuralNet, Worker and Updater in the following paragraphs and
+describe their configurations. All model configuration is specified in the
+model.conf file in the user-provided workspace folder. E.g., the [cifar10
+example folder](https://github.com/apache/incubator-singa/tree/master/examples/cifar10)
+has a model.conf file.
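
For orientation, model.conf is a text-format protocol buffer message. The
skeleton below is a hedged sketch: only the `name` field appears in the
ModelProto excerpt later on this page, and the nesting of the updater and
neuralnet sections is an assumption for illustration.

    name: "cifar10-dcnn"   # model name (ModelProto field, shown later)
    updater {
      # SGD variant and learning rate; see the Updater section below
    }
    neuralnet {
      # layers and their connections; see NetProto below
    }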
+
+
+### NeuralNet
+
+#### Deep learning training
+
+Deep learning is known as a feature learning technique, which usually
+consists of multiple layers. Each layer is associated with a feature
+transformation function. After going through all layers, the raw input
+feature (e.g., pixels of images) is converted into a high-level feature that
+is easier to use for tasks like classification.
+
+Training a deep learning model means finding the optimal parameters involved
+in the transformation functions that generate good features for specific
+tasks. The goodness of a set of parameters is measured by a loss function,
+e.g., [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since
+loss functions are usually non-linear and non-convex, it is difficult to get a
+closed-form solution. Normally, people use the SGD algorithm, which randomly
initializes the parameters and then iteratively update them to reduce the loss.
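
In symbols (standard SGD, stated here for reference rather than quoted from
the SINGA docs), one update of the parameters \theta at step t with learning
rate \eta_t is

    \theta_{t+1} = \theta_t - \eta_t \nabla_{\theta} L(\theta_t)

where L is the loss function; the Worker computes the gradient term and the
Updater applies the update.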
-### Uniform model representation
+#### Uniform model (neuralnet) representation
-Many deep learning models have being proposed. Figure 1 is a categorization of popular deep learning models
-based on the layer connections. The NeuralNet abstraction of SINGA consists of multiple directed
-connected layers. This abstraction is able to represent models from all the three categorizations.
+<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
+Deep learning model categorization</img>
- *For the feed-forward models, their connections are already directed.
+Many deep learning models have been proposed. Fig. 1 shows a categorization of
+popular deep learning models based on their layer connections. The
+[NeuralNet](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h)
+abstraction of SINGA consists of multiple layers connected by directed edges.
+This abstraction is able to represent models from all three categories.
+
+ * For the feed-forward models, their connections are already directed.
+
+ * For the RNN models, we unroll them into directed connections, as shown in
+ Fig. 3.
+
+ * For the undirected connections in RBM, DBM, etc., we replace each undirected
+ connection with two directed connections, as shown in Fig. 2.
+
+<div style = "height: 200px">
+<div style = "float:left; text-align: center">
+<img src = "../images/unroll-rbm.png" style = "width: 280px"> <br/>Fig. 2:
Unroll RBM </img>
+</div>
+<div style = "float:left; text-align: center; margin-left: 40px">
+<img src = "../images/unroll-rnn.png" style = "width: 550px"> <br/>Fig. 3:
Unroll RNN </img>
+</div>
+</div>
- *For the RNN models, we unroll them into directed connections, as shown in Figure 2.
-
- *For the undirected connections in RBM, DBM, etc., we replace each undirected connection with two
- directed connection, as shown in Figure 3.
-
-In specific, the NeuralNet class is defined in [neuralnet.h] :
+Specifically, the NeuralNet class is defined in
+[neuralnet.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h):
...
vector<Layer*> layers_;
...
-The Layer class is defined in [base_layer.h]:
+The Layer class is defined in
+[base_layer.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/base_layer.h):
vector<Layer*> srclayers_, dstlayers_;
LayerProto layer_proto_; // layer configuration, including meta info, e.g., name
- ...
+ ...
-The connection with other layers are kept in the `srclayers_` and `dstlayers_`.
-Since there are many different feature transformations, there are many different [Layer implementations]
-correspondingly. For those layers which have parameters in their feature transformation functions,
-they would have Param instances in the layer class, e.g.,
+The connections with other layers are kept in `srclayers_` and `dstlayers_`.
+Since there are many different feature transformations, there are
+correspondingly many different Layer implementations. Layers that have
+parameters in their feature transformation functions hold Param instances in
+the layer class, e.g.,
Param weight;
-### Configure the structure of a NeuralNet instance
+#### Configure the structure of a NeuralNet instance
-To train a deep learning model, the first step is to write the configurations for the
-model structure, i.e., the layers and connections for the NeuralNet. Like Caffe, we use
-the [Google Protocol Buffer] to define the configuration schema, the NetProto specifies the
-configuration fields for a NeuralNet instance,
+To train a deep learning model, the first step is to write the configurations
+for the model structure, i.e., the layers and connections for the NeuralNet.
+Like [Caffe](http://caffe.berkeleyvision.org/), we use the [Google Protocol
+Buffer](https://developers.google.com/protocol-buffers/) to define the
+configuration protocol. The
+[NetProto](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto)
+specifies the configuration fields for a NeuralNet instance,
message NetProto {
repeated LayerProto layer = 1;
@@ -121,30 +150,38 @@ A sample configuration for a feed-forwar
}
...
}
-
-The layer type list is defined in [model.proto]. One type (kFoo) corresponds to one child
-class of Layer (FooLayer) and one configuration field (foo_conf). SINGA will infer the dstlayers_
-of each layer after reading the configuration for all layers. Developers can implement new layers
-and update the type list, then users can use the layer. [layer] describes the configurations of
-current built-in layers.
-
-Figure 4 shows the model structure corresponding to the neural network configuration in [cifar10/model.conf].
-
-
-## Worker
-
-At the beginning, the Work will initialize the values of Param instances of each layer either randomly
-(according to user configured distribution) or load from a [checkpoint file].
-For each training iteration, the worker visits layers of the neural network to compute gradients of
-Param instances of each layer. Corresponding to the three categories of models, there are three
-different algorithm to compute the gradients of a neural network.
+
+The layer type list is defined in
+[LayerType](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto).
+One type (kFoo) corresponds to one child class of Layer (FooLayer) and one
+configuration field (foo_conf). SINGA will infer the `dstlayers_` of each layer
+after reading the configuration for all layers. Developers can implement new
+layers and update the type list; users can then use those layers. [Layer
+abstraction]() describes the configurations of the current built-in layers.
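
The `dstlayers_` inference mentioned above admits a simple single-pass
implementation. The C++ below is a hedged sketch over a simplified Layer
struct, not SINGA's actual code:

    #include <vector>

    struct Layer {
      std::vector<Layer*> srclayers_, dstlayers_;
    };

    // After all layers are configured with their srclayers_, reverse every
    // src edge so each source layer also knows its destinations.
    void InferDstLayers(const std::vector<Layer*>& layers) {
      for (Layer* layer : layers)
        for (Layer* src : layer->srclayers_)
          src->dstlayers_.push_back(layer);
    }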
+
+Fig. 4 shows the model structure corresponding to the neural network
+configuration for the [deep convolutional
+model](https://github.com/apache/incubator-singa/blob/master/examples/cifar10/model.conf).
+
+<img src="../images/dcnn-cifar10.png" style = "width: 200px"> Fig. 4: Neural
+network structure for the example DCNN model (cifar10)</img>
+
+### Worker
+
+At the beginning, the Worker will initialize the values of the Param instances
+of each layer, either randomly (according to a user-configured distribution) or
+by loading them from a [checkpoint file](). For each training iteration, the
+worker visits the layers of the neural network to compute gradients of the
+Param instances of each layer. Corresponding to the three categories of models,
+there are three different algorithms to compute the gradients of a neural
+network.
1. Back-propagation (BP) for feed-forward models
2. Back-propagation through time (BPTT) for recurrent neural networks
3. Contrastive divergence (CD) for RBM, DBM, etc models.
-SINGA has provided these three algorithms as three Worker implementations. Users only need to configure
-in the model.conf file to specify which algorithm should be used. The configuration protocol is
+SINGA provides these three algorithms as three Worker implementations. Users
+only need to specify in the model.conf file which algorithm should be used. The
+configuration protocol is
message ModelProto {
...
@@ -161,17 +198,20 @@ in the model.conf file to specify which
...
}
-These algorithms override the TrainOneBatch function of the Worker, e.g., the BPWorker implement it as
+These algorithms override the TrainOneBatch function of the Worker. E.g., the
+BPWorker implements it as
void BPWorker::TrainOneBatch(int step, Metric* perf) {
Forward(step, kTrain, train_net_, perf);
Backward(step, train_net_);
}
-The Forward function pass the raw input features of one mini-batch through all layers, and the Backward
-function visits the layers in reverse order to compute the gradients of the loss w.r.t each layer's feature
-and each layer's Param objects. Different algorithms would visit the layers in different orders. Some may
-traverses the neural network multiple times, e.g., the CDWorker's TrainOneBatch function is:
+The Forward function passes the raw input features of one mini-batch through
+all layers, and the Backward function visits the layers in reverse order to
+compute the gradients of the loss w.r.t. each layer's feature and each layer's
+Param objects. Different algorithms visit the layers in different orders. Some
+may traverse the neural network multiple times, e.g., the CDWorker's
+TrainOneBatch function is:
void CDWorker::TrainOneBatch(int step, Metric* perf) {
PostivePhase(step, kTrain, train_net_, perf);
@@ -179,7 +219,8 @@ traverses the neural network multiple ti
GradientPhase(step, train_net_);
}
-But all algorithms will finally call the two functions of the Layer class:
+Each `*Phase` function visits all layers one or multiple times. All
+algorithms will finally call two functions of the Layer class:
/**
* Transform features from connected layers into features of this layer.
@@ -194,19 +235,19 @@ But all algorithms will finally call the
*/
virtual void ComputeGradient(Phase phase) = 0;
-All Layer implementation must implement the above two functions.
+All [Layer implementations]() must implement the above two functions.
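
For illustration, a minimal implementation could look like the hypothetical
FooLayer below; the ComputeFeature signature and the weight_ member are
assumptions for the sketch, only ComputeGradient's signature is quoted above:

    // Hypothetical layer; only the two required overrides are sketched.
    class FooLayer : public Layer {
     public:
      void ComputeFeature(Phase phase) override {
        // transform the features of srclayers_ using weight_ and store
        // the result as this layer's feature
      }
      void ComputeGradient(Phase phase) override {
        // compute gradients of the loss w.r.t. weight_ and w.r.t. the
        // features of srclayers_
      }
     private:
      Param weight_;  // parameter of the transformation function
    };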
-## Updater
+### Updater
-Once the gradients of parameters are computed, the Updater will update parameter values.
-There are many SGD variants for updating parameters,
-like [AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+Once the gradients of parameters are computed, the Updater will update
+parameter values. There are many SGD variants for updating parameters, like
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf),
[Nesterov](http://scholar.google.com/citations?view_op=view_citation&hl=en&user=DJ8Ep8YAAAAJ&citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C)
-and SGD with momentum. The core function of the Updater is
-
+and SGD with momentum. The core functions of the Updater are
+
/**
* Update parameter values based on gradients
* @param step training step
@@ -220,9 +261,9 @@ and SGD with momentum. The core function
*/
float GetLearningRate(int step);
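
For intuition, GetLearningRate for a step-wise change method could behave like
the hedged C++ sketch below; the pairing of step thresholds with rates is an
assumption for illustration, not SINGA's implementation:

    #include <cstddef>
    #include <vector>

    struct FixedStepSchedule {
      std::vector<int> steps{0, 60000, 65000};              // thresholds
      std::vector<float> rates{0.001f, 0.0001f, 0.00001f};  // matching rates

      float GetLearningRate(int step) const {
        float lr = rates.front();
        for (std::size_t i = 0; i < steps.size(); ++i)
          if (step >= steps[i]) lr = rates[i];  // last threshold passed wins
        return lr;
      }
    };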
-SINGA provides several built-in updaters and learning rate change methods, users can configure them
-according the the [UpdaterProto]
-
+SINGA provides several built-in updaters and learning rate change methods.
+Users can configure them according to the UpdaterProto
+
message UpdaterProto {
enum UpdaterType{
// noraml SGD with momentum and weight decay
@@ -250,19 +291,20 @@ according the the [UpdaterProto]
}
// change method for learning rate
required ChangeMethod lr_change= 2 [default = kFixed];
-
+
optional FixedStepProto fixedstep_conf=40;
- ...
+ ...
optional float momentum = 31 [default = 0];
optional float weight_decay = 32 [default = 0];
// base learning rate
- optional float base_lr = 34 [default = 0];
+ optional float base_lr = 34 [default = 0];
}
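
To make the schema concrete, an updater section of model.conf might read as
follows. This is a hedged sketch: the field names come from the UpdaterProto
message above, but the values and the embedding of the message in model.conf
are illustrative assumptions.

    updater {
      base_lr: 0.001
      lr_change: kFixed
      momentum: 0.9
      weight_decay: 0.0005
    }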
-## Other model configuration fields
+### Other model configuration fields
-Some other important configuration fields for training a deep learning model is listed:
+Some other important configuration fields for training a deep learning model
+are listed below:
// model name, e.g., "cifar10-dcnn", "mnist-mlp"
required string name = 1;
@@ -276,4 +318,5 @@ Some other important configuration field
// checkpoint path
optional bool resume = 36 [default = false];
-The pages of [checkpoint and restore], [validation and test] have more details on related fields.
+The pages on [checkpoint and restore]() and [validation and test]() have more
+details on related fields.
Added: incubator/singa/site/trunk/content/markdown/docs/neuralnet.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/neuralnet.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/program-model.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/program-model.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/rbm.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/rbm.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/markdown/docs/rnn.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/rnn.md?rev=1691875&view=auto
==============================================================================
(empty)
Added: incubator/singa/site/trunk/content/resources/images/dcnn-cifar10.png
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/resources/images/dcnn-cifar10.png?rev=1691875&view=auto
==============================================================================
Binary file - no diff available.
Propchange: incubator/singa/site/trunk/content/resources/images/dcnn-cifar10.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: incubator/singa/site/trunk/content/resources/images/model-categorization.png
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/resources/images/model-categorization.png?rev=1691875&view=auto
==============================================================================
Binary file - no diff available.
Propchange: incubator/singa/site/trunk/content/resources/images/model-categorization.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: incubator/singa/site/trunk/content/resources/images/unroll-rbm.png
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/resources/images/unroll-rbm.png?rev=1691875&view=auto
==============================================================================
Binary file - no diff available.
Propchange: incubator/singa/site/trunk/content/resources/images/unroll-rbm.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: incubator/singa/site/trunk/content/resources/images/unroll-rnn.png
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/resources/images/unroll-rnn.png?rev=1691875&view=auto
==============================================================================
Binary file - no diff available.
Propchange: incubator/singa/site/trunk/content/resources/images/unroll-rnn.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Modified: incubator/singa/site/trunk/content/site.xml
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/site.xml?rev=1691875&r1=1691874&r2=1691875&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/site.xml (original)
+++ incubator/singa/site/trunk/content/site.xml Mon Jul 20 05:50:40 2015
@@ -30,7 +30,7 @@
<publishDate position="none"/>
<version position="none"/>
-
+
<poweredBy>
<logo name="apache-incubator" alt="Apache Incubator"
img="http://incubator.apache.org/images/egg-logo.png"
href="http://incubator.apache.org"/>
</poweredBy>
@@ -40,7 +40,7 @@
<artifactId>maven-fluido-skin</artifactId>
<version>1.4</version>
</skin>
-
+
<body>
<breadcrumbs position="left">
@@ -55,13 +55,23 @@
<menu name="Documentaion">
<item name="Installation" href="docs/installation.html"/>
+ <item name="Programming Model" href="docs/program-model.html">
+ <item name ="Model Configuration" href="docs/model-config.html"/>
+ <item name="Neural Network" href="docs/neuralnet.html"/>
+ <item name="Layer" href="docs/layer.html"/>
+ </item>
+ <item name = "Data Preparation" href = "docs/data.html">
+ <item name = "DataShard" href = "docs/datashard.html"/>
+ <item name = "LMDB" href = "docs/lmdb.html"/>
+ <item name = "HDFS" href = "docs/hdfs.html"/>
+ </item>
<item name="System Architecture" href="docs/architecture.html"/>
<item name="Communication" href="docs/communication.html"/>
- <item name="Neural Network Partition"
href="docs/neuralnet-partition.html"/>
- <item name="Programming Model" href="docs/programming-model.html"/>
<item name="Examples" href="docs/examples.html">
<item name="MLP" href="docs/mlp.html"/>
<item name="CNN" href="docs/cnn.html"/>
+ <item name = "RBM" href="docs/rbm.html"/>
+ <item name = "RNN" href="docs/rnn.html"/>
</item>
</menu>