Author: jinyang
Date: Wed Sep  2 07:59:20 2015
New Revision: 1700722

URL: http://svn.apache.org/r1700722
Log:
move to apache site

Added:
    incubator/singa/site/trunk/content/markdown/docs/data.md
    incubator/singa/site/trunk/content/markdown/docs/neural-net.md
    incubator/singa/site/trunk/content/markdown/docs/overview.md
    incubator/singa/site/trunk/content/markdown/docs/param.md
    incubator/singa/site/trunk/content/markdown/docs/programming-guide.md
    incubator/singa/site/trunk/content/markdown/docs/quick-start.md
    incubator/singa/site/trunk/content/markdown/docs/train-one-batch.md
    incubator/singa/site/trunk/content/markdown/docs/updater.md
Modified:
    incubator/singa/site/trunk/content/markdown/docs/architecture.md
    incubator/singa/site/trunk/content/markdown/docs/checkpoint.md
    incubator/singa/site/trunk/content/markdown/docs/cnn.md
    incubator/singa/site/trunk/content/markdown/docs/communication.md
    incubator/singa/site/trunk/content/markdown/docs/distributed-training.md
    incubator/singa/site/trunk/content/markdown/docs/frameworks.md
    incubator/singa/site/trunk/content/markdown/docs/installation.md
    incubator/singa/site/trunk/content/markdown/docs/layer.md
    incubator/singa/site/trunk/content/markdown/docs/mlp.md
    incubator/singa/site/trunk/content/markdown/docs/rbm.md
    incubator/singa/site/trunk/content/markdown/docs/rnn.md

Modified: incubator/singa/site/trunk/content/markdown/docs/architecture.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/architecture.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/architecture.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/architecture.md Wed Sep  2 
07:59:20 2015
@@ -1,14 +1,19 @@
-## System Architecture
+---
+layout: post
+title:  Architecture
+category : docs
+tags : [architecture]
+---
+{% include JB/setup %}
 
-___
 
 ### Logical Architecture
 
-<img src="../images/distributed/logical.png" style="width: 550px"/>
+<img src="http://singa.incubator.apache.org/assets/image/logical.png" style="width: 550px"/>
 <p><strong> Fig.1 - Logical system architecture</strong></p>
 
 SINGA has a flexible architecture to support different distributed
-[training frameworks](frameworks.html) (both synchronous and asynchronous).
+[training frameworks](http://singa.incubator.apache.org/docs/frameworks.html) (both synchronous and asynchronous).
 The logical system architecture is shown in Fig.1.
 The architecture consists of multiple server groups and worker groups:
 
@@ -16,21 +21,21 @@ The architecture consists of multiple se
   A server group maintains a complete replica of the model parameters,
   and is responsible for handling get/update requests from worker groups.
   Neighboring server groups synchronize their parameters periodically.
-  Typically, a server group contains a number of servers, 
+  Typically, a server group contains a number of servers,
   and each server manages a partition of model parameters.
 * **Worker group**
   Each worker group communicates with only one server group.
   A worker group trains a complete model replica
-  against a partition of the training dataset, 
-  and is responsible for computing parameter gradients. 
+  against a partition of the training dataset,
+  and is responsible for computing parameter gradients.
   All worker groups run and communicate with the corresponding
   server groups asynchronously.
   However, inside each worker group,
   the workers synchronously compute parameter updates for the model replica.
 
 There are different strategies to distribute the training workload among 
workers
-within a group: 
-  
+within a group:
+
   * **Model parallelism**. Each worker computes a subset of parameters
   against all data partitioned to the group.
   * **Data parallelism**. Each worker computes all parameters
@@ -40,7 +45,7 @@ within a group:
 
 ### Implementation
 In SINGA, servers and workers are execution units running in separate threads.
-They communicate through [messages](communication.html).
+They communicate through [messages](http://singa.incubator.apache.org/docs/communication.html).
 Every process runs the main thread as a stub that aggregates local messages
 and forwards them to corresponding (remote) receivers.
 
@@ -50,5 +55,5 @@ resident in the same process, their *Par
 be configured to share the same memory space. In this case, the
 messages transferred between different execution units just contain
 pointers to the data, which reduces the communication cost.
-Unlike in inter-process cases, 
+Unlike in inter-process cases,
 the messages have to include the parameter values.
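The data-parallelism strategy described in the architecture section above can be sketched as follows. This is a toy Python illustration, not SINGA code: the names `partition`, `worker_gradient`, and `server_update` and the quadratic loss are invented for demonstration.

```python
# A toy sketch of the data-parallelism strategy described above: each worker
# computes a gradient on its own partition of the data, and the server group
# aggregates the gradients to update the shared model replica.
# All names here (partition, worker_gradient, server_update) are hypothetical
# illustrations, not SINGA's actual classes or APIs.

def partition(data, num_workers):
    """Split the dataset evenly among workers (data parallelism)."""
    n = len(data) // num_workers
    return [data[i * n:(i + 1) * n] for i in range(num_workers)]

def worker_gradient(shard, param):
    """Toy gradient of the loss 0.5 * (param - x)^2, averaged over the shard."""
    return sum(param - x for x in shard) / len(shard)

def server_update(param, gradients, lr=0.1):
    """Server aggregates gradients from all workers and updates the parameter."""
    return param - lr * sum(gradients) / len(gradients)

data = [1.0, 2.0, 3.0, 4.0]
shards = partition(data, 2)
param = 0.0
for _ in range(200):
    grads = [worker_gradient(s, param) for s in shards]
    param = server_update(param, grads)
# param converges to the mean of the data (2.5), the minimizer of the toy loss
```

Under model parallelism, by contrast, each worker would hold a subset of the parameters rather than a subset of the data.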

Modified: incubator/singa/site/trunk/content/markdown/docs/checkpoint.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/checkpoint.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/checkpoint.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/checkpoint.md Wed Sep  2 
07:59:20 2015
@@ -1,89 +1,88 @@
-## Checkpoint and Resume
+---
+layout: post
+title: Checkpoint and Resume
+category : docs
+tags : [checkpoint, restore]
+---
+{% include JB/setup %}
+
+SINGA checkpoints model parameters onto disk periodically, at a
+user-configured frequency. By checkpointing model parameters, we can
+
+  1. resume the training from the last checkpoint. For example, if
+    the program crashes before finishing all training steps, we can continue
+    the training using checkpoint files.
 
-___
-
-### Applications of checkpoint
-
-By taking checkpoints of model parameters, we can
-
-  1. Restore (resume) the training from the last checkpoint. For example, if
-    the program crashes before finishing all training steps.
-
-  2. Use them as pre-training results for a similar model. For example, the
+  2. use them to initialize a similar model. For example, the
    parameters from training an RBM model can be used to initialize
-    a [deep auto-encoder](auto-encoder.html) model.
+    a [deep auto-encoder](http://singa.incubator.apache.org/docs/rbm) model.
 
+## Configuration
 
-### Instructions for checkpoint and resume
+Checkpointing is controlled by two configuration fields:
 
-Checkpoint is controlled by two model configuration fields:
-`checkpoint_after` (start checkpoint after this number of training steps)
-and `checkpoint_frequency`. The checkpoint files are located
-at `WORKSPACE/checkpoint/stepSTEP-workerWORKERID.bin`.
+* `checkpoint_after`, start checkpointing after this number of training steps,
+* `checkpoint_frequency`, the frequency (in steps) of checkpointing.
 
-The following configuration shows an example,
+For example,
 
-    model {
-      ...
-      checkpoint_after: 100
-      checkpoint_frequency: 300
-      ...
-    }
+    # job.conf
+    workspace: "WORKSPACE"
+    checkpoint_after: 100
+    checkpoint_frequency: 300
+    ...
 
-After training for 700 steps, under WORKSPACE/checkpoint folder, there would be
-two checkpoint files (training on single node):
+Checkpoint files are located at *WORKSPACE/checkpoint/stepSTEP-workerWORKERID.bin*.
+For the above configuration, after training for 700 steps, there would be
+two checkpoint files,
 
     step400-worker0.bin
     step700-worker0.bin
 
-#### Application 1
-We can resume the training from the last checkpoint (i.e., step 700) by:
+## Application - resuming training
 
-    ./bin/singa-run.sh -workspace=WORKSPACE --resume
+We can resume the training from the last checkpoint (i.e., step 700) by,
 
-#### Application 2
+    ./bin/singa-run.sh -conf JOB_CONF -resume
 
-We can also use the checkpoint file from step 400 as the
-pre-trained model for a new model by configuring the
-job.conf of the new model as:
+There is no change to the job configuration.
 
-    model {
-      ...
-      checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
-      ...
-    }
+## Application - model initialization
 
-If there are multiple checkpoint files for the same snapshot due to model
-partitioning, all the checkpoint files should be added:
+We can also use the checkpoint file from step 400 to initialize
+a new model by configuring the new job as,
 
-    model {
-      ...
-      checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
-      checkpoint : "WORKSPACE/checkpoint/step400-worker1.bin"
-      ...
-    }
+    # job.conf
+    checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
+    ...
 
+If there are multiple checkpoint files for the same snapshot due to model
+partitioning, all the checkpoint files should be added,
 
-The launching command is the same as starting a new job
+    # job.conf
+    checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
+    checkpoint : "WORKSPACE/checkpoint/step400-worker1.bin"
+    ...
 
-    ./bin/singa-run.sh -workspace=WORKSPACE
+The training command is the same as starting a new job,
 
+    ./bin/singa-run.sh -conf JOB_CONF
 
-### Implementation details
+{% comment %}
+## Advanced user guide
 
-The checkpoint is done in the Worker class and controlled by two model
-configuration fields: `checkpoint_after` and `checkpoint_frequency`.
-Only Params owning the param values from the first group are dumped onto into
-checkpoint files. For one Param object, its name, version and values are saved.
+Checkpointing is done in the [Worker class](http://singa.incubator.apache.org/api/classsinga_1_1Worker.html).
+Only `Param`s from the first group are dumped into
+checkpoint files. For a `Param` object, its name, version and values are saved.
 It is possible that the snapshot is separated
 into multiple files because the neural net is partitioned into multiple 
workers.
 
-The Worker's InitLocalParam will initialize Params from checkpoint files if the
+The Worker's `InitLocalParam` function will initialize parameters from checkpoint files if the
 `checkpoint` field is set. Otherwise it randomly initializes them using user
-configured initialization method.  The Param objects are matched based on name.
-If the Param is not configured with a name, NeuralNet class will automatically
-create one for it based on the name of the layer to which the Param object 
belongs.
-The `checkpoint` can be set by users (Application 1) or by the Resume function
+configured initialization method. The `Param` objects are matched based on name.
+If a `Param` object is not configured with a name, the `NeuralNet` class will automatically
+create one for it based on the name of the layer.
+The `checkpoint` can be set by users (Application 1) or by the `Resume` function
 (Application 2) of the Trainer class, which finds the files for the latest
 snapshot and adds them to the `checkpoint` field. It also sets the `step` field
 of model configuration to the checkpoint step (extracted from file name).
@@ -91,8 +90,10 @@ of model configuration to the checkpoint
 
 ### Caution
 
-Both two applications must be taken carefully when Param objects are
+Both applications require care when `Param` objects are
 partitioned due to model partitioning. For example, if the original training used 2
 workers but the new model (or continued training) uses 3 workers,
-then the same original Param object is partitioned in different ways and hence
+then the same original `Param` object is partitioned in different ways and hence
 cannot be matched.
+
+{% endcomment %}
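The interaction of `checkpoint_after` and `checkpoint_frequency` with the file naming above can be sketched as follows. This is a hypothetical Python helper, not SINGA code; the assumption that the first checkpoint lands at `checkpoint_after + checkpoint_frequency` is inferred from the example, which produces files at steps 400 and 700.

```python
def checkpoint_files(total_steps, checkpoint_after, checkpoint_frequency, worker=0):
    """Checkpoint steps and the file names they would produce, following the
    stepSTEP-workerWORKERID.bin naming shown above.

    A sketch of the configuration semantics implied by the text, not SINGA's
    actual implementation: a checkpoint is assumed to be written every
    `checkpoint_frequency` steps once `checkpoint_after` steps have passed.
    """
    steps = [s for s in range(1, total_steps + 1)
             if s > checkpoint_after
             and (s - checkpoint_after) % checkpoint_frequency == 0]
    return ["step%d-worker%d.bin" % (s, worker) for s in steps]
```

For the configuration in the example (`checkpoint_after: 100`, `checkpoint_frequency: 300`, 700 training steps), this yields the two files listed above, `step400-worker0.bin` and `step700-worker0.bin`.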

Modified: incubator/singa/site/trunk/content/markdown/docs/cnn.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/cnn.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/cnn.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/cnn.md Wed Sep  2 07:59:20 
2015
@@ -1,65 +1,264 @@
-Title:
-Notice:    Licensed to the Apache Software Foundation (ASF) under one
-           or more contributor license agreements.  See the NOTICE file
-           distributed with this work for additional information
-           regarding copyright ownership.  The ASF licenses this file
-           to you under the Apache License, Version 2.0 (the
-           "License"); you may not use this file except in compliance
-           with the License.  You may obtain a copy of the License at
-           .
-             http://www.apache.org/licenses/LICENSE-2.0
-           .
-           Unless required by applicable law or agreed to in writing,
-           software distributed under the License is distributed on an
-           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-           KIND, either express or implied.  See the License for the
-           specific language governing permissions and limitations
-           under the License.
-
-This example will show you how to use SINGA to train a CNN model using cifar10 
dataset.
-
-### Prepare for the data
-* First go to the `example/cifar10/` folder for preparing the dataset. There 
should be a makefile example called Makefile.example in the folder. Run the 
command `cp Makefile.example Makefile` to generate the makefile.
-Then run the command `make download` and `make create`  in the current folder 
to download cifar10 dataset and prepare for the training and testing datashard. 
-
-### Set job configuration.
-* If you just want to use the training model provided in this example, you can 
just use job.conf file in current directory. Fig. 1 gives an example of CNN 
struture. In this example, we define a CNN model that contains 3 
convolution+relu+maxpooling+normalization layers. 
-If you want to learn more about how it is configured, you can go to [Model 
Configuration](http://singa.incubator.apache.org/docs/model-config.html) to get 
details. 
+---
+layout: post
+title:  Example --- Convolutional Neural Network
+category : docs
+tags : [cnn, example]
+---
+{% include JB/setup %}
+
+
+Convolutional neural network (CNN) is a type of feed-forward artificial neural
+network widely used for image and video classification. In this example, we will
+use a deep CNN model to do image classification for the
+[CIFAR10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html).
+
+
+## Running instructions
+
+Please refer to the [installation](http://singa.incubator.apache.org/docs/installation) page for
+instructions on building SINGA, and the [quick start](http://singa.incubator.apache.org/docs/quick-start)
+for instructions on starting zookeeper.
+
+We have provided scripts for preparing the training and test datasets in *examples/cifar10/*.
+
+    # in examples/cifar10
+    $ cp Makefile.example Makefile
+    $ make download
+    $ make create
+
+
+After the datasets are prepared, we start the training by
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+After it is started, you should see output like
+
+    Record job information to /tmp/singa-log/job-info/job-2-20150817-055601
+    Executing : ./singa -conf /xxx/incubator-singa/examples/cifar10/job.conf 
-singa_conf /xxx/incubator-singa/conf/singa.conf -singa_job 2
+    E0817 06:56:18.868259 33849 cluster.cc:51] proc #0 -> 192.168.5.128:49152 
(pid = 33849)
+    E0817 06:56:18.928452 33871 server.cc:36] Server (group = 0, id = 0) start
+    E0817 06:56:18.928469 33872 worker.cc:134] Worker (group = 0, id = 0) start
+    E0817 06:57:13.657302 33849 trainer.cc:373] Test step-0, loss : 2.302588, 
accuracy : 0.077900
+    E0817 06:57:17.626708 33849 trainer.cc:373] Train step-0, loss : 2.302578, 
accuracy : 0.062500
+    E0817 06:57:24.142645 33849 trainer.cc:373] Train step-30, loss : 
2.302404, accuracy : 0.131250
+    E0817 06:57:30.813354 33849 trainer.cc:373] Train step-60, loss : 
2.302248, accuracy : 0.156250
+    E0817 06:57:37.556655 33849 trainer.cc:373] Train step-90, loss : 
2.301849, accuracy : 0.175000
+    E0817 06:57:44.971276 33849 trainer.cc:373] Train step-120, loss : 
2.301077, accuracy : 0.137500
+    E0817 06:57:51.801949 33849 trainer.cc:373] Train step-150, loss : 
2.300410, accuracy : 0.135417
+    E0817 06:57:58.682281 33849 trainer.cc:373] Train step-180, loss : 
2.300067, accuracy : 0.127083
+    E0817 06:58:05.578366 33849 trainer.cc:373] Train step-210, loss : 
2.300143, accuracy : 0.154167
+    E0817 06:58:12.518497 33849 trainer.cc:373] Train step-240, loss : 
2.295912, accuracy : 0.185417
+
+After training for some steps (depending on the configuration), or when the job
+finishes, SINGA will [checkpoint](http://singa.incubator.apache.org/docs/checkpoint) the model parameters.
+
+## Details
+
+To train a model in SINGA, you need to prepare the datasets,
+and a job configuration which specifies the neural net structure, training
+algorithm (BP or CD), SGD update algorithm (e.g. Adagrad),
+number of training/test steps, etc.
+
+### Data preparation
+Before using SINGA, you need to write a program that pre-processes the dataset you
+use into a format that SINGA can read. Please refer to the
+[Data Preparation](http://singa.incubator.apache.org/docs/data#example---cifar-dataset) page for details on preparing this CIFAR10 dataset.
+
+### Neural net
+
+Figure 1 shows the net structure of the CNN model used in this example, which is
+configured following [this page](https://code.google.com/p/cuda-convnet/source/browse/trunk/example-layers/layers-18pct.cfg).
+The dashed circle represents one feature transformation stage, which generally
+has four layers as shown in the figure. Sometimes the rectifier and normalization layers
+are omitted or swapped within a stage. In this example, there are 3 such stages.
+
+Next we follow the guides in the [neural net page](http://singa.incubator.apache.org/docs/neural-net)
+and the [layer page](http://singa.incubator.apache.org/docs/layer) to write the neural net configuration.
 
 <div style = "text-align: center">
-<img src = "../images/dcnn-cifar10.png" style = "width: 280px"> <br/>Fig. 1: 
CNN example </img>
+<img src = "http://singa.incubator.apache.org/assets/image/cnn-example.png" style = "width: 200px"> <br/>
+<strong>Figure 1 - Net structure of the CNN example.</strong></img>
 </div>
 
+* We configure a [data layer](http://singa.incubator.apache.org/docs/layer#data-layers) to read
+the training/testing `Record`s from the `DataShard`.
 
-### Run SINGA
-* All script of SINGA should be run in the root folder of SINGA.
-First you need to start the zookeeper service if zookeeper is not started. The 
command is `./bin/zk-service start`. 
-Then you can run the command `./bin/singa-run.sh -conf 
examples/cifar10/job.conf` to start a SINGA job using examples/cifar10/job.conf 
as the job configuration.
-After it is started, you should get a screenshots like the following:
-
-
-        xxx@yyy:zzz/incubator-singa$ ./bin/singa-run.sh -conf 
examples/cifar10/job.conf
-        Unique JOB_ID is 2
-        Record job information to /tmp/singa-log/job-info/job-2-20150817-055601
-        Executing : ./singa -conf 
/xxx/incubator-singa/examples/cifar10/job.conf -singa_conf 
/xxx/incubator-singa/conf/singa.conf -singa_job 2
-        E0817 06:56:18.868259 33849 cluster.cc:51] proc #0 -> 
192.168.5.128:49152 (pid = 33849)
-        E0817 06:56:18.928452 33871 server.cc:36] Server (group = 0, id = 0) 
start
-        E0817 06:56:18.928469 33872 worker.cc:134] Worker (group = 0, id = 0) 
start
-        E0817 06:57:13.657302 33849 trainer.cc:373] Test step-0, loss : 
2.302588, accuracy : 0.077900
-        E0817 06:57:17.626708 33849 trainer.cc:373] Train step-0, loss : 
2.302578, accuracy : 0.062500
-        E0817 06:57:24.142645 33849 trainer.cc:373] Train step-30, loss : 
2.302404, accuracy : 0.131250
-        E0817 06:57:30.813354 33849 trainer.cc:373] Train step-60, loss : 
2.302248, accuracy : 0.156250
-        E0817 06:57:37.556655 33849 trainer.cc:373] Train step-90, loss : 
2.301849, accuracy : 0.175000
-        E0817 06:57:44.971276 33849 trainer.cc:373] Train step-120, loss : 
2.301077, accuracy : 0.137500
-        E0817 06:57:51.801949 33849 trainer.cc:373] Train step-150, loss : 
2.300410, accuracy : 0.135417
-        E0817 06:57:58.682281 33849 trainer.cc:373] Train step-180, loss : 
2.300067, accuracy : 0.127083
-        E0817 06:58:05.578366 33849 trainer.cc:373] Train step-210, loss : 
2.300143, accuracy : 0.154167
-        E0817 06:58:12.518497 33849 trainer.cc:373] Train step-240, loss : 
2.295912, accuracy : 0.185417
-
-
-After the training of some steps (depends on the setting) or the job is 
finished, SINGA will checkpoint the current parameter. In the next time, you 
can train (or use for your application) by loading the checkpoint. Please refer 
to [Checkpoint](http://singa.incubator.apache.org/docs/checkpoint.html) for the 
use of checkpoint.
-
-### Build your own model
-* If you want to specify you own model, then you need to decribe  it in the 
job.conf file. It should contain the neurualnet structure, training 
algorithm(backforward or contrastive divergence etc.), SGD update 
algorithm(e.g. Adagrad), number of training/test steps and training/test 
frequency, and display features and etc. SINGA will read job.conf as a Google 
protobuf class [JobProto](../src/proto/job.proto). You can also refer to the 
[Programmer 
Guide](http://singa.incubator.apache.org/docs/programmer-guide.html) to get 
details. 
+        layer{
+            name: "data"
+            type: kShardData
+            sharddata_conf {
+              path: "examples/cifar10/cifar10_train_shard"
+              batchsize: 16
+              random_skip: 5000
+            }
+            exclude: kTest  # exclude this layer for the testing net
+          }
+        layer{
+            name: "data"
+            type: kShardData
+            sharddata_conf {
+              path: "examples/cifar10/cifar10_test_shard"
+              batchsize: 100
+            }
+            exclude: kTrain # exclude this layer for the training net
+          }
+
+* We configure two [parser layers](http://singa.incubator.apache.org/docs/layer#parser-layers)
+to extract the image feature and label from the `Record`s loaded by the *data* layer.
+
+        layer{
+            name:"rgb"
+            type: kRGBImage
+            srclayers: "data"
+            rgbimage_conf {
+              meanfile: "examples/cifar10/image_mean.bin" # normalize image 
feature
+            }
+          }
+        layer{
+            name: "label"
+            type: kLabel
+            srclayers: "data"
+          }
+
+
+* We configure layers for the feature transformation as follows
+(all layers are built-in layers in SINGA; hyper-parameters of these layers are set according to
+[Alex's setting](https://code.google.com/p/cuda-convnet/source/browse/trunk/example-layers/layers-18pct.cfg)).
+
+        layer {
+            name: "conv1"
+            type: kConvolution
+            srclayers: "rgb"
+            convolution_conf {
+              num_filters: 32
+              kernel: 5
+              stride: 1
+              pad:2
+            }
+            param {
+              name: "w1"
+              init {
+                type:kGaussian
+                std:0.0001
+              }
+            }
+            param {
+              name: "b1"
+              lr_scale:2.0
+              init {
+                type: kConstant
+                value:0
+              }
+            }
+          }
+
+          layer {
+            name: "pool1"
+            type: kPooling
+            srclayers: "conv1"
+            pooling_conf {
+              pool: MAX
+              kernel: 3
+              stride: 2
+            }
+          }
+          layer {
+            name: "relu1"
+            type: kReLU
+            srclayers:"pool1"
+          }
+          layer {
+            name: "norm1"
+            type: kLRN
+            lrn_conf {
+              local_size: 3
+              alpha: 5e-05
+              beta: 0.75
+            }
+            srclayers:"relu1"
+          }
+
+  The configurations for the other 2 stages are omitted here.
+
+* There is an [inner product layer](http://singa.incubator.apache.org/docs/layer#innerproductlayer)
+after the 3 transformation stages, which is
+configured with 10 output units, i.e., the total number of labels. The weight
+matrix param is configured with a large weight decay scale to reduce over-fitting.
+
+        layer {
+            name: "ip1"
+            type: kInnerProduct
+            srclayers:"pool3"
+            innerproduct_conf {
+              num_output: 10
+            }
+            param {
+              name: "w4"
+              wd_scale:250
+              init {
+                type:kGaussian
+                std:0.01
+              }
+            }
+            param {
+              name: "b4"
+              lr_scale:2.0
+              wd_scale:0
+              init {
+                type: kConstant
+                value:0
+              }
+            }
+          }
+
+* The last layer is a [Softmax loss layer](http://singa.incubator.apache.org/docs/layer#softmaxloss)
+
+          layer{
+            name: "loss"
+            type: kSoftmaxLoss
+            softmaxloss_conf{
+              topk:1
+            }
+            srclayers:"ip1"
+            srclayers: "label"
+          }
+
+
+### Updater
+The [normal SGD updater](http://singa.incubator.apache.org/docs/updater#updater) is selected.
+The learning rate decreases in steps (like stairs), and is configured using the
+[kFixedStep](http://singa.incubator.apache.org/docs/updater#kfixedstep) type.
+
+    updater{
+      type: kSGD
+      weight_decay:0.004
+      learning_rate {
+        type: kFixedStep
+        fixedstep_conf:{
+          step:0             # lr for step 0-60000 is 0.001
+          step:60000         # lr for step 60000-65000 is 0.0001
+          step:65000         # lr for step 65000 onwards is 0.00001
+          step_lr:0.001
+          step_lr:0.0001
+          step_lr:0.00001
+        }
+      }
+    }
+
+### TrainOneBatch algorithm
+The CNN model is a feed-forward model, and thus should be configured to use the
+[Back-propagation algorithm]({{ BASE_PATH }}/docs/train-one-batch#back-propagation).
+
+    alg: kBP
+
+### Cluster setting
+The following configuration sets a single worker and server for training.
+The [Training frameworks](http://singa.incubator.apache.org/docs/frameworks) page introduces configurations for
+several distributed training frameworks.
+
+    cluster {
+      nworker_groups: 1
+      nserver_groups: 1
+    }
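The kFixedStep schedule configured above can be sketched as follows. This is a hypothetical Python sketch of the semantics implied by the comments in the config (the lr associated with the largest `step` boundary not exceeding the current step); `fixed_step_lr` is not a SINGA function.

```python
import bisect

def fixed_step_lr(step, boundaries, step_lrs):
    """Learning rate under a kFixedStep-style schedule: return the lr paired
    with the largest boundary <= step. A sketch of the behavior implied by
    the config comments above, not SINGA's implementation."""
    i = bisect.bisect_right(boundaries, step) - 1
    return step_lrs[i]

# The boundaries and rates from the updater configuration above.
boundaries = [0, 60000, 65000]
step_lrs = [0.001, 0.0001, 0.00001]
# fixed_step_lr(0, ...) -> 0.001; fixed_step_lr(60000, ...) -> 0.0001
```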
 
 

Modified: incubator/singa/site/trunk/content/markdown/docs/communication.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/communication.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/communication.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/communication.md Wed Sep  
2 07:59:20 2015
@@ -1,6 +1,10 @@
-## Communication
-
-___
+---
+layout: post
+title:  Communication
+category : docs
+tags : [communication]
+---
+{% include JB/setup %}
 
 Different messaging libraries have different benefits and drawbacks. For instance,
 MPI provides fast message passing between GPUs (using GPUDirect), but does not
@@ -22,7 +26,7 @@ example architecture.
 <p><strong> Fig.1 - Example physical architecture and network 
connection</strong></p>
 
 Fig.1 shows an example physical architecture and its network connection.
-[Section-partition server side ParamShard](architecture.html}) has a detailed 
description of the
+[Section-partition server side ParamShard](http://singa.incubator.apache.org/docs/architecture.html) has a detailed description of the
 architecture. Each process consists of one main thread running the stub and 
multiple
 background threads running the worker and server tasks. The stub of the main
 thread forwards messages among threads. The worker and
@@ -142,12 +146,12 @@ The API for the base Msg is:
        *  Returns size of the parsed content.
        */
       int ParseFormatFrame(const char* format, ...);
-    
+
     #ifdef USE_ZMQ
       void ParseFromZmsg(zmsg_t* msg);
       zmsg_t* DumpToZmsg();
     #endif
-    
+
       /**
        * @return msg size in terms of bytes, ignore meta info.
        */
@@ -194,11 +198,11 @@ The API for the base Msg is:
       int trgt_version() const {
         return trgt_version_;
       }
-    
+
     };
 
-In order for a Msg object to be routed, the source and dest address should be 
attached. 
-This is achieved by calling the set_src and set_dst methods of the Msg object. 
+In order for a Msg object to be routed, the source and destination addresses should be attached.
+This is achieved by calling the set_src and set_dst methods of the Msg object.
 The address parameter passed to these two methods can be manipulated via a set 
of
 helper functions, as shown below.
 
@@ -211,7 +215,7 @@ helper functions, shown as below.
     inline int Addr(int grp, int id_or_proc, int type) {
       return (grp << 16) | (id_or_proc << 8) | type;
     }
-    
+
     /**
      * Parse group id from addr.
      *
@@ -229,7 +233,7 @@ helper functions, shown as below.
       static const int mask = (1 << 8) - 1;
       return (addr >> 8) & mask;
     }
-    
+
     /**
      * Parse worker/server procs from addr.
      *
@@ -287,8 +291,8 @@ are:
       virtual void* InternalID() const = 0;
     };
 
-A poller class is provided to enable asynchronous communication between 
routers and dealers. 
-One can register a set of SocketInterface objects with a poller instance via 
calling its Add method, and 
+A poller class is provided to enable asynchronous communication between routers and dealers.
+One can register a set of SocketInterface objects with a poller instance by calling its Add method, and
 then call the Wait method of this poller object to wait for the registered SocketInterface objects to be ready
 for sending and receiving messages. The APIs of the poller class are shown below.
 
@@ -308,7 +312,7 @@ for sending and receiving messages. The
         * queue; nullptr if no message in any sockets,
         */
       SocketInterface* Wait(int duration);
-    
+
       /**
        * @return true if the poller is terminated due to process interrupt
        */
@@ -352,7 +356,7 @@ by connecting the Dealer socket to the e
 #### Router Socket
 
 The Router socket inherits from the base Socket. One Router socket connects to
-at least one Dealer socket. Upon receiving a message, the router forwards it 
to 
+at least one Dealer socket. Upon receiving a message, the router forwards it to
 the appropriate dealer according to the receiver's ID of this message.
 
     class Router : public SocketInterface {
@@ -385,7 +389,7 @@ the appropriate dealer according to the
       int Send(Msg** msg) override;
       Msg* Receive() override;
       void* InternalID() const override;
-    
+
     };
 
 ### Implementation
@@ -452,7 +456,7 @@ where every process has a server group a
 Hence, we can implement DeepImage in Singa by simply using MPI's AllReduce 
function for
 inter-process communication.
 
-<!--
+{% comment %}
 #### Server socket
 
 Each server has a DEALER socket to communicate with the stub in the main
@@ -461,8 +465,7 @@ other servers, and forwarded by the ROUT
 stub, we can make the location of workers transparent to server threads. The
 stub records the locations of workers and servers.
 
-As explained previously in the
-[APIs]({{ BASE_PATH }}{% post_url /docs/2015-03-20-parameter-management %})
+As explained previously in the [APIs](http://singa.incubator.apache.org{% post_url /docs/2015-03-20-parameter-management %})
 for parameter management, some requests may
 not be processed immediately but have to be re-queued. For instance, the Get
 request cannot be processed if the requested parameter is not available, i.e.,
@@ -526,5 +529,4 @@ All messages in SINGA are of multi-frame
   4. Requests originating from another server and arriving at the server have 
the same format as (3), but the first frame identifies the server connection 
(or Server ID).
  5. After a PMServer processes a request, it generates a message in a format similar to (3), but with an extra frame indicating whether the message is to be routed back to a worker (a response message) or to another server (a SYNC request).
  6. When a request is re-queued, the PMServer generates a message and sends it directly to the server's front-end socket. The re-queued request seen by the server's main thread consists of all the frames in (3), followed by a REQUEUED frame, and finally by another frame generated by the ROUTER socket identifying the connection from the PMServer instance. The main thread then strips off these two additional frames before forwarding it to another PMServer instance like an ordinary request.
-
--->
\ No newline at end of file
+{% endcomment %}

Added: incubator/singa/site/trunk/content/markdown/docs/data.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/data.md?rev=1700722&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/data.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/data.md Wed Sep  2 
07:59:20 2015
@@ -0,0 +1,235 @@
+---
+layout: post
+title:  Data Preparation
+category : docs
+tags : [data]
+---
+{% include JB/setup %}
+
+To submit a training job, users need to convert raw data (e.g., images, text
+documents) into SINGA-recognizable [Record](api/classsinga_1_1Record.html)s.
+SINGA uses [data 
layers](http://singa.incubator.apache.org/docs/layer#data-layers)
+to load these records into memory and uses
+[parser layers](http://singa.incubator.apache.org/docs/layer#parser-layers) to 
parse features (e.g.,
+image pixels and labels) from these `Record`s. `Record`s could be
+stored in a file, a database, or HDFS, as
+long as there is a corresponding
+[DataLayer](http://singa.incubator.apache.org/api/classsinga_1_1DataLayer.html).
+
+## DataShard
+
+SINGA comes with a light-weight database named [DataShard](http://singa.incubator.apache.org/api/classsinga_1_1DataShard.html).
+It provides operations for inserting `Record`s and reading `Record`s back in
+sequential order. `Record`s are flushed to disk once the maximum cache size is
+reached. It loads `Record`s in batches and returns them to users one by one
+through the [Next](http://singa.incubator.apache.org/api/classsinga_1_1DataShard.html) function.
+The disk folder in which the `Record`s are stored is called a (data) shard. The
+[ShardDataLayer](http://singa.incubator.apache.org/api/classsinga_1_1ShardDataLayer.html) is a built-in
+layer for loading `Record`s from a `DataShard`.
+
+To create data shards for their own data, users can follow the steps in the subsequent sections.
+
+###  User record definition
+
+Users define their own record for storing their data. E.g., the built-in
+[SingleLabelImageRecord](http://singa.incubator.apache.org/api/classsinga_1_1SingleLabelImageRecord.html)
+has an int field for image label, and a pixel array for image RGB values.
+The code below shows an example of defining a new record `UserRecord`, and 
extending the
+base `Record` to include `UserRecord`.
+
+
+    package singa;
+
+    import "common.proto";  // required to import common.proto
+
+    message UserRecord {
+        repeated int32 userVAR1 = 1;  // unique field id
+        optional string userVAR2 = 2; // unique field id
+        ...
+    }
+
+    extend Record {
+        optional UserRecord user_record = 101;  // unique extension field id, 
reserved for users (e.g., 101-200)
+    }
+
+Please refer to the
+[Tutorial](https://developers.google.com/protocol-buffers/docs/reference/cpp-generated?hl=en#extension)
+for details on extending protocol messages.
+
+The extended `Record` will be parsed by a parser layer to extract features
+(e.g., label or pixel values). Users need to write
+their own [parser 
layers](http://singa.incubator.apache.org/docs/layer#parser-layers) to parse the
+extended `Record`.
+
+{% comment %}
+*Note*
+
+There is an alternative way to define the proto extension.
+In this way, you should be careful of the scope of fields and how to access the
+fields, which are different from the above.
+
+    message UserRecord {
+        extend Record {
+            optional UserRecord user_record = 101;  // unique extension field 
id, reserved for users (e.g., 101-200)
+        }
+        repeated int32 userVAR1 = 1; // unique field id
+        optional string userVAR2 = 2; // unique field id
+        ...
+    }
+{% endcomment %}
+
+
+###  DataShard creation
+
+Users write code to convert their data into `Record`s and insert them into 
shards
+following the subsequent steps.
+
+1. Create a folder *USER_DATA* under *SINGA_ROOT*.
+
+2. Prepare the source file, e.g., `create_shard.cc`, in `SINGA_ROOT/USER_DATA`.
+
+        singa::DataShard myShard(outputpath, kCreate);
+        // outputpath is the path of the folder for storing the shard
+
+    The above code opens a folder for storing the data shard.
+
+        singa::Record record;
+        singa::UserRecord* r = record.MutableExtension(singa::user_record);
+
+    A user-defined record is allocated by the above code.
+
+        r->add_userVAR1( int_val );     // for repeated field
+        r->set_userVAR2( string_val );
+
+    Users load raw data and set/add the values into the user-defined record as shown above.
+
+        // key (string) is a unique record ID (e.g., converted from a number 
starting from 0)
+        myShard.Insert( key, record );
+
+    Once the `record` object is filled, it is inserted into the shard as shown 
above.
+    If there are multiple data records, they should be inserted sequentially.
+    After inserting all records, the shard is created into the `outputpath` 
folder.
+
+3. Compile and link. Both *user.proto* and *create_shard.cc* should be compiled and linked with *libsinga.so*.
+  The following instruction generates *user.pb.cc* and *user.pb.h* from 
*user.proto*.
+
+        protoc -I=SINGA_ROOT/USER_DATA --cpp_out=SINGA_ROOT/USER_DATA 
user.proto
+
+    All code can be compiled and linked into an executable file:
+
+        g++ create_shard.cc user.pb.cc -std=c++11 -lsinga \
+          -ISINGA_ROOT/include -LSINGA_ROOT/.libs/ 
-Wl,-unresolved-symbols=ignore-in-shared-libs \
+          -Wl,-rpath=SINGA_ROOT/.libs/  -o create_shard.bin
+
+
+4. Run the program. Once the executable file is generated, users can run it to 
create data shards.
+
+        ./create_shard.bin  <args>
+
+
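The insert-then-read behaviour described in the steps above can be sketched without the SINGA headers. `MockShard` below is a hypothetical stand-in for `singa::DataShard` (the names `Insert`/`Next` follow the API described on this page, but the signatures are simplified for illustration):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for singa::DataShard (illustration only):
// records are appended under unique string keys and read back
// one by one in insertion order, as the docs describe.
class MockShard {
 public:
  void Insert(const std::string& key, const std::string& record) {
    records_.emplace_back(key, record);
  }
  // Mimics sequential reading via Next(): returns false once exhausted.
  bool Next(std::string* key, std::string* record) {
    if (cursor_ >= records_.size()) return false;
    *key = records_[cursor_].first;
    *record = records_[cursor_].second;
    ++cursor_;
    return true;
  }
  size_t Count() const { return records_.size(); }

 private:
  std::vector<std::pair<std::string, std::string>> records_;
  size_t cursor_ = 0;
};

// Builds a tiny shard the way create_shard.cc does: one serialized
// record per item, keyed by a running integer converted to a string.
inline MockShard BuildShard(int nrecords) {
  MockShard shard;
  for (int id = 0; id < nrecords; ++id)
    shard.Insert(std::to_string(id), "serialized Record " + std::to_string(id));
  return shard;
}
```

Reading the shard back yields the keys in exactly the order they were inserted, which is why multiple records must be inserted sequentially.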
+### Example - CIFAR dataset
+
+This example uses the [CIFAR-10 image 
dataset](http://www.cs.toronto.edu/~kriz/cifar.html) collected by Alex 
Krizhevsky.
+It consists of 60,000 32x32 color images in 10 classes, with 6,000 images per 
class.
+There are 50,000 training images and 10,000 test images.
+Each image has a single label. This dataset is stored in binary files with 
specific format.
+SINGA provides a [create_shard.cc](https://github.com/apache/incubator-singa/blob/master/examples/cifar10/create_shard.cc) program
+to convert images in the binary files into `Record`s and insert them into
+training and test shards.
+
+1. Download raw data. The following command will download the dataset into 
*cifar-10-batches-bin* folder.
+
+        # in SINGA_ROOT/examples/cifar10
+        $ cp Makefile.example Makefile   # an example makefile is provided
+        $ make download
+
+2. Since `Record` already has one `image` field which is designed for
+  single-label images, e.g., images from CIFAR10, we can use it directly.
+  Particularly, the type of `image` is `SingleLabelImageRecord`,
+
+
+        # common.proto
+        package singa;
+
+        message Record {
+          enum Type {
+            kSingleLabelImage = 0;
+          }
+          optional Type type = 1 [default = kSingleLabelImage];
+          optional SingleLabelImageRecord image = 2;   // for configuration
+        }
+
+        message SingleLabelImageRecord {
+          repeated int32 shape = 1;                // e.g., 3 (RGB channels), 32 (rows), 32 (cols)
+          optional int32 label = 2;                // label
+          optional bytes pixel = 3;                // pixels
+          repeated float data = 4 [packed = true]; // it is used for 
normalization
+        }
+
+3. Add/Set data into the record, and insert it to shard.
+    `create_shard.cc` reads images (and labels) from the downloaded binary 
files.
+    For each image, it puts the image feature and label into a 
`SingleLabelImageRecord`
+    of `Record`, and then inserts the `Record` into `DataShard`.
+
+          ...// open binary files
+        DataShard train_shard("cifar10_train_datashard", DataShard::kCreate);
+
+        singa::Record record;
+        singa::SingleLabelImageRecord* image = record.mutable_image();
+        for (int image_id = 0; image_id < 50000; image_id++) {
+              read_image(&data_file, &label, str_buffer);  // read feature and 
label from binary file
+              image->set_label(label);  // put label
+              image->set_pixel(str_buffer);  // put image feature
+              train_shard.Insert(to_string(image_id), record);  // insert a 
record with unique ID
+        }
+
+    The data shard for testing data is created similarly.
+    In addition, it computes average values (not shown here) of image pixels 
as another `Record`
+    which is directly serialized into *SINGA_ROOT/USER_DATA/image_mean.bin*.
+    The mean values will be used for preprocessing image features.
+    {% comment %}
+        for (int itemid = 0; itemid < kCIFARBatchSize; ++itemid) {
+          const string& pixels = image->pixel();
+          for(int i=0; i<kCIFARImageNBytes; i++)
+            mean.set_data(i, mean.data(i)+static_cast<uint8_t>(pixels[i]));
+          count += 1;
+        }
+        for(int i=0; i<kCIFARImageNBytes; i++)
+          mean.set_data(i, mean.data(i)/count);
+    {% endcomment %}
+
+4. Compile and run the program. SINGA provides an example Makefile that 
contains instructions
+    for compiling the source code and linking it with *libsinga.so*. Users 
just execute the following command.
+
+        $ make create
+
+    The data shards for training and testing will be generated into
+    *cifar10_train_shard* and *cifar10_test_shard* folders respectively.
+
+### Example - MNIST dataset
+
+This example creates `DataShard`s for the [MNIST 
dataset](http://yann.lecun.com/exdb/mnist/).
+It has a training set of 60,000 handwritten digit images, and a test set of 
10,000 images.
+Similar to the images of CIFAR10, each MNIST image has a single label. Hence, 
we still
+use the built-in `Record`. The process is almost the same as that for
+the CIFAR10 dataset, except that the MNIST dataset is downloaded as binary 
files with
+another format. SINGA provides a *create_shard.cc* program to parse these binary files
+and convert MNIST images into `Record`s.
+
+The following command will download the dataset
+
+    $ cp Makefile.example Makefile   # an example makefile is provided
+    $ make download
+
+Data shards will be generated into *mnist_train_shard* and *mnist_test_shard* 
by
+
+    $ make create
+
+## LMDB
+
+To be filled soon.
+
+## HDFS
+
+To be filled soon.
+

Modified: 
incubator/singa/site/trunk/content/markdown/docs/distributed-training.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/distributed-training.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/distributed-training.md 
(original)
+++ incubator/singa/site/trunk/content/markdown/docs/distributed-training.md 
Wed Sep  2 07:59:20 2015
@@ -1,14 +1,19 @@
-## Distributed Training
+---
+layout: post
+title:  Distributed Training
+category : docs
+tags : [cluster topology, training frameworks]
+---
+{% include JB/setup %}
 
-___
 
 SINGA is designed for distributed training of large deep learning models with
 huge amount of training data.
 
 Here we introduce distributed SINGA in the following aspects:
 
-* [System Architecture](architecture.html)
+* [System Architecture](http://singa.incubator.apache.org/docs/architecture)
 
-* [Training Frameworks](frameworks.html)
+* [Training Frameworks](http://singa.incubator.apache.org/docs/frameworks)
 
-* [System Communication](communication.html)
+* [System Communication](http://singa.incubator.apache.org/docs/communication)

Modified: incubator/singa/site/trunk/content/markdown/docs/frameworks.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/frameworks.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/frameworks.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/frameworks.md Wed Sep  2 
07:59:20 2015
@@ -1,6 +1,11 @@
-## Distributed Training Frameworks
+---
+layout: post
+title:  Distributed Training Frameworks
+category : docs
+tags : [rnn, example]
+---
+{% include JB/setup %}
 
-___
 
 ### Cluster Topology Configuration
 
@@ -16,10 +21,10 @@ The `cluster` is of type `ClusterProto`:
       optional int32 nservers_per_group = 4 [default = 1];
       optional int32 nworkers_per_procs = 5 [default = 1];
       optional int32 nservers_per_procs = 6 [default = 1];
-    
+
       // servers and workers in different processes?
       optional bool server_worker_separate = 20 [default = false];
-      
+
       ......
     }
 
@@ -42,7 +47,7 @@ both **synchronous** and **asynchronous*
 Here we illustrate how to configure
 popular distributed training frameworks in SINGA.
 
-<img src="../images/distributed/frameworks.png" style="width: 800px"/>
+<img src="http://singa.incubator.apache.org/assets/image/frameworks.png" style="width: 800px"/>
 <p><strong> Fig.1 - Training frameworks in SINGA</strong></p>
 
 ####Sandblaster
@@ -50,7 +55,7 @@ popular distributed training frameworks
 This is a **synchronous** framework used by Google Brain.
 Fig.2(a) shows the Sandblaster framework implemented in SINGA.
 Its configuration is as follows:
-    
+
     cluster {
         nworker_groups: 1
         nserver_groups: 1
@@ -63,13 +68,13 @@ A single server group is launched to han
 A worker computes on its partition of the model,
 and only communicates with servers handling related parameters.
 
-    
+
 ####AllReduce
 
 This is a **synchronous** framework used by Baidu's DeepImage.
 Fig.2(b) shows the AllReduce framework implemented in SINGA.
 Its configuration is as follows:
-    
+
     cluster {
         nworker_groups: 1
         nserver_groups: 1
@@ -87,7 +92,7 @@ collecting updates from all other nodes.
 This is an **asynchronous** framework used by Google Brain.
 Fig.2(c) shows the Downpour framework implemented in SINGA.
 Its configuration is as follows:
-    
+
     cluster {
         nworker_groups: 2
         nserver_groups: 1
@@ -106,7 +111,7 @@ from the last *update* response.
 This is an **asynchronous** framework used by Caffe.
 Fig.2(d) shows the Distributed Hogwild framework implemented in SINGA.
 Its configuration is as follows:
-    
+
     cluster {
         nworker_groups: 3
         nserver_groups: 3
@@ -118,5 +123,5 @@ Its configuration is as follows:
 Each node contains a complete server group and a complete worker group.
 Parameter updates are done locally, so that communication cost
 during each training step is minimized.
-However, the server group must periodically synchronize with 
+However, the server group must periodically synchronize with
 neighboring groups to improve the training convergence.

Modified: incubator/singa/site/trunk/content/markdown/docs/installation.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/installation.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/installation.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/installation.md Wed Sep  2 
07:59:20 2015
@@ -1,20 +1,22 @@
-## Installation
+---
+layout: post
+title: Installation
+category: docs
+---
+{% include JB/setup %}
 
-___
 
 ### Dependencies
 
-SINGA is developed and tested on Linux platforms with the following external 
libraries.
+SINGA is developed and tested on Linux platforms.
 
 The following dependencies are required:
 
-  * gflags version 2.1.1, use the default setting for namespace (i.e., gflags).
+  * glog version 0.3.3
 
-  * glog version 0.3.3.
+  * google-protobuf version 2.6.0
 
-  * google-protobuf version 2.6.0.
-
-  * openblas version >= 0.2.10.
+  * openblas version >= 0.2.10
 
   * zeromq version >= 3.2
 
@@ -25,51 +27,72 @@ The following dependenies are required:
 
 Optional dependencies include:
 
-  * gtest version 1.7.0.
+  * gtest version 1.7.0
 
-  * opencv version 2.4.9.
+  * opencv version 2.4.9
 
   * lmdb version 0.9.10
 
 
-Tips:
-For libraries like openblas, opencv, older versions may also work, because we 
do not use any newly added features.
+SINGA comes with a script for installing the external libraries (see below).
+
+### Building SINGA from source
+
+SINGA is built using GNU autotools. GCC (version >= 4.8) is required.
+There are three ways to build SINGA:
+
+  * If you want to use the latest code, please clone it from
+  [Github](https://github.com/apache/incubator-singa.git) and execute
+  the following commands,
+
+        $ git clone git@github.com:apache/incubator-singa.git
+        $ cd incubator-singa
+        $ ./autogen.sh
+        $ ./configure
+        $ make
+
+
+  * If you download a release package, please follow the instructions below,
+
+        $ tar xvf singa-xxx
+        $ cd singa-xxx
+        $ ./configure
+        $ make
+
+    Some features of SINGA depend on external libraries. These features can be
+    compiled with `--enable-<feature>`.
+    For example, to build SINGA with lmdb support, you can run:
 
+        $ ./configure --enable-lmdb
 
-### Building SINGA From Source
 
-The build system of SINGA is based on GNU autotools. To build singa, you need 
gcc version >= 4.8.
-The common steps to build SINGA can be:
+  * In case you do not have the GNU autotools needed to run `autogen.sh`, SINGA
+  provides a *Makefile.example* file, which can be used as follows,
 
-       1.Extract source files;
-       2.Run configure script to generate makefiles;
-       3.Build and install SINGA.
+        $ cp Makefile.example Makefile
+        $ make
 
-On Unix-like systems with GNU Make as build tool, these build steps can be 
summarized by the following sequence of commands executed in a shell.
+    Code depending on lmdb can be added into the compilation by
 
-    $ cd SINGA/FOLDER
-    $ ./configure
-    $ make
-    $ make install
+        make -DUSE_LMDB
 
-After executing above commands, SINGA library will be installed in the system 
default directory.
-If you want to specify your own installation directory, use the following 
command instead.
 
-    $ ./configure --prefix=/YOUR/OWN/FOLDER
+After compiling SINGA successfully, `libsinga.so` will be generated into the
+*.libs/* folder and an executable file `singa` will be generated under *bin/*.
 
-The result of configure script will indicate you whether there exist 
dependency missings in your system.
-If you do not install the dependencies, you can run the following commands.
-To download & install the thirdparty dependencies:
+If some dependent libraries are missing (or not detected), you can use the
+following script to download and install them:
 
     $ cd thirdparty
     $ ./install.sh MISSING_LIBRARY_NAME1 YOUR_INSTALL_PATH1 
MISSING_LIBRARY_NAME2 YOUR_INSTALL_PATH2 ...
 
-If you do not specify the installation path, the library will be installed in 
default folder.
-For example, if you want to build zeromq library in system folder and gflags 
in /usr/local, just run:
+If you do not specify the installation path, the library will be installed in
+the default folder specified by the software itself.  For example, if you want
+to build `zeromq` library in system folder and `gflags` in `/usr/local`, just 
run:
 
     $ ./install.sh zeromq gflags /usr/local
 
-Another example can be to install all dependencies in /usr/local directory:
+You can also install all dependencies in `/usr/local` directory:
 
     $ ./install.sh all /usr/local
 
@@ -87,34 +110,36 @@ Here is a table showing the first argume
     zeromq                zeromq lib
     zookeeper             Apache zookeeper
 
-*: Since czmq depends on zeromq, the script offers you one more argument to 
indicate zeromq location.
-The installation commands of czmq can be:
+*: Since `czmq` depends on `zeromq`, the script offers one more argument to
+indicate the `zeromq` location.
+The installation command for `czmq` is:
 
     $ ./install.sh czmq /usr/local /usr/local/zeromq
 
-After the execution, czmq will be installed in /usr/local while zeromq is 
installed in /usr/local/zeromq.
+After the execution, `czmq` will be installed in `/usr/local` while `zeromq` is
+installed in `/usr/local/zeromq`.
 
 ### FAQ
 
-Q1:While compiling Singa and installing glog on max OS X, I get fatal error
-"'ext/slist' file not found"
+Q1: While compiling SINGA and installing `glog` on Mac OS X, I get the fatal error
+`'ext/slist' file not found`.
 
-A1:You may install glog individually and try command :
+A1: Please install `glog` individually and try:
 
     $ make CFLAGS='-stdlib=libstdc++' CXXFLAGS='-stdlib=libstdc++'
 
 
-Q2:While compiling Singa, I get error "SSE2 instruction set not enabled"
+Q2: While compiling SINGA, I get the error `SSE2 instruction set not enabled`.
 
 A2: You can try the following command:
 
     $ make CFLAGS='-msse2' CXXFLAGS='-msse2'
 
-Q3:I get error "./configure --> cannot find blas_segmm() function" even I
-run "install.sh OpenBLAS".
+Q3: I get the error `./configure --> cannot find blas_segmm() function` even though I
+run `install.sh OpenBLAS`.
 
-A3:Since OpenBLAS library is installed in /opt folder by default or
-/other/folder by your preference, you may edit your environment settings.
+A3: Since the `OpenBLAS` library is installed in the `/opt` folder by default, or in
+`/other/folder` of your preference, you may need to edit your environment settings.
 You need to add its installation directory before linking; just
 run:
 
@@ -123,8 +148,8 @@ run:
 Alternatively, you can edit LIBRARY_PATH to include the OpenBLAS installation directory.
 
 
-Q4:I get ImportError from google.protobuf.internal when I try to import .py
-files. (ImportError: cannot import name enum_type_wrapper)
+Q4: I get `ImportError: cannot import name enum_type_wrapper` from
+google.protobuf.internal when I try to import .py files.
 
 A4: After installing Google protobuf via "make install", you should also install the python
 runtime libraries. Go to the protobuf source directory and run:
@@ -134,5 +159,5 @@ runtime libraries. Go to protobuf source
     $ python setup.py build
     $ python setup.py install
 
-You may need "sudo" when you try to install python runtime libraries in
-system folder.
+You may need `sudo` when you try to install python runtime libraries in
+the system folder.

Modified: incubator/singa/site/trunk/content/markdown/docs/layer.md
URL: 
http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/layer.md?rev=1700722&r1=1700721&r2=1700722&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/layer.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/layer.md Wed Sep  2 
07:59:20 2015
@@ -1,333 +1,500 @@
-# Layers Instruction
-### ShardData Layer
-ShardData layer is used to read data from disk etc.
-
-       layer   
-       {
-               name:"data"
-               type:"kShardData"
-               data_param
-               {
-                       path:"Shard_File_Path"
-                       batchsize:int
-               }
-               exclude:kTrain|kValidation|kTest|kPositive|kNegative
-       }
-
-
-### Label Layer
-Label layer is used to extract the label information from training data.
-The label information will be used in the loss layer to calculate the gradient.
- 
-    layer
-    {
-       name:"label"
-       type:"kLabel"
-       srclayers:"data"
-    }
-
-### Convolution Layer  
-Convolution layer is a basic layer used in constitutional neural net. 
-It is used to extract local feature following some local patterns from slide 
windows in the image.
-
-    layer
-    {
-       name:"Conv_Number"
-       type:"kConvolution"
-       srclayers:"Src_Layer_Name"
-       convolution_param
-       {
-               num_filters:int
-               //the count of the applied filters
-               kernel:int
-               //convolution kernel size
-               stride:int
-               //the distance between the successive filters
-               pad:int
-               //pad the images with a given int number of pixels border of 
zeros
-       }
-       param
-       {
-               name:"weight"
-               
init_method:kGaussian|kConstant:kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut
-               /*use specific param of each init methods*/
-               learning_rate_multiplier:float
-       }
-       param
-       {
-               name:"bias"
-               
init_method:kConstant|kGaussian|kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut
-               /**use specific param of each init methods**/
-               learning_rate_multiplier:float
-       }
-       //kGaussian: sample Gaussian with std and mean
-       //kUniform: uniform sampling between low and high
-       //kPretrained: from Toronto Convnet, let a=1/sqrt(fan_in),w*=a after 
generating from Gaussian distribution
-       //kGaussianSqrtFanIn: from Toronto Convnet, rectified linear 
activation, 
-               //let a=sqrt(3)/sqrt(fan_in),range is [-a,+a].
-               //no need to set value=sqrt(3),the program will multiply it
-       //kUniformSqrtFanIn: from Theano MLP tutorial, let 
a=1/sqrt(fan_in+fan_out).
-               //for tanh activation, range is [-6a,+6a], for sigmoid 
activation.
-               // range is [-24a,+24a],put the scale factor to value field
-       //For Constant Init, use value:float
-       //For Gaussian Init, use mean:float, std:float
-       //For Uniform Init, use low:float, high:float
+---
+layout: post
+title: Layer
+category : docs
+tags : [layer ]
+---
+{% include JB/setup %}
+Layer is a core abstraction in SINGA. It performs a variety of feature
+transformations for extracting high-level features, e.g., loading raw features,
+parsing RGB values, doing convolution transformation, etc.
+
+The *Basic user guide* section introduces the configuration of a built-in
+layer. *Advanced user guide* explains how to extend the base Layer class to
+implement users' functions.
+
+## Basic user guide
+
+### Layer configuration
+
+The configurations of three layers from the [MLP example](http://singa.incubator.apache.org/docs/mlp) are shown below,
+
+    layer {
+      name: "data"
+      type: kShardData
+      sharddata_conf { }
+      exclude: kTest
+      partition_dim : 0
+    }
+    layer{
+      name: "mnist"
+      type: kMnist
+      srclayers: "data"
+      mnist_conf { }
+    }
+    layer{
+      name: "fc1"
+      type: kInnerProduct
+      srclayers: "mnist"
+      innerproduct_conf{ }
+      param{ }
+      param{ }
+    }
+
+There are some common fields for all kinds of layers:
+
+  * `name`: a string used to differentiate two layers.
+  * `type`: an integer used for identifying a Layer subclass. The types of 
built-in
+  layers are listed in LayerType (defined in job.proto).
+  For user-defined layer subclasses, the string field `user_type` should be used instead of `type`.
+  The detail is explained in the [last section](#newlayer) of this page.
+  * `srclayers`: one or more layer names, for identifying the source layers.
+  In SINGA, all connections are 
[converted](http://singa.incubator.apache.org/docs/neural-net) to directed 
connections.
+  * `exclude`: an enumerated value of type Phase, which can be one of {kTest, kValidation,
+  kTrain}. It is used to filter out this layer when creating the
+  [NeuralNet](http://singa.incubator.apache.org/docs/neural-net) for the excluded phase. E.g.,
+  the "data" layer would be filtered out when creating the NeuralNet instance for the test phase.
+  * `param`: configuration for a 
[Param](http://singa.incubator.apache.org/docs/param) instance.
+  There can be multiple Param objects in one layer.
+  * `partition_dim`: integer value indicating the partition dimension of this
+  layer. -1 (the default value) for no partitioning, 0 for partitioning on 
batch dimension, 1 for
+  partitioning on feature dimension. It is used by
+  [CreateGraph](http://singa.incubator.apache.org/docs/neural-net) for 
partitioning the neural net.
+
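To make `partition_dim` concrete, the sketch below shows how the shape of an n x m blob (batch rows x feature columns) divides across workers. `PartitionShape` is a hypothetical helper illustrating the shape arithmetic, not SINGA's actual partitioning code:

```cpp
#include <cassert>
#include <utility>

// Shape of one worker's partition of an n x m blob (rows = batch dim 0,
// cols = feature dim 1) when split evenly across `nworkers` workers.
// partition_dim == -1 means no partitioning. Illustration only.
inline std::pair<int, int> PartitionShape(int n, int m, int partition_dim,
                                          int nworkers) {
  if (partition_dim == 0) return {n / nworkers, m};  // split batch dimension
  if (partition_dim == 1) return {n, m / nworkers};  // split feature dimension
  return {n, m};                                     // -1: keep the whole blob
}
```

E.g., a 4x6 blob split across 2 workers becomes 2x6 per worker on dimension 0, or 4x3 per worker on dimension 1.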
+Different layers may have different configurations. These configurations
+are defined in `<type>_conf`. E.g., the "data" layer has `sharddata_conf` and the "fc1" layer has
+`innerproduct_conf`. The subsequent sections
+explain the functionality of each built-in layer and how to configure it.
+
+### Built-in Layer subclasses
+SINGA has provided many built-in layers, which can be used directly to create 
neural nets.
+These layers are categorized according to their functionalities,
+
+  * Data layers for loading records (e.g., images) from disk, HDFS or network into memory.
+  * Parser layers for parsing features, labels, etc. from records, into 
[Blob](http://singa.incubator.apache.org/api/classsinga_1_1Blob.html).
+  * Neuron layers for feature transformation, e.g., 
[convolution](http://singa.incubator.apache.org/api/classsinga_1_1ConvolutionLayer.html),
 
[pooling](http://singa.incubator.apache.org/api/classsinga_1_1PoolingLayer.html),
 dropout, etc.
+  * Loss layers for measuring the training objective loss, e.g., cross-entropy loss or Euclidean loss.
+  * Output layers for outputting the prediction results (e.g., probabilities 
of each category) onto disk or network.
+  * Connection layers for connecting layers when the neural net is partitioned.
+
+####Data Layers
+
+Data layers load training/testing data and convert them into 
[Record](http://singa.incubator.apache.org/docs/data)s, which
+are parsed by parser layers. The data source can be a disk file, HDFS, a database, or the network.
+
+##### ShardDataLayer
+
+[ShardDataLayer](http://singa.incubator.apache.org/api/classsinga_1_1ShardDataLayer.html) is used to read data from a disk file. The file should be created
+using 
[DataShard](http://singa.incubator.apache.org/api/classsinga_1_1DataShard.html) 
class. With the data file prepared, users configure the layer as
+
+    type: kShardData
+    sharddata_conf {
+      path: "path to data shard folder"
+      batchsize: int
+      random_skip: int
+    }
+
+`batchsize` specifies the number of records in one mini-batch.
+The first `rand() % random_skip` `Record`s will be skipped at the first
+iteration. This ensures that different workers work on different `Record`s.
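A hypothetical sketch of that skip rule (`SkipCount` is illustrative only; the real layer keeps its own random state):

```cpp
#include <cassert>
#include <cstdlib>

// Number of Records skipped at the first iteration: rand() % random_skip.
// With random_skip == 0, nothing is skipped (illustration only).
inline int SkipCount(int random_skip) {
  return random_skip > 0 ? std::rand() % random_skip : 0;
}
```

Because each worker draws its own random offset, workers start reading the shard at different positions.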
+
+##### LMDBDataLayer
+
+LMDBDataLayer is similar to ShardDataLayer, except that the `Record`s are
+loaded from LMDB.
+
+    type: kLMDBData
+    lmdbdata_conf {
+      path: "path to LMDB folder"
+      batchsize: int
+      random_skip: int
+    }
+
+#### Parser Layers
+
+Parser layers get a vector of Records from data layers and parse features into
+a Blob.
+
+    virtual void ParseRecords(Phase phase, const vector<Record>& records, Blob<float>* blob) = 0;
+
+
+##### LabelLayer
+
+[LabelLayer](http://singa.incubator.apache.org/api/classsinga_1_1LabelLayer.html) parses a single label from each Record. Consequently, it
+will put $b$ (mini-batch size) values into the Blob. It has no specific configuration fields.
+
+
+##### MnistImageLayer
+[MnistImageLayer] parses the pixel values of each image from the MNIST 
dataset. The pixel
+values may be normalized as `x/norm_a - norm_b`. For example, if `norm_a` is
+set to 255 and `norm_b` is set to 0, then every pixel will be normalized into
+[0, 1].
+
+    type: kMnistImage
+    mnistimage_conf {
+      norm_a: float
+      norm_b: float
+    }
+
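The normalization `x/norm_a - norm_b` can be sketched as a tiny standalone function (`NormalizePixel` is a hypothetical helper for illustration, not part of SINGA's API):

```cpp
#include <cassert>

// Normalize a raw pixel value as described above: x/norm_a - norm_b.
// Illustrative helper, not SINGA code. With norm_a = 255 and norm_b = 0,
// a pixel in [0, 255] maps into [0, 1].
float NormalizePixel(float x, float norm_a, float norm_b) {
  return x / norm_a - norm_b;
}
```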
+##### RGBImageLayer
+[RGBImageLayer](http://singa.incubator.apache.org/api/classsinga_1_1RGBImageLayer.html) parses the RGB values of one image from each Record. It may also
+apply transformations, e.g., cropping and mirroring. If the
+`meanfile` is specified, it should point to a path that contains one Record for
+the mean of each pixel over all training images.
+
+    type: kRGBImage
+    rgbimage_conf {
+      scale: float
+      cropsize: int  # cropping each image to keep the central part with this 
size
+      mirror: bool  # mirror the image by setting image[i,j]=image[i,len-j]
+      meanfile: "Image_Mean_File_Path"
+    }
+{% comment %}
+#### PrefetchLayer
+
+[PrefetchLayer](http://singa.incubator.apache.org/api/classsinga_1_1PrefetchLayer.html)
 embeds data layers and parser layers to do data prefetching.
+It will launch a thread to call the data layers and parser layers to load and 
extract features.
+It ensures that the I/O task and computation task can work simultaneously.
+One example PrefetchLayer configuration is,
+
+    layer {
+      name: "prefetch"
+      type: kPrefetch
+      sublayers {
+        name: "data"
+        type: kShardData
+        sharddata_conf { }
+      }
+      sublayers {
+        name: "rgb"
+        type: kRGBImage
+        srclayers:"data"
+        rgbimage_conf { }
+      }
+      sublayers {
+        name: "label"
+        type: kLabel
+        srclayers: "data"
+      }
+      exclude:kTest
+    }
+
+The layers on top of the PrefetchLayer should use the names of the embedded
+layers as their source layers. For example, "rgb" and "label" should be
+configured as the `srclayers` of other layers.
+
+{% endcomment %}
+
+#### Neuron Layers
+
+Neuron layers conduct feature transformations.
+
+##### ConvolutionLayer
+
+[ConvolutionLayer](http://singa.incubator.apache.org/api/classsinga_1_1ConvolutionLayer.html)
 conducts convolution transformation.
+
+    type: kConvolution
+    convolution_conf {
+      num_filters: int
+      kernel: int
+      stride: int
+      pad: int
     }
- 
-Input:n * c_i * h_i * w_i
+    param { } # weight/filter matrix
+    param { } # bias vector
+
+`num_filters` specifies the number of filters to apply; `kernel` specifies the
+convolution kernel size (equal width and height); `stride` specifies the
+step between successive applications of the filter; `pad` specifies the number
+of pixels of zero-padding added to each border.
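As a sanity check, the output height (and width) of a convolution can be computed from these fields. The helper below follows the common convention `out = (in + 2*pad - kernel)/stride + 1` (it is an illustrative sketch, not SINGA code):

```cpp
#include <cassert>

// Output size of a convolution along one spatial dimension:
//   out = (in + 2*pad - kernel) / stride + 1
// Illustrative helper following the common convention, not SINGA code.
int ConvOutSize(int in, int pad, int kernel, int stride) {
  return (in + 2 * pad - kernel) / stride + 1;
}
```

For example, a 32x32 input with `kernel: 5`, `pad: 2`, `stride: 1` keeps its 32x32 spatial size.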
+
+##### InnerProductLayer
+
+[InnerProductLayer](http://singa.incubator.apache.org/api/classsinga_1_1InnerProductLayer.html)
 is fully connected with its (single) source layer.
+Typically, it has two parameter fields, one for the weight matrix and the other
+for the bias vector. It transforms the feature of the source layer by
+multiplying it with the weight matrix, and shifts it by adding the bias vector.
+
+    type: kInnerProduct
+    innerproduct_conf {
+      num_output: int
+    }
+    param { } # weight matrix
+    param { } # bias vector
+
+
+##### PoolingLayer
+
+[PoolingLayer](http://singa.incubator.apache.org/api/classsinga_1_1PoolingLayer.html)
 down-samples the
+feature vectors from the source layer, by averaging or taking the maximum over local regions.
+
+    type: kPooling
+    pooling_conf {
+      pool: AVE|MAX // choose average pooling or max pooling
+      kernel: int   // size of the kernel filter
+      pad: int      // the padding size
+      stride: int   // the step length of the filter
+    }
+
+The pooling layer supports two methods, average pooling and max pooling,
+selected via the enum values AVE and MAX.
+
+  * Max pooling takes the maximum value in each pooling window as one point of
+  the resulting feature blob.
+  * Average pooling takes the average of all values in each pooling window as
+  one point of the resulting feature blob.
+
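The two methods can be illustrated with a minimal sketch over a single pooling window (`MaxPool` and `AvgPool` are hypothetical helpers, not SINGA's implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Max pooling keeps the largest value in the window.
// Illustrative helper, not SINGA's implementation.
float MaxPool(const std::vector<float>& window) {
  return *std::max_element(window.begin(), window.end());
}

// Average pooling keeps the mean of all values in the window.
float AvgPool(const std::vector<float>& window) {
  float sum = std::accumulate(window.begin(), window.end(), 0.0f);
  return sum / window.size();
}
```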
+##### ReLULayer
+
+[ReLULayer](http://singa.incubator.apache.org/api/classsinga_1_1ReLULayer.html) has rectified linear neurons, which conduct the
+transformation `f(x) = max(0, x)`. It has no specific configuration fields.
+
+##### TanhLayer
 
-Output:n * c_o * h_o * w_o,h_o = (h_i + 2 * pad_h - kernel_h) /stride_h + 1
+[TanhLayer](http://singa.incubator.apache.org/api/classsinga_1_1TanhLayer.html)
 uses the tanh as activation function, i.e., `f(x)=tanh(x)`.
+It has no specific configuration fields.
 
-### Dropout Layer
-Dropout Layer is a layer that randomly dropout some inputs.
+##### SigmoidLayer
+
+[SigmoidLayer] uses the sigmoid (or logistic) as activation function, i.e.,
+`f(x)=sigmoid(x)`. It has no specific configuration fields.
+
+
+##### DropoutLayer
+[DropoutLayer](http://singa.incubator.apache.org/api/classsinga_1_1DropoutLayer.html) randomly drops some inputs.
 This scheme helps prevent deep learning models from over-fitting.
 
-### InnerProduct Layer  
-InnerProduct Layer is a fully connected layer which is the basic element in 
feed forward neural network.
-It will use the lower layer as a input vector V and output a vector H by doing 
the following matrix-vector multiplication:
-
-H = W*V + B // W and B are its weight and bias parameter
-
-    layer
-    {
-       name:"IP_Number"
-       type:"kInnerProduct"
-       srclayers:"Src_Layer_Name"
-       inner_product_param
-       {
-               num_output:int
-               //The number of the filters
-       }
-       param
-       {
-               name:"weight"
-               
init_method:kGaussian|kConstant:kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut
-               std:float
-               //
-               learning_rate_multiplier:float
-               //                              
-               weight_decay_multiplier:int
-               //                      
-               /*low:float,high:float*/
-               //
-       }
-       param
-       {
-               name:"bias"             
-               
init_method:kConstant|kGaussian|kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut
                          
-               learning_rate_mulitiplier:float
-               //                              
-               weight_decay_multiplier:int
-               //
-               value:int
-               //                              
-               /*low:float,high:float*/
-               //
-       }
-    }
-
-       Input:n * c_i * h_i * w_i
-       Output:n * c_o * 1 *1
-
-### LMDBData Layer  
-This is a data input layer, the data will be provided by the LMDB.
-
-    layer
-    {
-       name:"data"
-       type:"kLMDBDate"
-       data_param
-       {
-               path:"LMDB_FILE_PATH"
-               batchsize:int
-            //batchsize means the quantity of the input disposable  
-       }
-       exclude:kTrain|kValidation|kTest|kPositive|kNegative
-    }
-
-
-### LRN Layer  
-
-Local Response Normalization normalizes over the local input areas. 
-It provides two modes: WITHIN_CHANNEL and ACROSS_CHANNELS. 
-The local response normalization layer performs a kind of “lateral 
inhibition” by normalizing over local input regions. 
-In ACROSS_CHANNELS mode, the local regions extend across nearby channels, 
-but have no spatial extent (i.e., they have shape local_size x 1 x 1). 
-In WITHIN_CHANNEL mode, the local regions extend spatially, 
-but are in separate channels (i.e., they have shape 1 x local_size x 
local_size). 
-Each input value is divided by ![](http://i.imgur.com/GgTjjtR.png), 
-where n is the size of each local region, and the sum is taken over the region 
centered at that value 
-(zero padding is added where necessary).
-
-
-    layer
-    {
-       name:"Norm_Number"
-       type:"kLRN"
-       lrn_param
-       {
-               norm_region:WITHIN_CHANNEL|ACROSS_CHANNELS
-               local_size:int
-                       //for WITHIN_CHANNEL, it means the side length of the 
space region which will be summed up
-                       //for ACROSS_CHANNELS, it means the quantity of the 
adjoining channels which will be summed up
-               alpha:5e-05
-               beta:float
-       }
-       srclayers:"Src_Layer_Name"
-    }
-
-### MnistImage Layer
-
-MnistImage is a pre-processing layer for MNIST dataset.
-
-    layer
-    {
-       name:"mnist"
-       type:"kMnistImage"
-       srclayers:"data"
-       mnist_param
-       {
-               sigma:int
-               alpha:int
-               gamma:int
-               kernel:int
-               elastic_freq:int
-               beta:int
-               resize:int
-               norm_a:int
-       }
-    }
-
-### Pooling Layer  
-Max Pooling uses a specific scanning window to find the max value.  
-Average Pooling scans all the values in the window to calculate the average 
value.
-
-    layer
-    {
-       name:"Pool_Number"
-       type:"kPooling"
-       srclayers:"Src_Layer_Name"
-       pooling_param
-       {
-               pool:AVE|MAX
-               //Choose whether use the Average Pooling or Max Pooling
-               kernel:int
-                       //size of the kernel filter
-               stride:int
-                       //the step length of the filter
-       }
-    }
-
-### ReLU Layer  
-  
-The rectifier function is an activation function f(x) = Max(0, x) 
-which can be used by neurons just like any other activation function, 
-a node using the rectifier activation function is called a ReLu node. 
-The main reason that it is used is because of how efficiently it can be 
computed compared to more conventional activation functions like the sigmoid 
and hyperbolic tangent, 
-without making a significant difference to generalization accuracy. 
-The rectifier activation function is used instead of a linear activation 
function to add non linearity to the network, 
-otherwise the network would only ever be able to compute a linear function.
-
-    layer
-    {
-       name:"Relu_Number"
-       type:"kReLU"
-       srclayers:"Src_Layer_Name"
-    }
-
-### RGBImage Layer 
-
-RGBImage layer is a pre-processing layer for RGB format images. 
-
-    layer
-    {
-       name:"rgb"
-       type:"kRGBImage"
-       srclayers:"data"
-       rgbimage_param
-       {
-               meanfile:"Image_Mean_File_Path"
-       }
-    }
-
-### Tanh Layer  
-Tanh uses the tanh as activation function. It transforms the input into range 
[-1, 1] using Tanh function.  
-
-    layer
-    {
-       name:"Tanh_Number"
-       type:"kTanh"
-       srclayer:"Src_Layer_Name"
+    type: kDropout
+    dropout_conf {
+      dropout_ratio: float # dropout probability
+    }
+
+##### LRNLayer
+[LRNLayer](http://singa.incubator.apache.org/api/classsinga_1_1LRNLayer.html) (Local Response Normalization) normalizes over nearby channels.
+
+    type: kLRN
+    lrn_conf {
+      local_size: int
+      alpha: float  // scaling parameter
+      beta: float   // exponential number
     }
 
-### SoftmaxLoss Layer  
-Softmax Loss Layer is the implementation of multi-class softmax loss function.
+`local_size` specifies the number of adjoining channels that will be summed over.
+{% comment %}
+ For `WITHIN_CHANNEL`, it means the side length of the space region which will 
be summed up.
+{% endcomment %}
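For reference, cross-channel response normalization as popularized by AlexNet divides each input value by a sum over `local_size` adjoining channels; SINGA's exact constants may differ slightly from this common formulation:

```latex
b^{i}_{x,y} = a^{i}_{x,y} \Big/
  \Big(1 + \tfrac{\alpha}{n} \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)}
  \big(a^{j}_{x,y}\big)^{2}\Big)^{\beta}
```

where $n$ is `local_size`, $N$ is the number of channels, $\alpha$ is the scaling parameter, and $\beta$ is the exponent.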
+
+#### Loss Layers
+
+Loss layers measure the training objective loss.
+
+##### SoftmaxLossLayer
+
+[SoftmaxLossLayer](http://singa.incubator.apache.org/api/classsinga_1_1SoftmaxLossLayer.html)
 is a combination of the Softmax transformation and
+Cross-Entropy loss. It first applies Softmax to get a prediction probability
+for each output unit (neuron), and then computes the cross-entropy against the ground truth.
 It is generally used as the final layer to generate labels for classification 
tasks.
 
-    layer
-    {
-       name:"loss"
-       type:"kSoftmaxLoss"
-       softmaxloss_param
-       {
-               topk:int
-       }
-       srclayers:"Src_Layer_Name"
-       srclayers:"Src_Layer_Name"
-    }
-
-### BridgeSrc & BridgeDst Layer  
-
-BridgeSrc & BridgeDst Layer are utility layers implementing logics of model 
partition.
-It can be used as a lock for synchronization, a transformation storage of 
different type of model partition and etc.
-
-### Concate Layer  
-Concat Layer is used to concatenate the last dimension (namely, num_feature) 
of the output of two nodes. It is usually used along with fully connected layer.
-
-### Parser Layer 
-
-Parser Layer will parse the input records into Blobs. 
-
-### Prefetch Layer
-
-Prefetch Layer is used to pre-fetch data from disk. 
-It ensures that the I/O task and computation/communication task can work 
simultaneously. 
-
-    layer
-    {
-       name:"prefetch"
-       type:"kPrefetch"
-       sublayers
-       {
-               name:"data"
-               type:"kShardData"
-               data_param
-               {
-                       path:"Shard_File_Path"
-                       batchsize:int
-               }
-       }
-       sublayers
-       {
-               name:"rgb"
-               type:"kRGBImage"
-               srclayers:"data"
-               rgbimage_param
-               {
-                       meanfile:"Image_Mean_File_Path"
-               }
-       }
-       sublayers
-       {
-               name:"label"
-               type:"kLabel"
-               srclayers:"data"
-       }
-       exclude:kTrain|kValidation|kTest|kPositive|kNegative
+    type: kSoftmaxLoss
+    softmaxloss_conf {
+      topk: int
+    }
+
+The configuration field `topk` selects the labels with the `topk` largest
+probabilities as the prediction results, since it is tedious for users to view
+the prediction probability of every label.
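Concretely, for an output (logit) vector $x$ and ground-truth label $t$, the combined transformation can be written as:

```latex
p_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}}, \qquad L = -\log p_t
```

Computing the Softmax and the cross-entropy together in one layer is numerically more stable than chaining a separate Softmax layer and a cross-entropy loss.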
+
+#### Other Layers
+
+##### ConcateLayer
+
+[ConcateLayer](http://singa.incubator.apache.org/api/classsinga_1_1ConcateLayer.html)
 connects to more than one source layer to concatenate their feature
+blobs along a given dimension.
+
+    type: kConcate
+    concate_conf {
+      concate_dim: int  // define the dimension
+    }
+
+##### SliceLayer
+
+[SliceLayer](http://singa.incubator.apache.org/api/classsinga_1_1SliceLayer.html)
 connects to more than one destination layer to slice its feature
+blob along a given dimension.
+
+    type: kSlice
+    slice_conf {
+      slice_dim: int
+    }
+
+##### SplitLayer
+
+[SplitLayer](http://singa.incubator.apache.org/api/classsinga_1_1SplitLayer.html)
 connects to more than one destination layer to replicate its
+feature blob.
+
+    type: kSplit
+    split_conf {
+      num_splits: int
+    }
+
+##### BridgeSrcLayer & BridgeDstLayer
+
+[BridgeSrcLayer](http://singa.incubator.apache.org/api/classsinga_1_1BridgeSrcLayer.html)
 & 
[BridgeDstLayer](http://singa.incubator.apache.org/api/classsinga_1_1BridgeDstLayer.html)
 are utility layers that assist transferring data (e.g., features or
+gradients) across partitions when the neural net is partitioned. These two layers are
+added implicitly. Users typically do not need to configure them in their neural
+net configuration.
+
+
+## Advanced user guide
+
+The base Layer class is introduced in this section, followed by how to
+implement a new Layer subclass.
+
+### Base Layer class
+
+#### Members
+
+    LayerProto layer_proto_;
+    Blob<float> data_, grad_;
+    vector<Layer*> srclayers_, dstlayers_;
+
+The base layer class keeps the user configuration in `layer_proto_`. Source
+layers and destination layers are stored in `srclayers_` and `dstlayers_`, 
respectively.
+Almost all layers have $b$ (mini-batch size) feature vectors, which are stored
+in the `data_` 
[Blob](http://singa.incubator.apache.org/api/classsinga_1_1Blob.html) (A Blob 
is a chunk of memory space, proposed in
+[Caffe](http://caffe.berkeleyvision.org/)).
+There are layers without feature vectors; instead, they use other
+layers' feature vectors. In this case, the `data_` field is not used.
+The `grad_` Blob is for storing the gradients of the
+objective loss w.r.t. the `data_` Blob. It is necessary in [BP 
algorithm](http://singa.incubator.apache.org/api/classsinga_1_1BPWorker.html),
+hence we put it as a member of the base class. For [CD 
algorithm](http://singa.incubator.apache.org/api/classsinga_1_1CDWorker.html), 
the `grad_`
+field is not used; instead, a layer in an RBM may have a Blob for the positive
+phase feature and a Blob for the negative phase feature. For a recurrent layer
+in RNN, the feature blob contains one vector per internal layer.
+
+If a layer has parameters, these parameters are declared using type
+[Param](http://singa.incubator.apache.org/docs/param). Since some layers do 
not have
+parameters, we do not declare any `Param` in the base layer class.
+
+#### Functions
+
+    virtual void Setup(const LayerProto& proto, int npartitions = 1);
+    virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
+    virtual void ComputeGradient(Phase phase) = 0;
+
+The `Setup` function reads the user configuration, i.e., `proto`, and information
+from source layers, e.g., mini-batch size,  to set the
+shape of the `data_` (and `grad_`) field as well
+as some other layer specific fields. If `npartitions` is larger than 1, then
+users need to reduce the sizes of `data_`, `grad_` Blobs or Param objects. For
+example, if the `partition_dim=0` and there is no source layer, e.g., this
+layer is a (bottom) data layer, then its `data_` and `grad_` Blob should have
+`b/npartitions` feature vectors; If the source layer is also partitioned on
+dimension 0, then this layer should have the same number of feature vectors as
+the source layer. More complex partition cases are discussed in
+[Neural net 
partitioning](http://singa.incubator.apache.org/docs/neural-net/#neural-net-partitioning).
 Typically, the
+Setup function just sets the shapes of `data_` Blobs and Param objects. Memory
+will not be allocated until computation over the data structure happens.
+
+The `ComputeFeature` function evaluates the feature blob by transforming (e.g.
+convolution and pooling) features from the source layers.  `ComputeGradient`
+computes the gradients of parameters associated with this layer.  These two
+functions are invoked by the 
[TrainOneBatch](http://singa.incubator.apache.org/docs/train-one-batch)
+function during training. Hence, they should be consistent with the
+`TrainOneBatch` function. Particularly, for feed-forward and RNN models, they 
are
+trained using [BP 
algorithm](http://singa.incubator.apache.org/docs/train-one-batch/#back-propagation),
+which requires each layer's `ComputeFeature`
+function to compute `data_` based on source layers, and requires each layer's
+`ComputeGradient` to compute gradients of parameters and source layers'
+`grad_`. For energy models, e.g., RBM, they are trained by
+[CD 
algorithm](http://singa.incubator.apache.org/docs/train-one-batch/#contrastive-divergence),
 which
+requires each layer's `ComputeFeature` function to compute the feature vectors
+for the positive phase or negative phase depending on the `phase` argument, and
+requires the `ComputeGradient` function to only compute parameter gradients.
+Some layers, e.g., loss or output layers, can put the loss or
+prediction result into the `metric` argument, which will be averaged and
+displayed periodically.
+
+### Implementing a new Layer subclass
+
+Users can extend the base layer class to implement their own feature
+transformation logic, as long as the two virtual functions are overridden to be
+consistent with
+the `TrainOneBatch` function. The `Setup` function may also be overridden to
+read specific layer configuration.
+
+#### Layer specific protocol message
+
+To implement a new layer, the first step is to define the layer specific
+configuration. Suppose the new layer is `FooLayer`, the layer specific
+google protocol message `FooLayerProto` should be defined as
+
+    # in user.proto
+    package singa;
+    import "job.proto";
+    message FooLayerProto {
+      optional int32 a = 1;  // specific fields to the FooLayer
+    }
+
+In addition, users need to extend the original `LayerProto` (defined in 
job.proto of SINGA)
+to include the `foo_conf` as follows.
+
+    extend LayerProto {
+      optional FooLayerProto foo_conf = 101;  // unique field id, reserved for 
extensions
     }
 
-### Slice Layer    
-The Slice layer is a utility layer that slices an input layer to multiple 
output layers along a given dimension (currently num or channel only) with 
given slice indices.
+If there are multiple new layers, then each layer that has specific
+configuration would have its own `<type>_conf` field, each taking one unique
+extension number. SINGA has reserved enough extension numbers, from 101 to 1000.
+
+    # job.proto of SINGA
+    message LayerProto {
+      ...
+      extensions 101 to 1000;
+    }
+
+With user.proto defined, users can use
+[protoc](https://developers.google.com/protocol-buffers/) to generate the 
`user.pb.cc`
+and `user.pb.h` files.  In users' code, the extension fields can be accessed 
via,
+
+    auto conf = layer_proto_.GetExtension(foo_conf);
+    int a = conf.a();
+
+When defining configurations of the new layer (in job.conf), users should use
+`user_type` for its layer type instead of `type`. In addition, `foo_conf`
+should be enclosed in brackets.
+
+    layer {
+      name: "foo"
+      user_type: "kFooLayer"  # Note user_type of user-defined layers is string
+      [singa.foo_conf] {      # Note there is a pair of [] for extension fields
+        a: 10
+      }
+    }
+
+#### New Layer subclass declaration
+
+The new layer subclass can be implemented like the built-in layer subclasses.
+
+    class FooLayer : public Layer {
+     public:
+      void Setup(const LayerProto& proto, int npartitions = 1) override;
+      void ComputeFeature(Phase phase, Metric* perf) override;
+      void ComputeGradient(Phase phase) override;
+
+     private:
+      //  members
+    };
+
+Users must override the two virtual functions to be called by the
+`TrainOneBatch` for either BP or CD algorithm. Typically, the `Setup` function
+will also be overridden to initialize some members. The user configured fields
+can be accessed through `layer_proto_` as shown in the above paragraphs.
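To make the override pattern concrete, here is a toy, self-contained sketch. Note that `Layer`, `LayerProto`, `Phase`, and `Metric` below are simplified stand-ins for SINGA's real types, and the computations are placeholders:

```cpp
#include <cassert>

// Simplified stand-ins for SINGA types, for illustration only.
struct LayerProto { int a = 0; };
enum Phase { kTrain, kTest };
struct Metric {};

class Layer {
 public:
  virtual ~Layer() {}
  virtual void Setup(const LayerProto& proto, int npartitions = 1) {
    proto_ = proto;  // keep the user configuration, like layer_proto_
  }
  virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
  virtual void ComputeGradient(Phase phase) = 0;

 protected:
  LayerProto proto_;
};

class FooLayer : public Layer {
 public:
  void Setup(const LayerProto& proto, int npartitions = 1) override {
    Layer::Setup(proto, npartitions);
    a_ = proto_.a;  // read the layer-specific configuration field
  }
  void ComputeFeature(Phase phase, Metric* perf) override {
    feature_ = a_ * 2;  // placeholder for the real feature transformation
  }
  void ComputeGradient(Phase phase) override {
    grad_ = 1;  // placeholder for the real gradient computation
  }
  int feature() const { return feature_; }

 private:
  int a_ = 0, feature_ = 0, grad_ = 0;
};
```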
+
+#### New Layer subclass registration
+
+The newly defined layer should be registered in 
[main.cc](http://singa.incubator.apache.org/docs/programming-guide) by adding
+
+    driver.RegisterLayer<FooLayer>("kFooLayer"); // "kFooLayer" must match the layer type in job.conf
 
-### Split Layer  
-The Split Layer can seperate the input blob into several output blobs. It is 
used to the situation which one input blob should be input to several other 
output blobs.
\ No newline at end of file
+After that, the
+[NeuralNet](http://singa.incubator.apache.org/docs/neural-net) can create 
instances of the new Layer subclass.

