Modified: websites/staging/singa/trunk/content/docs/layer.html ============================================================================== --- websites/staging/singa/trunk/content/docs/layer.html (original) +++ websites/staging/singa/trunk/content/docs/layer.html Wed Sep 2 10:31:57 2015 @@ -1,15 +1,15 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2015-08-17 + | Generated by Apache Maven Doxia at 2015-09-02 | Rendered using Apache Maven Fluido Skin 1.4 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20150817" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> <meta http-equiv="Content-Language" content="en" /> - <title>Apache SINGA – Layers Instruction</title> + <title>Apache SINGA – Layers</title> <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> <link rel="stylesheet" href="../css/site.css" /> <link rel="stylesheet" href="../css/print.css" media="print" /> @@ -189,7 +189,7 @@ Apache SINGA</a> <span class="divider">/</span> </li> - <li class="active ">Layers Instruction</li> + <li class="active ">Layers</li> @@ -423,331 +423,395 @@ <div id="bodyColumn" class="span10" > - <h1>Layers Instruction</h1> + <h1>Layers</h1> +<p>Layer is a core abstraction in SINGA. It performs a variety of feature transformations for extracting high-level features, e.g., loading raw features, parsing RGB values, doing convolution transformation, etc.</p> +<p>The <i>Basic user guide</i> section introduces the configuration of a built-in layer. 
<i>Advanced user guide</i> explains how to extend the base Layer class to implement users’ functions.</p> +<div class="section"> +<h2><a name="Basic_user_guide"></a>Basic user guide</h2> +<div class="section"> +<h3><a name="Layer_configuration"></a>Layer configuration</h3> +<p>The configurations of three layers from the <a class="externalLink" href="http://singa.incubator.apache.org/docs/mlp">MLP example</a> are shown below,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer { + name: "data" + type: kShardData + sharddata_conf { } + exclude: kTest + partition_dim : 0 +} +layer{ + name: "mnist" + type: kMnist + srclayers: "data" + mnist_conf { } +} +layer{ + name: "fc1" + type: kInnerProduct + srclayers: "mnist" + innerproduct_conf{ } + param{ } + param{ } +} +</pre></div></div> +<p>There are some common fields for all kinds of layers:</p> + +<ul> + +<li><tt>name</tt>: a string used to differentiate two layers.</li> + +<li><tt>type</tt>: an integer used for identifying a Layer subclass. The types of built-in layers are listed in LayerType (defined in job.proto). For user-defined layer subclasses, <tt>user_type</tt> of string should be used instead of <tt>type</tt>. The details are explained in the <a href="#newlayer">last section</a> of this page.</li> + +<li><tt>srclayers</tt>: one or more layer names, for identifying the source layers. In SINGA, all connections are <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net">converted</a> to directed connections.</li> + +<li><tt>exclude</tt>: an enumerated value of type <a href="">Phase</a>, can be {kTest, kValidation, kTrain}. It is used to filter this layer when creating the <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net">NeuralNet</a> for the excluded phase.
E.g., the “data” layer would be filtered when creating the NeuralNet instance for the test phase.</li> + +<li><tt>param</tt>: configuration for a <a class="externalLink" href="http://singa.incubator.apache.org/docs/param">Param</a> instance. There can be multiple Param objects in one layer.</li> + +<li><tt>partition_dim</tt>: integer value indicating the partition dimension of this layer. -1 (the default value) for no partitioning, 0 for partitioning on batch dimension, 1 for partitioning on feature dimension. It is used by <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net">CreateGraph</a> for partitioning the neural net.</li> +</ul> +<p>Different layers may have different configurations. These configurations are defined in <tt><type>_conf</tt>. E.g., the “data” layer has <tt>sharddata_conf</tt> and the “fc1” layer has <tt>innerproduct_conf</tt>. The subsequent sections explain the functionality of each built-in layer and how to configure it,</p></div> <div class="section"> +<h3><a name="Built-in_Layer_subclasses"></a>Built-in Layer subclasses</h3> +<p>SINGA provides many built-in layers, which can be used directly to create neural nets. These layers are categorized according to their functionalities,</p> + +<ul> + +<li>Data layers for loading records (e.g., images) from disk, HDFS or network into memory.</li> + +<li>Parser layers for parsing features, labels, etc.
from records, into <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1Blob.html">Blob</a>.</li> + +<li>Neuron layers for feature transformation, e.g., <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1ConvolutionLayer.html">convolution</a>, <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1PoolingLayer.html">pooling</a>, dropout, etc.</li> + +<li>Loss layers for measuring the training objective loss, e.g., [cross entropy-loss] or [Euclidean loss].</li> + +<li>Output layers for outputting the prediction results (e.g., probabilities of each category) onto disk or network.</li> + +<li>Connection layers for connecting layers when the neural net is partitioned.</li> +</ul> +<div class="section"> +<h4><a name="Data_Layers"></a>Data Layers</h4> +<p>Data layers load training/testing data and convert them into <a class="externalLink" href="http://singa.incubator.apache.org/docs/data">Record</a>s, which are parsed by parser layers. The data source can be disk file, HDFS, database or network.</p> <div class="section"> -<h3><a name="ShardData_Layer"></a>ShardData Layer</h3> -<p>ShardData layer is used to read data from disk etc.</p> +<h5><a name="ShardDataLayer"></a>ShardDataLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1ShardDataLayer.html">ShardDataLayer</a> is used to read data from disk file. The file should be created using <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1DataShard.html">DataShard</a> class. 
With the data file prepared, users configure the layer as</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"data" - type:"kShardData" - data_param - { - path:"Shard_File_Path" - batchsize:int - } - exclude:kTrain|kValidation|kTest|kPositive|kNegative +<div class="source"><pre class="prettyprint">type: kShardData +sharddata_conf { + path: "path to data shard folder" + batchsize: int + random_skip: int } -</pre></div></div></div> +</pre></div></div> +<p><tt>batchsize</tt> specifies the number of records in one mini-batch. The first <tt>rand() % random_skip</tt> <tt>Record</tt>s will be skipped at the first iteration. This enforces that different workers work on different Records.</p></div> <div class="section"> -<h3><a name="Label_Layer"></a>Label Layer</h3> -<p>Label layer is used to extract the label information from training data. The label information will be used in the loss layer to calculate the gradient.</p> +<h5><a name="LMDBDataLayer"></a>LMDBDataLayer</h5> +<p>LMDBDataLayer is similar to ShardDataLayer, except that the Records are loaded from LMDB.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"label" - type:"kLabel" - srclayers:"data" +<div class="source"><pre class="prettyprint">type: kLMDBData +lmdbdata_conf { + path: "path to LMDB folder" + batchsize: int + random_skip: int +} +</pre></div></div></div></div> +<div class="section"> +<h4><a name="Parser_Layers"></a>Parser Layers</h4> +<p>Parser layers get a vector of Records from data layers and parse features into a Blob.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">virtual void ParseRecords(Phase phase, const vector<Record>& records, Blob<float>* blob) = 0; +</pre></div></div> +<div class="section"> +<h5><a name="LabelLayer"></a>LabelLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1LabelLayer.html">LabelLayer</a> is used to
parse a single label from each Record. Consequently, it will put $b$ (mini-batch size) values into the Blob. It has no specific configuration fields.</p></div> +<div class="section"> +<h5><a name="MnistImageLayer"></a>MnistImageLayer</h5> +<p>MnistImageLayer parses the pixel values of each image from the MNIST dataset. The pixel values may be normalized as <tt>x/norm_a - norm_b</tt>. For example, if <tt>norm_a</tt> is set to 255 and <tt>norm_b</tt> is set to 0, then every pixel will be normalized into [0, 1].</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">type: kMnistImage +mnistimage_conf { + norm_a: float + norm_b: float } </pre></div></div></div> <div class="section"> -<h3><a name="Convolution_Layer"></a>Convolution Layer</h3> -<p>Convolution layer is a basic layer used in constitutional neural net. It is used to extract local feature following some local patterns from slide windows in the image.</p> +<h5><a name="RGBImageLayer"></a>RGBImageLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1RGBImageLayer.html">RGBImageLayer</a> parses the RGB values of one image from each Record. It may also apply transformations, e.g., cropping and mirroring.
If the <tt>meanfile</tt> is specified, it should point to a path that contains one Record for the mean of each pixel over all training images.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"Conv_Number" - type:"kConvolution" - srclayers:"Src_Layer_Name" - convolution_param - { - num_filters:int - //the count of the applied filters - kernel:int - //convolution kernel size - stride:int - //the distance between the successive filters - pad:int - //pad the images with a given int number of pixels border of zeros - } - param - { - name:"weight" - init_method:kGaussian|kConstant:kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut - /*use specific param of each init methods*/ - learning_rate_multiplier:float - } - param - { - name:"bias" - init_method:kConstant|kGaussian|kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut - /**use specific param of each init methods**/ - learning_rate_multiplier:float - } - //kGaussian: sample Gaussian with std and mean - //kUniform: uniform sampling between low and high - //kPretrained: from Toronto Convnet, let a=1/sqrt(fan_in),w*=a after generating from Gaussian distribution - //kGaussianSqrtFanIn: from Toronto Convnet, rectified linear activation, - //let a=sqrt(3)/sqrt(fan_in),range is [-a,+a]. - //no need to set value=sqrt(3),the program will multiply it - //kUniformSqrtFanIn: from Theano MLP tutorial, let a=1/sqrt(fan_in+fan_out). - //for tanh activation, range is [-6a,+6a], for sigmoid activation. 
- // range is [-24a,+24a],put the scale factor to value field - //For Constant Init, use value:float - //For Gaussian Init, use mean:float, std:float - //For Uniform Init, use low:float, high:float -} -</pre></div></div> -<p>Input:n * c_i * h_i * w_i</p> -<p>Output:n * c_o * h_o * w_o,h_o = (h_i + 2 * pad_h - kernel_h) /stride_h + 1</p></div> -<div class="section"> -<h3><a name="Dropout_Layer"></a>Dropout Layer</h3> -<p>Dropout Layer is a layer that randomly dropout some inputs. This scheme helps deep learning model away from over-fitting.</p></div> -<div class="section"> -<h3><a name="InnerProduct_Layer"></a>InnerProduct Layer</h3> -<p>InnerProduct Layer is a fully connected layer which is the basic element in feed forward neural network. It will use the lower layer as a input vector V and output a vector H by doing the following matrix-vector multiplication:</p> -<p>H = W*V + B // W and B are its weight and bias parameter</p> - -<div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"IP_Number" - type:"kInnerProduct" - srclayers:"Src_Layer_Name" - inner_product_param - { - num_output:int - //The number of the filters - } - param - { - name:"weight" - init_method:kGaussian|kConstant:kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut - std:float - // - learning_rate_multiplier:float - // - weight_decay_multiplier:int - // - /*low:float,high:float*/ - // - } - param - { - name:"bias" - init_method:kConstant|kGaussian|kUniform|kPretrained|kGaussianSqrtFanIn|kUniformSqrtFanIn|kUniformSqrtFanInOut - learning_rate_mulitiplier:float - // - weight_decay_multiplier:int - // - value:int - // - /*low:float,high:float*/ - // - } +<div class="source"><pre class="prettyprint">type: kRGBImage +rgbimage_conf { + scale: float + cropsize: int # cropping each image to keep the central part with this size + mirror: bool # mirror the image by set image[i,j]=image[i,len-j] + meanfile: "Image_Mean_File_Path" } +</pre></div></div> 
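The cropping, mirroring and mean-subtraction steps described by <tt>rgbimage_conf</tt> can be sketched in standalone C++. This is a hypothetical single-channel illustration, not SINGA's actual implementation; the function name and layout are assumptions for the sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of the per-image transform that rgbimage_conf describes:
// keep the central cropsize x cropsize part, optionally mirror it
// (image[i,j] = image[i, len-j]), subtract the per-pixel mean, and scale.
// A single channel is used for brevity; SINGA's real layer handles 3 channels.
std::vector<float> TransformImage(const std::vector<float>& img, int side,
                                  const std::vector<float>& mean, int cropsize,
                                  bool mirror, float scale) {
  int offset = (side - cropsize) / 2;  // keep the central part
  std::vector<float> out(cropsize * cropsize);
  for (int i = 0; i < cropsize; ++i) {
    for (int j = 0; j < cropsize; ++j) {
      int src_j = mirror ? (cropsize - 1 - j) : j;       // horizontal mirror
      float v = img[(i + offset) * side + (src_j + offset)];
      v -= mean[(i + offset) * side + (src_j + offset)]; // subtract pixel mean
      out[i * cropsize + j] = v * scale;
    }
  }
  return out;
}
```

For a 3x3 image with a zero mean file, `cropsize: 1` keeps only the central pixel, and `scale` multiplies the mean-subtracted value.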
+<p>{% comment %}</p></div></div> +<div class="section"> +<h4><a name="PrefetchLayer"></a>PrefetchLayer</h4> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1PrefetchLayer.html">PrefetchLayer</a> embeds data layers and parser layers to do data prefetching. It will launch a thread to call the data layers and parser layers to load and extract features. It ensures that the I/O task and computation task can work simultaneously. One example PrefetchLayer configuration is,</p> -Input:n * c_i * h_i * w_i -Output:n * c_o * 1 *1 -</pre></div></div></div> +<div class="source"> +<div class="source"><pre class="prettyprint">layer { + name: "prefetch" + type: kPrefetch + sublayers { + name: "data" + type: kShardData + sharddata_conf { } + } + sublayers { + name: "rgb" + type: kRGBImage + srclayers:"data" + rgbimage_conf { } + } + sublayers { + name: "label" + type: kLabel + srclayers: "data" + } + exclude:kTest +} +</pre></div></div> +<p>The layers on top of the PrefetchLayer should use the names of the embedded layers as their source layers.
For example, the “rgb” and “label” should be configured to the <tt>srclayers</tt> of other layers.</p> +<p>{% endcomment %}</p></div> +<div class="section"> +<h4><a name="Neuron_Layers"></a>Neuron Layers</h4> +<p>Neuron layers conduct feature transformations.</p> <div class="section"> -<h3><a name="LMDBData_Layer"></a>LMDBData Layer</h3> -<p>This is a data input layer, the data will be provided by the LMDB.</p> +<h5><a name="ConvolutionLayer"></a>ConvolutionLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1ConvolutionLayer.html">ConvolutionLayer</a> conducts convolution transformation.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"data" - type:"kLMDBDate" - data_param - { - path:"LMDB_FILE_PATH" - batchsize:int - //batchsize means the quantity of the input disposable - } - exclude:kTrain|kValidation|kTest|kPositive|kNegative +<div class="source"><pre class="prettyprint">type: kConvolution +convolution_conf { + num_filters: int + kernel: int + stride: int + pad: int } -</pre></div></div></div> +param { } # weight/filter matrix +param { } # bias vector +</pre></div></div> +<p>The int value <tt>num_filters</tt> stands for the count of the applied filters; the int value <tt>kernel</tt> stands for the convolution kernel size (equal width and height); the int value <tt>stride</tt> stands for the distance between the successive filters; the int value <tt>pad</tt> pads each with a given int number of pixels border of zeros.</p></div> <div class="section"> -<h3><a name="LRN_Layer"></a>LRN Layer</h3> -<p>Local Response Normalization normalizes over the local input areas. It provides two modes: WITHIN_CHANNEL and ACROSS_CHANNELS. The local response normalization layer performs a kind of “lateral inhibition” by normalizing over local input regions. 
In ACROSS_CHANNELS mode, the local regions extend across nearby channels, but have no spatial extent (i.e., they have shape local_size x 1 x 1). In WITHIN_CHANNEL mode, the local regions extend spatially, but are in separate channels (i.e., they have shape 1 x local_size x local_size). Each input value is divided by ">http://i.imgur.com/GgTjjtR.png)</a>, where n is the size of each local region, and the sum is taken over the region centered at that value (zero padding is added where necessary).</p> +<h5><a name="InnerProductLayer"></a>InnerProductLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1InnerProductLayer.html">InnerProductLayer</a> is fully connected with its (single) source layer. Typically, it has two parameter fields, one for weight matrix, and the other for bias vector. It rotates the feature of the source layer (by multiplying with weight matrix) and shifts it (by adding the bias vector).</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"Norm_Number" - type:"kLRN" - lrn_param - { - norm_region:WITHIN_CHANNEL|ACROSS_CHANNELS - local_size:int - //for WITHIN_CHANNEL, it means the side length of the space region which will be summed up - //for ACROSS_CHANNELS, it means the quantity of the adjoining channels which will be summed up - alpha:5e-05 - beta:float - } - srclayers:"Src_Layer_Name" +<div class="source"><pre class="prettyprint">type: kInnerProduct +innerproduct_conf { + num_output: int } +param { } # weight matrix +param { } # bias vector </pre></div></div></div> <div class="section"> -<h3><a name="MnistImage_Layer"></a>MnistImage Layer</h3> -<p>MnistImage is a pre-processing layer for MNIST dataset.</p> +<h5><a name="PoolingLayer"></a>PoolingLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1PoolingLayer.html">PoolingLayer</a> is used to do a normalization (or averaging or sampling) of the feature vectors from the 
source layer.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"mnist" - type:"kMnistImage" - srclayers:"data" - mnist_param - { - sigma:int - alpha:int - gamma:int - kernel:int - elastic_freq:int - beta:int - resize:int - norm_a:int - } +<div class="source"><pre class="prettyprint">type: kPooling +pooling_conf { + pool: AVE|MAX // choose Average Pooling or Max Pooling + kernel: int // size of the kernel filter + pad: int // the padding size + stride: int // the step length of the filter } -</pre></div></div></div> +</pre></div></div> +<p>The pooling layer has two methods: Average Pooling and Max Pooling. Use the enum values AVE and MAX to choose the method.</p> + +<ul> + +<li>Max Pooling selects the max value of each filtering area as a point of the result feature blob.</li> + +<li>Average Pooling averages all values of each filtering area as a point of the result feature blob.</li> +</ul></div> +<div class="section"> +<h5><a name="ReLULayer"></a>ReLULayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1ReLULayer.html">ReLULayer</a> has rectified linear neurons, which conduct the following transformation, <tt>f(x) = Max(0, x)</tt>. It has no specific configuration fields.</p></div> <div class="section"> -<h3><a name="Pooling_Layer"></a>Pooling Layer</h3> -<p>Max Pooling uses a specific scanning window to find the max value.<br />Average Pooling scans all the values in the window to calculate the average value.</p> +<h5><a name="TanhLayer"></a>TanhLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1TanhLayer.html">TanhLayer</a> uses tanh as its activation function, i.e., <tt>f(x)=tanh(x)</tt>. It has no specific configuration fields.</p></div> +<div class="section"> +<h5><a name="SigmoidLayer"></a>SigmoidLayer</h5> +<p>SigmoidLayer uses the sigmoid (or logistic) function as its activation function, i.e., <tt>f(x)=sigmoid(x)</tt>.
It has no specific configuration fields.</p></div> +<div class="section"> +<h5><a name="Dropout_Layer"></a>Dropout Layer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1DropoutLayer.html">DropoutLayer</a> is a layer that randomly drops some of its inputs. This scheme helps prevent deep learning models from over-fitting.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"Pool_Number" - type:"kPooling" - srclayers:"Src_Layer_Name" - pooling_param - { - pool:AVE|MAX - //Choose whether use the Average Pooling or Max Pooling - kernel:int - //size of the kernel filter - stride:int - //the step length of the filter - } +<div class="source"><pre class="prettyprint">type: kDropout +dropout_conf { + dropout_ratio: float # dropout probability } </pre></div></div></div> <div class="section"> -<h3><a name="ReLU_Layer"></a>ReLU Layer</h3> -<p>The rectifier function is an activation function f(x) = Max(0, x) which can be used by neurons just like any other activation function, a node using the rectifier activation function is called a ReLu node. The main reason that it is used is because of how efficiently it can be computed compared to more conventional activation functions like the sigmoid and hyperbolic tangent, without making a significant difference to generalization accuracy.
The rectifier activation function is used instead of a linear activation function to add non linearity to the network, otherwise the network would only ever be able to compute a linear function.</p> +<h5><a name="LRNLayer"></a>LRNLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1LRNLayer.html">LRNLayer</a> (Local Response Normalization) normalizes over the channels.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"Relu_Number" - type:"kReLU" - srclayers:"Src_Layer_Name" +<div class="source"><pre class="prettyprint">type: kLRN +lrn_conf { + local_size: int + alpha: float // scaling parameter + beta: float // exponential number } -</pre></div></div></div> +</pre></div></div> +<p><tt>local_size</tt> specifies the number of adjoining channels that will be summed up. {% comment %} For <tt>WITHIN_CHANNEL</tt>, it means the side length of the space region which will be summed up. {% endcomment %}</p></div></div> +<div class="section"> +<h4><a name="Loss_Layers"></a>Loss Layers</h4> +<p>Loss layers measure the objective training loss.</p> <div class="section"> -<h3><a name="RGBImage_Layer"></a>RGBImage Layer</h3> -<p>RGBImage layer is a pre-processing layer for RGB format images. </p> +<h5><a name="SoftmaxLossLayer"></a>SoftmaxLossLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1SoftmaxLossLayer.html">SoftmaxLossLayer</a> is a combination of the Softmax transformation and the Cross-Entropy loss. It first applies Softmax to get a prediction probability for each output unit (neuron) and then computes the cross-entropy against the ground truth.
It is generally used as the final layer to generate labels for classification tasks.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"rgb" - type:"kRGBImage" - srclayers:"data" - rgbimage_param - { - meanfile:"Image_Mean_File_Path" - } +<div class="source"><pre class="prettyprint">type: kSoftmaxLoss +softmaxloss_conf { + topk: int +} +</pre></div></div> +<p>The configuration field <tt>topk</tt> selects the labels with the <tt>topk</tt> largest probabilities as the prediction results, since it is tedious for users to view the prediction probability of every label.</p></div></div> +<div class="section"> +<h4><a name="Other_Layers"></a>Other Layers</h4> +<div class="section"> +<h5><a name="ConcateLayer"></a>ConcateLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1ConcateLayer.html">ConcateLayer</a> connects more than one source layer to concatenate their feature blobs along a given dimension.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">type: kConcate +concate_conf { + concate_dim: int // define the dimension } </pre></div></div></div> <div class="section"> -<h3><a name="Tanh_Layer"></a>Tanh Layer</h3> -<p>Tanh uses the tanh as activation function. It transforms the input into range [-1, 1] using Tanh function.
</p> +<h5><a name="SliceLayer"></a>SliceLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1SliceLayer.html">SliceLayer</a> connects to more than one destination layer to slice its feature blob along a given dimension.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"Tanh_Number" - type:"kTanh" - srclayer:"Src_Layer_Name" +<div class="source"><pre class="prettyprint">type: kSlice +slice_conf { + slice_dim: int } </pre></div></div></div> <div class="section"> -<h3><a name="SoftmaxLoss_Layer"></a>SoftmaxLoss Layer</h3> -<p>Softmax Loss Layer is the implementation of multi-class softmax loss function. It is generally used as the final layer to generate labels for classification tasks.</p> +<h5><a name="SplitLayer"></a>SplitLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1SplitLayer.html">SplitLayer</a> connects to more than one destination layer to replicate its feature blob.</p> <div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"loss" - type:"kSoftmaxLoss" - softmaxloss_param - { - topk:int - } - srclayers:"Src_Layer_Name" - srclayers:"Src_Layer_Name" +<div class="source"><pre class="prettyprint">type: kSplit +split_conf { + num_splits: int } </pre></div></div></div> <div class="section"> -<h3><a name="BridgeSrc__BridgeDst_Layer"></a>BridgeSrc & BridgeDst Layer</h3> -<p>BridgeSrc & BridgeDst Layer are utility layers implementing logics of model partition.
It can be used as a lock for synchronization, a transformation storage of different type of model partition and etc.</p></div> +<h5><a name="BridgeSrcLayer__BridgeDstLayer"></a>BridgeSrcLayer & BridgeDstLayer</h5> +<p><a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1BridgeSrcLayer.html">BridgeSrcLayer</a> & <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1BridgeDstLayer.html">BridgeDstLayer</a> are utility layers assisting data (e.g., feature or gradient) transferring due to neural net partitioning. These two layers are added implicitly. Users typically do not need to configure them in their neural net configuration.</p></div></div></div></div> <div class="section"> -<h3><a name="Concate_Layer"></a>Concate Layer</h3> -<p>Concat Layer is used to concatenate the last dimension (namely, num_feature) of the output of two nodes. It is usually used along with fully connected layer.</p></div> +<h2><a name="Advanced_user_guide"></a>Advanced user guide</h2> +<p>The base Layer class is introduced in this section, followed by how to implement a new Layer subclass.</p> <div class="section"> -<h3><a name="Parser_Layer"></a>Parser Layer</h3> -<p>Parser Layer will parse the input records into Blobs. </p></div> +<h3><a name="Base_Layer_class"></a>Base Layer class</h3> <div class="section"> -<h3><a name="Prefetch_Layer"></a>Prefetch Layer</h3> -<p>Prefetch Layer is used to pre-fetch data from disk. It ensures that the I/O task and computation/communication task can work simultaneously. 
</p> - -<div class="source"> -<div class="source"><pre class="prettyprint">layer -{ - name:"prefetch" - type:"kPrefetch" - sublayers - { - name:"data" - type:"kShardData" - data_param - { - path:"Shard_File_Path" - batchsize:int - } - } - sublayers - { - name:"rgb" - type:"kRGBImage" - srclayers:"data" - rgbimage_param - { - meanfile:"Image_Mean_File_Path" - } - } - sublayers - { - name:"label" - type:"kLabel" - srclayers:"data" - } - exclude:kTrain|kValidation|kTest|kPositive|kNegative +<h4><a name="Members"></a>Members</h4> + +<div class="source"> +<div class="source"><pre class="prettyprint">LayerProto layer_proto_; +Blob<float> data_, grad_; +vector<Layer*> srclayers_, dstlayers_; +</pre></div></div> +<p>The base layer class keeps the user configuration in <tt>layer_proto_</tt>. Source layers and destination layers are stored in <tt>srclayers_</tt> and <tt>dstlayers_</tt>, respectively. Almost all layers have $b$ (mini-batch size) feature vectors, which are stored in the <tt>data_</tt> <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1Blob.html">Blob</a> (a Blob is a chunk of memory space, proposed in <a class="externalLink" href="http://caffe.berkeleyvision.org/">Caffe</a>). There are layers without feature vectors; instead, they use other layers’ feature vectors. In this case, the <tt>data_</tt> field is not used. The <tt>grad_</tt> Blob is for storing the gradients of the objective loss w.r.t. the <tt>data_</tt> Blob. It is necessary in the <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1BPWorker.html">BP algorithm</a>, hence we put it as a member of the base class. For the <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1CDWorker.html">CD algorithm</a>, the <tt>grad_</tt> field is not used; instead, the layer from RBM may have a Blob for the positive phase feature and a Blob for the negative phase feature.
For a recurrent layer in RNN, the feature blob contains one vector per internal layer.</p> +<p>If a layer has parameters, these parameters are declared using type <a class="externalLink" href="http://singa.incubator.apache.org/docs/param">Param</a>. Since some layers do not have parameters, we do not declare any <tt>Param</tt> in the base layer class.</p></div> +<div class="section"> +<h4><a name="Functions"></a>Functions</h4> + +<div class="source"> +<div class="source"><pre class="prettyprint">virtual void Setup(const LayerProto& proto, int npartitions = 1); +virtual void ComputeFeature(Phase phase, Metric* perf) = 0; +virtual void ComputeGradient(Phase phase) = 0; +</pre></div></div> +<p>The <tt>Setup</tt> function reads user configuration, i.e. <tt>proto</tt>, and information from source layers, e.g., mini-batch size, to set the shape of the <tt>data_</tt> (and <tt>grad_</tt>) field as well as some other layer specific fields. If <tt>npartitions</tt> is larger than 1, then users need to reduce the sizes of <tt>data_</tt>, <tt>grad_</tt> Blobs or Param objects. For example, if the <tt>partition_dim=0</tt> and there is no source layer, e.g., this layer is a (bottom) data layer, then its <tt>data_</tt> and <tt>grad_</tt> Blob should have <tt>b/npartitions</tt> feature vectors; If the source layer is also partitioned on dimension 0, then this layer should have the same number of feature vectors as the source layer. More complex partition cases are discussed in <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net/#neural-net-partitioning">Neural net partitioning</a>. Typically, the Setup function just set the shapes of <tt>data_</tt> Blobs and Param objects. Memory will not be allocated until computation over the data structure happens.</p> +<p>The <tt>ComputeFeature</tt> function evaluates the feature blob by transforming (e.g. convolution and pooling) features from the source layers. 
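The division of labor between these two functions under BP can be sketched with a toy scalar fully connected layer. Everything below is standalone illustrative code under assumed names; SINGA's real Layer API (Blob members, Param objects, function signatures) differs.

```cpp
#include <cassert>

// Toy standalone layer illustrating the ComputeFeature / ComputeGradient
// contract under the BP algorithm, with scalars standing in for Blobs.
struct ToyInnerProduct {
  float w = 2.0f;          // parameter (cf. a Param object)
  float src_data = 0.0f;   // the source layer's data_
  float data = 0.0f;       // this layer's data_ (feature)
  float grad = 0.0f;       // gradient of the loss w.r.t. data_
  float w_grad = 0.0f;     // parameter gradient
  float src_grad = 0.0f;   // gradient passed back to the source layer's grad_

  // Forward pass: compute data_ by transforming the source layer's feature.
  void ComputeFeature() { data = w * src_data; }

  // Backward pass: compute the parameter gradient and the source layer's
  // grad_, both from the gradient w.r.t. this layer's feature.
  void ComputeGradient() {
    w_grad = grad * src_data;  // dL/dw = dL/dy * x
    src_grad = grad * w;       // dL/dx = dL/dy * w
  }
};
```

With `w = 2` and a source feature of `3`, the forward pass yields `data = 6`; if the loss gradient w.r.t. `data` is `1`, the backward pass yields `w_grad = 3` and `src_grad = 2`.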
<tt>ComputeGradient</tt> computes the gradients of parameters associated with this layer. These two functions are invoked by the <a class="externalLink" href="http://singa.incubator.apache.org/docs/train-one-batch">TrainOneBatch</a> function during training. Hence, they should be consistent with the <tt>TrainOneBatch</tt> function. Particularly, for feed-forward and RNN models, they are trained using <a class="externalLink" href="http://singa.incubator.apache.org/docs/train-one-batch/#back-propagation">BP algorithm</a>, which requires each layer’s <tt>ComputeFeature</tt> function to compute <tt>data_</tt> based on source layers, and requires each layer’s <tt>ComputeGradient</tt> to compute gradients of parameters and source layers’ <tt>grad_</tt>. For energy models, e.g., RBM, they are trained by the <a class="externalLink" href="http://singa.incubator.apache.org/docs/train-one-batch/#contrastive-divergence">CD algorithm</a>, which requires each layer’s <tt>ComputeFeature</tt> function to compute the feature vectors for the positive phase or negative phase depending on the <tt>phase</tt> argument, and requires the <tt>ComputeGradient</tt> function to only compute parameter gradients. Some layers, e.g., loss layers or output layers, can put the loss or prediction result into the <tt>metric</tt> argument, which will be averaged and displayed periodically.</p></div></div> +<div class="section"> +<h3><a name="Implementing_a_new_Layer_subclass"></a>Implementing a new Layer subclass</h3> +<p>Users can extend the base layer class to implement their own feature transformation logics as long as the two virtual functions are overridden to be consistent with the <tt>TrainOneBatch</tt> function.
The <tt>Setup</tt> function may also be overridden to read layer-specific configuration.</p> +<div class="section"> +<h4><a name="Layer_specific_protocol_message"></a>Layer specific protocol message</h4> +<p>To implement a new layer, the first step is to define the layer-specific configuration. Suppose the new layer is <tt>FooLayer</tt>, the layer-specific Google protocol buffer message <tt>FooLayerProto</tt> should be defined as</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">// in user.proto +package singa; +import "job.proto"; +message FooLayerProto { + optional int32 a = 1; // fields specific to the FooLayer +} +</pre></div></div> +<p>In addition, users need to extend the original <tt>LayerProto</tt> (defined in job.proto of SINGA) to include the <tt>foo_conf</tt> as follows.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">extend LayerProto { + optional FooLayerProto foo_conf = 101; // unique field id, reserved for extensions +} +</pre></div></div> +<p>If there are multiple new layers, then each layer that has specific configurations should have its own <tt>&lt;type&gt;_conf</tt> field and take one unique extension number. SINGA has reserved enough extension numbers, from 101 to 1000.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">// job.proto of SINGA +message LayerProto { + ... + extensions 101 to 1000; +} +</pre></div></div> +<p>With user.proto defined, users can use <a class="externalLink" href="https://developers.google.com/protocol-buffers/">protoc</a> to generate the <tt>user.pb.cc</tt> and <tt>user.pb.h</tt> files. In users’ code, the extension fields can be accessed via,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">auto conf = layer_proto_.GetExtension(foo_conf); +int a = conf.a(); +</pre></div></div> +<p>When defining the configuration of the new layer (in job.conf), users should use <tt>user_type</tt> for its layer type instead of <tt>type</tt>.
In addition, <tt>foo_conf</tt> should be enclosed in brackets.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer { + name: "foo" + user_type: "kFooLayer" # Note user_type of user-defined layers is string + [singa.foo_conf] { # Note there is a pair of [] for extension fields + a: 10 + } } </pre></div></div></div> <div class="section"> -<h3><a name="Slice_Layer"></a>Slice Layer</h3> -<p>The Slice layer is a utility layer that slices an input layer to multiple output layers along a given dimension (currently num or channel only) with given slice indices.</p></div> +<h4><a name="New_Layer_subclass_declaration"></a>New Layer subclass declaration</h4> +<p>The new layer subclass can be implemented like the built-in layer subclasses.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">class FooLayer : public Layer { + public: + void Setup(const LayerProto& proto, int npartitions = 1) override; + void ComputeFeature(Phase phase, Metric* perf) override; + void ComputeGradient(Phase phase) override; + + private: + // members +}; +</pre></div></div> +<p>Users must override the two virtual functions to be called by the <tt>TrainOneBatch</tt> for either BP or CD algorithm. Typically, the <tt>Setup</tt> function will also be overridden to initialize some members. The user configured fields can be accessed through <tt>layer_proto_</tt> as shown in the above paragraphs.</p></div> <div class="section"> -<h3><a name="Split_Layer"></a>Split Layer</h3> -<p>The Split Layer can seperate the input blob into several output blobs. 
It is used to the situation which one input blob should be input to several other output blobs.</p></div></div> +<h4><a name="New_Layer_subclass_registration"></a>New Layer subclass registration</h4> +<p>The newly defined layer should be registered in <a class="externalLink" href="http://singa.incubator.apache.org/docs/programming-guide">main.cc</a> by adding</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">driver.RegisterLayer<FooLayer>("kFooLayer"); // "kFooLayer" should be matched to layer configurations in job.conf. +</pre></div></div> +<p>After that, the <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net">NeuralNet</a> can create instances of the new Layer subclass.</p></div></div></div> </div> </div> </div>
Modified: websites/staging/singa/trunk/content/docs/lmdb.html ============================================================================== --- websites/staging/singa/trunk/content/docs/lmdb.html (original) +++ websites/staging/singa/trunk/content/docs/lmdb.html Wed Sep 2 10:31:57 2015 @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2015-08-17 + | Generated by Apache Maven Doxia at 2015-09-02 | Rendered using Apache Maven Fluido Skin 1.4 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20150817" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> <meta http-equiv="Content-Language" content="en" /> <title>Apache SINGA – </title> <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> Modified: websites/staging/singa/trunk/content/docs/mlp.html ============================================================================== --- websites/staging/singa/trunk/content/docs/mlp.html (original) +++ websites/staging/singa/trunk/content/docs/mlp.html Wed Sep 2 10:31:57 2015 @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2015-08-17 + | Generated by Apache Maven Doxia at 2015-09-02 | Rendered using Apache Maven Fluido Skin 1.4 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20150817" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> <meta http-equiv="Content-Language" content="en" /> <title>Apache SINGA – </title> <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> @@ -21,7 +21,7 @@ <script type="text/javascript" src="../js/apache-maven-fluido-1.4.min.js"></script> - <meta name="Notice" content="Licensed to the 
Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at . http://www.apache.org/licenses/LICENSE-2.0 . Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." /> </head> + </head> <body class="topBarEnabled"> @@ -425,39 +425,28 @@ <div id="bodyColumn" class="span10" > - -<p>This example will show you how to use SINGA to train a MLP model using mnist dataset.</p> -<div class="section"> -<div class="section"> -<h3><a name="Prepare_for_the_data"></a>Prepare for the data</h3> - -<ul> - -<li>First go to the <tt>example/mnist/</tt> folder for preparing the dataset. There should be a makefile example called Makefile.example in the folder. Run the command <tt>cp Makefile.example Makefile</tt> to generate the makefile. Then run the command <tt>make download</tt> and <tt>make create</tt> in the current folder to download mnist dataset and prepare for the training and testing datashard.</li> -</ul></div> + <h1>MLP Example</h1> +<p>Multilayer perceptron (MLP) is a feed-forward artificial neural network model. An MLP typically consists of multiple directly connected layers, with each layer fully connected to the next one.
In this example, we will use SINGA to train a <a class="externalLink" href="http://arxiv.org/abs/1003.0358">simple MLP model proposed by Ciresan</a> for classifying handwritten digits from the <a class="externalLink" href="http://yann.lecun.com/exdb/mnist/">MNIST dataset</a>.</p> <div class="section"> -<h3><a name="Set_model_and_cluster_configuration."></a>Set model and cluster configuration.</h3> +<h2><a name="Running_instructions"></a>Running instructions</h2> +<p>Please refer to the <a class="externalLink" href="http://singa.incubator.apache.org/docs/installation">installation</a> page for instructions on building SINGA, and the <a class="externalLink" href="http://singa.incubator.apache.org/docs/quick-start">quick start</a> for instructions on starting zookeeper.</p> +<p>We have provided scripts for preparing the training and test dataset in <i>examples/mnist/</i>.</p> -<ul> - -<li>If you just want to use the training model provided in this example, you can just use job.conf file in current directory. Fig. 1 gives an example of MLP struture. In this example, we define a neurualnet that contains 5 hidden layer. fc+tanh is the hidden layer(fc is for the inner product part, and tanh is for the non-linear activation function), and the final softmax layer is represented as fc+loss (inner product and softmax). For each layer, we define its name, input layer(s), basic configurations (e.g. number of nodes, parameter initialization settings).
If you want to learn more about how it is configured, you can go to <a class="externalLink" href="http://singa.incubator.apache.org/docs/model-config.html">Model Configuration</a> to get details.</li> -</ul> +<div class="source"> +<div class="source"><pre class="prettyprint"># in examples/mnist +$ cp Makefile.example Makefile +$ make download +$ make create +</pre></div></div> +<p>After the datasets are prepared, we start the training by</p> -<div style="text-align: center"> -<img src="../images/mlp_example.png" style="width: 280px" alt="" /> <br />Fig. 1: MLP example </img> -</div></div> -<div class="section"> -<h3><a name="Run_SINGA"></a>Run SINGA</h3> +<div class="source"> +<div class="source"><pre class="prettyprint">./bin/singa-run.sh -conf examples/mnist/job.conf +</pre></div></div> +<p>After it is started, you should see output like</p> -<ul> - -<li> -<p>All script of SINGA should be run in the root folder of SINGA. First you need to start the zookeeper service if zookeeper is not started. The command is <tt>./bin/zk-service start</tt>. Then you can run the command <tt>./bin/singa-run.sh -conf examples/mnist/job.conf</tt> to start a SINGA job using examples/mnist/job.conf as the job configuration. 
After it is started, you should get a screenshots like the following:</p> - <div class="source"> -<div class="source"><pre class="prettyprint">xxx@yyy:zzz/incubator-singa$ ./bin/singa-run.sh -conf examples/mnist/job.conf -Unique JOB_ID is 1 -Record job information to /tmp/singa-log/job-info/job-1-20150817-055231 +<div class="source"><pre class="prettyprint">Record job information to /tmp/singa-log/job-info/job-1-20150817-055231 Executing : ./singa -conf /xxx/incubator-singa/examples/mnist/job.conf -singa_conf /xxx/incubator-singa/conf/singa.conf -singa_job 1 E0817 07:15:09.211885 34073 cluster.cc:51] proc #0 -> 192.168.5.128:49152 (pid = 34073) E0817 07:15:14.972231 34114 server.cc:36] Server (group = 0, id = 0) start @@ -477,16 +466,171 @@ E0817 07:18:52.608111 34073 trainer.cc:3 E0817 07:19:12.168465 34073 trainer.cc:373] Train step-100, loss : 1.387759, accuracy : 0.721000 E0817 07:19:31.855865 34073 trainer.cc:373] Train step-110, loss : 1.335246, accuracy : 0.736500 E0817 07:19:57.327133 34073 trainer.cc:373] Test step-120, loss : 1.216652, accuracy : 0.769900 +</pre></div></div> +<p>After training for a number of steps (depending on the configuration) or when the job finishes, SINGA will <a class="externalLink" href="http://singa.incubator.apache.org/docs/checkpoint">checkpoint</a> the model parameters.</p></div> +<div class="section"> +<h2><a name="Details"></a>Details</h2> +<p>To train a model in SINGA, you need to prepare the datasets and a job configuration that specifies the neural net structure, training algorithm (BP or CD), SGD update algorithm (e.g. Adagrad), number of training/test steps, etc.</p> +<div class="section"> +<h3><a name="Data_preparation"></a>Data preparation</h3> +<p>Before using SINGA, you need to write a program to pre-process the dataset into a format that SINGA can read.
Please refer to the <a class="externalLink" href="http://singa.incubator.apache.org/docs/data#example---mnist-dataset">Data Preparation</a> page for details about preparing the MNIST dataset.</p></div> +<div class="section"> +<h3><a name="Neural_net"></a>Neural net</h3> + +<div style="text-align: center"> +<img src="http://singa.incubator.apache.org/assets/image/mlp-example.png" style="width: 230px" alt="" /> +<br /><b>Figure 1 - Net structure of the MLP example. </b></img> +</div> +<p>Figure 1 shows the structure of the simple MLP model, which is constructed following <a class="externalLink" href="http://arxiv.org/abs/1003.0358">Ciresan’s paper</a>. The dashed circle contains two layers which represent one feature transformation stage. There are 6 such stages in total. The sizes of the <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#innerproductlayer">InnerProductLayer</a>s in these circles decrease from 2500->2000->1500->1000->500->10.</p> +<p>Next we follow the guides in the <a class="externalLink" href="http://singa.incubator.apache.org/docs/neural-net">neural net page</a> and <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer">layer page</a> to write the neural net configuration.</p> + +<ul> + +<li> +<p>We configure a <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#data-layers">data layer</a> to read the training/testing <tt>Records</tt> from <tt>DataShard</tt>.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer { + name: "data" + type: kShardData + sharddata_conf { + path: "examples/mnist/mnist_train_shard" + batchsize: 1000 + } + exclude: kTest + } + +layer { + name: "data" + type: kShardData + sharddata_conf { + path: "examples/mnist/mnist_test_shard" + batchsize: 1000 + } + exclude: kTrain + } +</pre></div></div></li> + +<li> +<p>We configure two <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#parser-layers">parser layers</a> to
extract the image feature and label from <tt>Record</tt>s loaded by the <i>data</i> layer. The <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#mnistlayer">MnistLayer</a> will normalize the pixel values into [-1,1].</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer{ + name:"mnist" + type: kMnist + srclayers: "data" + mnist_conf { + norm_a: 127.5 + norm_b: 1 + } + } + +layer{ + name: "label" + type: kLabel + srclayers: "data" + } +</pre></div></div></li> + +<li> +<p>All <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#innerproductlayer">InnerProductLayer</a>s are configured similarly as,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer{ + name: "fc1" + type: kInnerProduct + srclayers:"mnist" + innerproduct_conf{ + num_output: 2500 + } + param{ + name: "w1" + init { + type: kUniform + low:-0.05 + high:0.05 + } + } + param{ + name: "b1" + init { + type : kUniform + low: -0.05 + high:0.05 + } + } +} +</pre></div></div> +<p>with the <tt>num_output</tt> decreasing from 2500 to 10.</p></li> + +<li> +<p>All <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#tanhlayer">TanhLayer</a>s are configured similarly as,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer{ + name: "tanh1" + type: kTanh + tanh_conf { + outer_scale: 1.7159047 + inner_scale: 0.6666667 + } + srclayers:"fc1" +} </pre></div></div></li> </ul> -<p>After the training of some steps (depends on the setting) or the job is finished, SINGA will checkpoint the current parameter. In the next time, you can train (or use for your application) by loading the checkpoint.
Please refer to <a class="externalLink" href="http://singa.incubator.apache.org/docs/checkpoint.html">Checkpoint</a> for the use of checkpoint.</p></div> -<div class="section"> -<h3><a name="Build_your_own_model"></a>Build your own model</h3> +<p>Every neuron from the source layer is transformed as <tt>outer_scale*tanh(inner_scale*x)</tt>.</p> <ul> -<li>If you want to specify you own model, then you need to decribe it in the job.conf file. It should contain the neurualnet structure, training algorithm(backforward or contrastive divergence etc.), SGD update algorithm(e.g. Adagrad), number of training/test steps and training/test frequency, and display features and etc. SINGA will read job.conf as a Google protobuf class <a href="../src/proto/job.proto">JobProto</a>. You can also refer to the <a class="externalLink" href="http://singa.incubator.apache.org/docs/programmer-guide.html">Programmer Guide</a> to get details.</li> -</ul></div></div> +<li> +<p>The final <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#softmaxloss">Softmax loss layer</a> connects to the label layer and the last InnerProductLayer (<tt>fc6</tt>).</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">layer{ + name: "loss" + type:kSoftmaxLoss + softmaxloss_conf{ + topk:1 + } + srclayers:"fc6" + srclayers:"label" +} +</pre></div></div></li> +</ul></div> +<div class="section"> +<h3><a name="Updater"></a>Updater</h3> +<p>The <a class="externalLink" href="http://singa.incubator.apache.org/docs/updater#updater">normal SGD updater</a> is selected.
The learning rate shrinks by a factor of 0.997 every 60 steps (i.e., one epoch).</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">updater{ + type: kSGD + learning_rate{ + base_lr: 0.001 + type : kStep + step_conf{ + change_freq: 60 + gamma: 0.997 + } + } +} +</pre></div></div></div> +<div class="section"> +<h3><a name="TrainOneBatch_algorithm"></a>TrainOneBatch algorithm</h3> +<p>The MLP model is a feed-forward model, hence the <a class="externalLink" href="http://singa.incubator.apache.org/docs/train-one-batch#back-propagation">Back-propagation algorithm</a> is selected.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"> alg: kBP +</pre></div></div></div> +<div class="section"> +<h3><a name="Cluster_setting"></a>Cluster setting</h3> +<p>The following configuration sets a single worker and server for training. The <a class="externalLink" href="http://singa.incubator.apache.org/docs/frameworks">Training frameworks</a> page introduces configurations of a couple of distributed training frameworks.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">cluster { + nworker_groups: 1 + nserver_groups: 1 +} +</pre></div></div></div></div> </div> </div> </div> Modified: websites/staging/singa/trunk/content/docs/model-config.html ============================================================================== --- websites/staging/singa/trunk/content/docs/model-config.html (original) +++ websites/staging/singa/trunk/content/docs/model-config.html Wed Sep 2 10:31:57 2015 @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2015-08-17 + | Generated by Apache Maven Doxia at 2015-09-02 | Rendered using Apache Maven Fluido Skin 1.4 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20150817" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> <meta http-equiv="Content-Language"
content="en" /> <title>Apache SINGA – Model Configuration</title> <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> @@ -423,10 +423,10 @@ <div id="bodyColumn" class="span10" > - <div class="section"> -<h2><a name="Model_Configuration"></a>Model Configuration</h2> + <h1>Model Configuration</h1> <p>SINGA uses the stochastic gradient descent (SGD) algorithm to train parameters of deep learning models. For each SGD iteration, there is a <a href="docs/architecture.html">Worker</a> computing gradients of parameters from the NeuralNet and an <a href="">Updater</a> updating parameter values based on gradients. Hence the model configuration mainly consists of these three parts. We will introduce the NeuralNet, Worker and Updater in the following paragraphs and describe the configurations for them. All model configuration is specified in the model.conf file in the user-provided workspace folder. E.g., the <a class="externalLink" href="https://github.com/apache/incubator-singa/tree/master/examples/cifar10">cifar10 example folder</a> has a model.conf file.</p> <div class="section"> +<div class="section"> <h3><a name="NeuralNet"></a>NeuralNet</h3> <div class="section"> <h4><a name="Uniform_model_neuralnet_representation"></a>Uniform model (neuralnet) representation</h4>
