Modified: websites/staging/singa/trunk/content/docs/rnn.html ============================================================================== --- websites/staging/singa/trunk/content/docs/rnn.html (original) +++ websites/staging/singa/trunk/content/docs/rnn.html Wed Sep 2 10:31:57 2015 @@ -1,15 +1,15 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2015-08-17 + | Generated by Apache Maven Doxia at 2015-09-02 | Rendered using Apache Maven Fluido Skin 1.4 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20150817" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> <meta http-equiv="Content-Language" content="en" /> - <title>Apache SINGA – Recurrent neural networks (RNN)</title> + <title>Apache SINGA – RNN Example</title> <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> <link rel="stylesheet" href="../css/site.css" /> <link rel="stylesheet" href="../css/print.css" media="print" /> @@ -189,7 +189,7 @@ Apache SINGA</a> <span class="divider">/</span> </li> - <li class="active ">Recurrent neural networks (RNN)</li> + <li class="active ">RNN Example</li> @@ -425,21 +425,50 @@ <div id="bodyColumn" class="span10" > - <div class="section"> -<h2><a name="Recurrent_neural_networks_RNN"></a>Recurrent neural networks (RNN)</h2> -<p>Example files for RNN can be found in “SINGA_ROOT/examples/rnnlm”, which we assume to be WORKSPACE.</p> -<div class="section"> -<h3><a name="Create_DataShard"></a>Create DataShard</h3> -<p>(a) Define your own record. Please refer to <a class="externalLink" href="http://singa.incubator.apache.org/docs/data.html">Data Preparation</a> for details.</p> -<p>Records for RNN example are defined in “user.proto” as an extension.</p> + <h1>RNN Example</h1> +<p>Recurrent Neural Networks (RNN) are widely used for modeling sequential data, such as music, videos and sentences. In this example, we use SINGA to train an <a class="externalLink" href="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf">RNN model</a> proposed by Tomas Mikolov for <a class="externalLink" href="https://en.wikipedia.org/wiki/Language_model">language modeling</a>. The training objective (loss) is to minimize the <a class="externalLink" href="https://en.wikipedia.org/wiki/Perplexity">perplexity per word</a>, which is equivalent to maximizing the probability of predicting the next word given the current word in a sentence.</p> +<p>Different from the <a class="externalLink" href="http://singa.incubator.apache.org/docs/cnn">CNN</a>, <a class="externalLink" href="http://singa.incubator.apache.org/docs/mlp">MLP</a> and <a class="externalLink" href="http://singa.incubator.apache.org/docs/rbm">RBM</a> examples, which use built-in <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer">Layer</a>s and <a class="externalLink" href="http://singa.incubator.apache.org/docs/data">Record</a>s, none of the layers in this model is built-in. Hence this page shows how users can implement their own Layers and data Records.</p> +<div class="section"> +<h2><a name="Running_instructions"></a>Running instructions</h2> +<p>In <i>SINGA_ROOT/examples/rnnlm/</i>, scripts are provided to run the training job.
First, the data is prepared by</p> <div class="source"> -<div class="source"><pre class="prettyprint">package singa; +<div class="source"><pre class="prettyprint">$ cp Makefile.example Makefile +$ make download +$ make create +</pre></div></div> +<p>Second, the training is started by passing the job configuration as,</p> -import "common.proto"; // Record message for SINGA is defined -import "job.proto"; // Layer message for SINGA is defined +<div class="source"> +<div class="source"><pre class="prettyprint"># in SINGA_ROOT
+$ ./bin/singa-run.sh -conf SINGA_ROOT/examples/rnnlm/job.conf +</pre></div></div></div> +<div class="section"> +<h2><a name="Implementations"></a>Implementations</h2> +<p><img src="http://singa.incubator.apache.org/assets/image/rnn-refine.png" align="center" width="300px" alt="" /> <span><b>Figure 1 - Net structure of the RNN model.</b></span></p> +<p>The neural net structure is shown in Figure 1. Word records are loaded by <tt>RnnlmDataLayer</tt> from <tt>WordShard</tt>. <tt>RnnlmWordparserLayer</tt> parses word records to get word indexes (in the vocabulary). For every iteration, <tt>window_size</tt> words are processed. <tt>RnnlmWordinputLayer</tt> looks up a word embedding matrix to extract feature vectors for words in the window. These features are transformed by <tt>RnnlmInnerproductLayer</tt> and <tt>RnnlmSigmoidLayer</tt>. <tt>RnnlmSigmoidLayer</tt> is a recurrent layer that forwards features from previous words to next words. Finally, <tt>RnnlmComputationLayer</tt> computes the perplexity loss with word class information from <tt>RnnlmClassparserLayer</tt>. The word class is a cluster ID: words are clustered based on their frequency in the dataset, so that words of similar frequency fall into the same class. Clustering improves the efficiency of the final prediction process.</p> +<div class="section"> +<h3><a name="Data_preparation"></a>Data preparation</h3> +<p>We use a small dataset in this example. In this dataset, [dataset description, e.g., format].
The subsequent steps follow the instructions in <a class="externalLink" href="http://singa.incubator.apache.org/docs/data">Data Preparation</a> to convert the raw data into <tt>Record</tt>s and insert them into <tt>DataShard</tt>s.</p> +<div class="section"> +<h4><a name="Download_source_data"></a>Download source data</h4> + +<div class="source"> +<div class="source"><pre class="prettyprint"># in SINGA_ROOT/examples/rnnlm/ +wget http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz +xxx +</pre></div></div></div> +<div class="section"> +<h4><a name="Define_your_own_record."></a>Define your own record.</h4> +<p>Since this dataset has a different format from the built-in <tt>SingleLabelImageRecord</tt>, we need to extend the base <tt>Record</tt> with new fields,</p> -extend Record { +<div class="source"> +<div class="source"><pre class="prettyprint"># in SINGA_ROOT/examples/rnnlm/user.proto +package singa; + +import "common.proto"; // import SINGA Record + +extend Record { // extend base Record to include users' records optional WordClassRecord wordclass = 101; optional SingleWordRecord singleword = 102; } @@ -455,23 +484,69 @@ message SingleWordRecord { optional int32 word_index = 2; // the index of this word in the vocabulary optional int32 class_index = 3; // the index of the class corresponding to this word } +</pre></div></div></div> +<div class="section"> +<h4><a name="Create_data_shard_for_training_and_testing"></a>Create data shard for training and testing</h4> +<p>As the vocabulary size is very large, the original perplexity calculation method is time consuming, because it has to calculate the probabilities of all possible words for</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">p(wt|w0, w1, ... wt-1). </pre></div></div> -<p>(b) Download raw data</p> -<p>This example downloads rnnlm-0.4b from <a href="www.rnnlm.org">www.rnnlm.org</a> by a command </p> +<p>Mikolov proposed to divide all words into different classes according to the word frequency, and to compute the perplexity according to</p> <div class="source"> -<div class="source"><pre class="prettyprint">make download +<div class="source"><pre class="prettyprint">p(wt|w0, w1, ... wt-1) = p(c|w0,w1,..wt-1) p(w|c) </pre></div></div> -<p>The raw data is stored in a folder “rnnlm-0.4b/train” and “rnnlm-0.4b/test”.</p> -<p>(c) Create data shard for training and testing</p> -<p>Data shards (e.g., “shard.dat”) will be created in “rnnlm_class_shard”, “rnnlm_vocab_shard”, “rnnlm_word_shard_train” and “rnnlm_word_shard_test” by a command</p> +<p>where <tt>c</tt> is the word class and <tt>w0, w1...wt-1</tt> are the previous words before <tt>wt</tt>. The probabilities on the right side can be computed much faster than the one on the left, since the number of classes and the number of words within one class are both much smaller than the vocabulary size.</p> +<p>In the <a class="externalLink" href="https://github.com/kaiping/incubator-singa/blob/rnnlm/examples/rnnlm/Makefile">Makefile</a> for creating the shards (see <a class="externalLink" href="https://github.com/kaiping/incubator-singa/blob/rnnlm/examples/rnnlm/create_shard.cc">create_shard.cc</a>), we need to specify where to download the source data, the number of classes to divide all occurring words into, and the names and directories of the shards to create.</p>
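+<p>To see why the factorization is faster, consider an illustrative (assumed) vocabulary of 10,000 words divided into 100 classes of roughly 100 words each. Predicting a word directly requires normalizing over the whole vocabulary, while the class-based factorization normalizes only over the classes and then over the words within one class,</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">direct:      p(wt|w0..wt-1)           ~ 10000 outputs to normalize
+factorized:  p(c|w0..wt-1) * p(wt|c)  ~ 100 + 100 = 200 outputs to normalize
+</pre></div></div>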
+<p><i>SINGA_ROOT/examples/rnnlm/create_shard.cc</i> defines the following function for creating data shards,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void create_shard(const char *input, int nclass) { +</pre></div></div> +<p><tt>input</tt> is the path to [the text file], <tt>nclass</tt> is the user-specified number of classes. This function starts with</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"> using StrIntMap = std::map<std::string, int>; + StrIntMap wordIdxMap;       // maps a word string to its word index + StrIntMap wordClassIdxMap;  // maps a word string to its class index + if (-1 == nclass) { + loadClusterForNonTrainMode(input, nclass, &wordIdxMap, &wordClassIdxMap); // non-training phase + } else { + doClusterForTrainMode(input, nclass, &wordIdxMap, &wordClassIdxMap); // training phase + } +</pre></div></div> + +<ul> + +<li>If <tt>nclass != -1</tt>, <tt>input</tt> points to the training data file. <tt>doClusterForTrainMode</tt> reads all the words in the file to create the two maps. [The two maps are stored in xxx]</li> + +<li>Otherwise, <tt>input</tt> points to either the test or the validation data file. <tt>loadClusterForNonTrainMode</tt> loads the two maps from [xxx].</li> +</ul> +<p>Words from the training/test/validation files are converted into <tt>Record</tt>s by</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"> singa::SingleWordRecord *wordRecord = record.MutableExtension(singa::singleword); + while (in >> word) { + wordRecord->set_word(word); + wordRecord->set_word_index(wordIdxMap[word]); + wordRecord->set_class_index(wordClassIdxMap[word]); + snprintf(key, kMaxKeyLength, "%08d", wordIdxMap[word]); + wordShard.Insert(std::string(key), record); + } +} +</pre></div></div> +<p>Compilation and running commands are provided in the <i>Makefile.example</i>. After executing</p> <div class="source"> <div class="source"><pre class="prettyprint">make create -</pre></div></div></div> +</pre></div></div> +<p>three data shards will be created by <tt>create_shard.cc</tt>, namely, <i>rnnlm_word_shard_train</i>, <i>rnnlm_word_shard_test</i> and <i>rnnlm_word_shard_valid</i>.</p></div></div> <div class="section"> -<h3><a name="Define_Layers"></a>Define Layers</h3> -<p>Similar to records, layers are also defined in “user.proto” as an extension.</p> +<h3><a name="Layer_implementation"></a>Layer implementation</h3> +<p>7 layers (i.e., Layer subclasses) are implemented for this application: 1 <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#data-layers">data layer</a> which fetches data records from data shards, 2 <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#parser-layers">parser layers</a> which parse the input records, 3 neuron layers which transform the word features, and 1 loss layer which computes the objective loss.</p> +<p>The subsections below discuss the configuration and functionality of each layer, and finally introduce how to configure a job and run the training for your own model.</p>
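+<p>To make the net structure concrete, the following sketch shows how such a layer stack could be declared in <i>job.conf</i>, following Figure 1. The layer names here are assumptions for illustration only, not the exact configuration shipped with the example,</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint"># hypothetical sketch of the net in job.conf (layer names assumed)
+neuralnet {
+  layer { name: "data" }                                          # RnnlmDataLayer
+  layer { name: "word_parser"   srclayers: "data" }               # RnnlmWordparserLayer
+  layer { name: "class_parser"  srclayers: "data" }               # RnnlmClassparserLayer
+  layer { name: "word_input"    srclayers: "word_parser" }        # RnnlmWordinputLayer
+  layer { name: "inner_product" srclayers: "word_input" }         # RnnlmInnerproductLayer
+  layer { name: "sigmoid"       srclayers: "inner_product" }      # RnnlmSigmoidLayer
+  layer { name: "loss"          srclayers: "sigmoid"
+          srclayers: "class_parser" }                             # RnnlmComputationLayer
+}
+</pre></div></div>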
+<p>Following the guide for implementing <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer#implementing-a-new-layer-subclass">new Layer subclasses</a>, we extend the <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1LayerProto.html">LayerProto</a> to include the configuration message of each user-defined layer as shown below (5 out of the 7 layers have specific configurations),</p> <div class="source"> <div class="source"><pre class="prettyprint">package singa; @@ -487,56 +562,285 @@ extend LayerProto { optional RnnlmWordinputProto rnnlmwordinput_conf = 204; optional RnnlmDataProto rnnlmdata_conf = 207; } +</pre></div></div> +<p>In the subsequent sections, we describe the implementation of each layer, including its configuration message.</p></div> +<div class="section"> +<h3><a name="RnnlmDataLayer"></a>RnnlmDataLayer</h3> +<p>It inherits <a href="/api/classsinga_1_1DataLayer.html">DataLayer</a> for loading word and class <tt>Record</tt>s from <tt>DataShard</tt>s into memory.</p> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> -// 1-Message that stores parameters used by RnnlmComputationLayer -message RnnlmComputationProto { - optional bool bias_term = 1 [default = true]; // use bias vector or not +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmDataLayer::Setup() { + read records from ClassShard to construct mapping from word string to class index + Resize length of records_ as window_size + 1 + Read 1st word record to the last position } -// 2-Message that stores parameters used by RnnlmSigmoidLayer -message RnnlmSigmoidProto { - optional bool bias_term = 1 [default = true]; // use bias vector or not + +void RnnlmDataLayer::ComputeFeature() { + records_[0] = records_[windowsize_]; //Copy the last record to 1st position in the record vector + Assign values to records_; //Read window_size new word records from WordShard } +</pre></div></div> +<p>The <tt>Setup</tt> function loads the mapping (from word string to class index) from <i>ClassShard</i>.</p> +<p>Every time the <tt>ComputeFeature</tt> function is called, it loads <tt>windowsize_</tt> records from <tt>WordShard</tt>.</p> +<p>For consistency of operations at each training iteration, the layer maintains a record vector of length window_size + 1: in <tt>Setup</tt> the first word record is read into the last position of the vector, and each <tt>ComputeFeature</tt> call copies the last record to the first position before reading window_size new records.</p></div> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> -// 3-Message that stores parameters used by RnnlmInnerproductLayer -message RnnlmInnerproductProto { - required int32 num_output = 1; // number of outputs for the layer - optional bool bias_term = 30 [default = true]; // use bias vector or not +<div class="source"> +<div class="source"><pre class="prettyprint">message RnnlmDataProto { + required string class_path = 1; // path to the class data file/folder, absolute or relative to the workspace + required string word_path = 2; // path to the word data file/folder, absolute or relative to the workspace + required int32 window_size = 3; // window size. } +</pre></div></div> +<p>[class_path to file or folder?]</p> +<p>[There are two paths, <tt>class_path</tt> for …; <tt>word_path</tt> for… The <tt>window_size</tt> is set to …]</p>
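+<p>For illustration, this layer could then be configured in <i>job.conf</i> as below. The shard paths follow the shard names used in this example; the layer name and the window size value are assumptions,</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint"># hypothetical excerpt from job.conf (name and window_size assumed)
+layer {
+  name: "data"
+  [singa.rnnlmdata_conf] {
+    class_path: "examples/rnnlm/rnnlm_class_shard"
+    word_path: "examples/rnnlm/rnnlm_word_shard_train"
+    window_size: 5
+  }
+}
+</pre></div></div>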
</div></div> +<div class="section"> +<h3><a name="RnnlmWordParserLayer"></a>RnnlmWordParserLayer</h3> +<p>This layer gets <tt>window_size</tt> word strings from the <tt>RnnlmDataLayer</tt> and looks them up in the word-to-index map to get the word indexes.</p> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> -// 4-Message that stores parameters used by RnnlmWordparserLayer - nothing needs to be configured +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmWordparserLayer::Setup(){ + Obtain window size from src layer; + Obtain vocabulary size from src layer; + Reshape data_ as {window_size}; +} + +void RnnlmWordparserLayer::ParseRecords(Blob* blob){ + for each word record in the window, get its word index and insert the index into blob +} +</pre></div></div></div> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>This layer does not have specific configuration fields.</p></div></div> +<div class="section"> +<h3><a name="RnnlmClassParserLayer"></a>RnnlmClassParserLayer</h3> +<p>It maps each word in the processing window into a class index.</p> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmClassparserLayer::Setup(){ + Obtain window size from src layer; + Obtain vocabulary size from src layer; + Obtain class size from src layer; + Reshape data_ as {windowsize_, 4}; +} + +void RnnlmClassparserLayer::ParseRecords(){ + for(int i = 1; i < records.size(); i++){ + Copy starting word index in this class to data[i]'s 1st position; + Copy ending word index in this class to data[i]'s 2nd position; + Copy index of input word to data[i]'s 3rd position; + Copy class index of input word to data[i]'s 4th position; + } +} +</pre></div></div> +<p>The setup function reads the window size, vocabulary size and class size from its source layers. This layer fetches the class information (the mapping between classes and words) from <tt>RnnlmDataLayer</tt> and maintains it as data in this layer. It then parses the last “window_size” word records from <tt>RnnlmDataLayer</tt>, retrieves the class of each input word, and stores the starting word index of that class, the ending word index of that class, the word index and the class index, respectively; the resulting layout is sketched after this section.</p></div> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>This layer does not have specific configuration fields.</p></div></div>
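+<p>The 4-column layout produced by <tt>RnnlmClassparserLayer::ParseRecords</tt> can be pictured with the following plain C++ sketch (illustrative only, not the actual SINGA data structure),</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">// one row of the parsed blob per word in the window (names assumed)
+struct ParsedWord {
+  int class_start;  // index of the first word in this word's class
+  int class_end;    // index of the last word in this word's class
+  int word_index;   // index of the word in the vocabulary
+  int class_index;  // index of the word's class
+};
+</pre></div></div>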
<div class="section"> +<h3><a name="RnnlmWordInputLayer"></a>RnnlmWordInputLayer</h3> +<p>Using the input word indexes, this layer looks up the corresponding word vectors as its data. Then, it passes the data to the RnnlmInnerProductLayer above for further processing.</p> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>In this layer, the length of each word vector needs to be configured. Besides, whether to use a bias term during the training process should also be configured (see more in <a class="externalLink" href="https://github.com/kaiping/incubator-singa/blob/rnnlm/src/proto/job.proto">job.proto</a>).</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">message RnnlmWordinputProto { required int32 word_length = 1; // vector length for each input word optional bool bias_term = 30 [default = true]; // use bias vector or not } +</pre></div></div></div> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> +<p>In the setup phase, this layer reshapes its members, i.e., the “data”, “grad” and “weight” matrices, and obtains the vocabulary size from its source layer (i.e., RnnlmWordParserLayer).</p> +<p>In the forward phase, the “window_size” input word indexes are used to select “window_size” word vectors from this layer’s weight matrix, each word index corresponding to one row.</p> -// 5-Message that stores parameters used by RnnlmWordparserLayer - nothing needs to be configured -//message RnnlmWordparserProto { -//} - -// 6-Message that stores parameters used by RnnlmClassparserLayer - nothing needs to be configured -//message RnnlmClassparserProto { -//} - -// 7-Message that stores parameters used by RnnlmDataLayer -message RnnlmDataProto { - required string class_path = 1; // path to the data file/folder, absolute or relative to the workspace - required string word_path = 2; - required int32 window_size = 3; // window size. +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmWordinputLayer::ComputeFeature() { + for(int t = 0; t < windowsize_; t++){ + data[t] = weight[src[t]]; + } +} +</pre></div></div> +<p>In the backward phase, after this layer’s gradient has been computed by its destination layer (i.e., RnnlmInnerProductLayer), the gradient of the weight matrix is copied row by row (according to the word indexes) from this layer’s gradient.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmWordinputLayer::ComputeGradient() { + for(int t = 0; t < windowsize_; t++){ + gweight[src[t]] = grad[t]; + } +} +</pre></div></div></div></div> +<div class="section"> +<h3><a name="RnnlmInnerProductLayer"></a>RnnlmInnerProductLayer</h3> +<p>This is a neuron layer which receives the data from RnnlmWordInputLayer and sends the computation results to RnnlmSigmoidLayer.</p> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>In this layer, the number of neurons needs to be specified. Besides, whether to use a bias term should also be configured.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">message RnnlmInnerproductProto { + required int32 num_output = 1; // number of outputs for the layer + optional bool bias_term = 30 [default = true]; // use bias vector or not +} +</pre></div></div></div> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> +<p>In the forward phase, this layer multiplies the data from its source layer (i.e., RnnlmWordInputLayer) with its own weight matrix.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmInnerproductLayer::ComputeFeature() { + data = dot(src, weight); // matrix multiplication +} +</pre></div></div>
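+<p>As a plain C++ sketch (shapes assumed; this is not the SINGA implementation), the forward step above amounts to,</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">#include <vector>
+
+// data[t] = src[t] * weight for each position t in the window.
+// src: window_size x in_dim; weight: in_dim x out_dim;
+// data must be pre-sized to window_size x out_dim.
+void InnerProductForward(const std::vector<std::vector<float>> &src,
+                         const std::vector<std::vector<float>> &weight,
+                         std::vector<std::vector<float>> *data) {
+  const size_t in_dim = weight.size(), out_dim = weight[0].size();
+  for (size_t t = 0; t < src.size(); ++t)
+    for (size_t j = 0; j < out_dim; ++j) {
+      float sum = 0.f;
+      for (size_t i = 0; i < in_dim; ++i) sum += src[t][i] * weight[i][j];
+      (*data)[t][j] = sum;
+    }
+}
+</pre></div></div>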
+<p>In the backward phase, this layer computes the gradient of its source layer (i.e., RnnlmWordInputLayer), and computes the gradient of its own weight matrix by aggregating the results from each time step, as follows,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmInnerproductLayer::ComputeGradient() { + for (int t = 0; t < windowsize_; t++) { + Add the dot product of src[t] and grad[t] to gweight; + } + Copy the dot product of grad and weight to gsrc; +} +</pre></div></div></div></div> +<div class="section"> +<h3><a name="RnnlmSigmoidLayer"></a>RnnlmSigmoidLayer</h3> +<p>This is a neuron layer in which the data at each time step is computed using the data of the previous time step as part of the input. This is how time-order information is exploited in this language model.</p> +<p>If you want to implement a recurrent neural network following our design, this layer is a useful reference; other designs for using information from past time steps are of course also possible.</p> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>In this layer, whether to use a bias term needs to be specified.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">message RnnlmSigmoidProto { + optional bool bias_term = 1 [default = true]; // use bias vector or not +} +</pre></div></div></div> +<div class="section"> +<h4><a name="Functionality"></a>Functionality</h4> +<p>In the forward phase, this layer receives data from its source layer (i.e., RnnlmInnerProductLayer) as one part of the input. Then, for each time step, it multiplies the data of the previous time step with its own weight matrix to obtain the other part. It sums the two parts and applies the sigmoid activation,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmSigmoidLayer::ComputeFeature() { + for(int t = 0; t < window_size; t++){ + if(t == 0) data[t] = Sigmoid(src[t]); + else data[t] = Sigmoid(src[t] + dot(data[t - 1], weight)); + } +} +</pre></div></div> +<p>In the backward phase, this layer first updates its member grad[t], adding the term propagated back from the next time step. Then it computes the gradients for its own weight matrix and for its source layer RnnlmInnerProductLayer by iterating over the time steps,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmSigmoidLayer::ComputeGradient(){ + for (int t = 0; t < windowsize_; t++) { + Update grad[t];  // add the term propagated back from the next time step + Update gweight;  // accumulate the gradient for the weight matrix + Compute gsrc[t]; // compute the gradient for the src layer + } +} +</pre></div></div>
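+<p>Putting the forward computation together, the recurrence can be written as the following self-contained C++ sketch (shapes assumed; illustrative only, not the SINGA API),</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">#include <cmath>
+#include <vector>
+
+// data[t] = sigmoid(src[t] + data[t-1] * weight), applied element-wise.
+// src: window_size x dim; weight: dim x dim;
+// data must be pre-sized to window_size x dim.
+void SigmoidForward(const std::vector<std::vector<float>> &src,
+                    const std::vector<std::vector<float>> &weight,
+                    std::vector<std::vector<float>> *data) {
+  const size_t dim = weight.size();
+  for (size_t t = 0; t < src.size(); ++t)
+    for (size_t j = 0; j < dim; ++j) {
+      float sum = src[t][j];
+      if (t > 0)  // recurrent term from the previous time step
+        for (size_t i = 0; i < dim; ++i) sum += (*data)[t - 1][i] * weight[i][j];
+      (*data)[t][j] = 1.f / (1.f + std::exp(-sum));
+    }
+}
+</pre></div></div>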
</div></div> +<div class="section"> +<h3><a name="RnnlmComputationLayer"></a>RnnlmComputationLayer</h3> +<p>This layer is a loss layer in which the performance metrics, i.e., the probability of predicting the next word correctly and the perplexity (PPL for short), are computed. Specifically, this layer is composed of a class information part and a word information part; the computation can thus be divided into two parts by slicing this layer’s weight matrix.</p> +<div class="section"> +<h4><a name="Configuration"></a>Configuration</h4> +<p>In this layer, only whether to use a bias term during training needs to be specified.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">message RnnlmComputationProto { + optional bool bias_term = 1 [default = true]; // use bias vector or not } </pre></div></div></div> <div class="section"> -<h3><a name="Configure_Job"></a>Configure Job</h3> +<h4><a name="Functionality"></a>Functionality</h4> +<p>In the forward phase, using the two sliced weight matrices (one for the class information, the other for the words in a class), this RnnlmComputationLayer calculates the dot products between the source layer’s data and the sliced matrices. The results can be denoted as “y1” and “y2”. Then, after a softmax function, for each input word the probability distribution over the classes and over the words in its class are computed, denoted as p1 and p2. Next, using these probability distributions, the PPL value is computed.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmComputationLayer::ComputeFeature() { + Compute y1 and y2; + p1 = Softmax(y1); + p2 = Softmax(y2); + Compute perplexity value PPL; +} +</pre></div></div> +<p>In the backward phase, this layer executes three computations. First, it computes its member gradient grad[t] for each time step. Second, it computes the gradient of its own weight matrix by aggregating the results from all time steps. Third, it computes the gradient of its source layer, RnnlmSigmoidLayer, time step by time step.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">void RnnlmComputationLayer::ComputeGradient(){ + Compute grad[t] for all time steps; + Compute gweight by aggregating results computed in different time steps; + Compute gsrc[t] for all time steps; +} +</pre></div></div></div></div></div> +<div class="section"> +<h2><a name="Updater_Configuration"></a>Updater Configuration</h2> +<p>We employ the kFixedStep method for changing the learning rate, configured as follows: a different learning rate value is used in each step range. See <a class="externalLink" href="http://singa.incubator.apache.org/docs/updater.html">Updater</a> for more information about choosing updaters.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">updater{ + #weight_decay:0.0000001 + lr_change: kFixedStep + type: kSGD + fixedstep_conf:{ + step:0 + step:42810 + step:49945 + step:57080 + step:64215 + step_lr:0.1 + step_lr:0.05 + step_lr:0.025 + step_lr:0.0125 + step_lr:0.00625 + } +} +</pre></div></div>
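+<p>The semantics of <tt>kFixedStep</tt> can be sketched in C++ as a simple lookup: the i-th learning rate applies from <tt>step[i]</tt> onwards, so steps [0, 42810) use 0.1, steps [42810, 49945) use 0.05, and so on (sketch only, not the SINGA implementation),</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">#include <vector>
+
+// Returns the learning rate for a training step under a kFixedStep schedule.
+float FixedStepLR(const std::vector<int> &steps,       // {0, 42810, 49945, 57080, 64215}
+                  const std::vector<float> &step_lrs,  // {0.1, 0.05, 0.025, 0.0125, 0.00625}
+                  int step) {
+  float lr = step_lrs[0];
+  for (size_t i = 0; i < steps.size(); ++i)
+    if (step >= steps[i]) lr = step_lrs[i];
+  return lr;
+}
+</pre></div></div>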
</div> +<div class="section"> +<h2><a name="TrainOneBatch_Function"></a>TrainOneBatch() Function</h2> +<p>We use the BP (back-propagation) algorithm to train the RNN model here. The corresponding configuration is shown below.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"># In job.conf file +alg: kBackPropagation +</pre></div></div> +<p>Refer to <a class="externalLink" href="http://singa.incubator.apache.org/docs/train-one-batch.html">Train-One-Batch</a> for more information on different TrainOneBatch() functions.</p></div> +<div class="section"> +<h2><a name="Cluster_Configuration"></a>Cluster Configuration</h2> +<p>In this RNN language model, we configure the cluster topology as follows.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">cluster { + nworker_groups: 1 + nserver_groups: 1 + nservers_per_group: 1 + nworkers_per_group: 1 + nservers_per_procs: 1 + nworkers_per_procs: 1 + workspace: "examples/rnnlm/" +} +</pre></div></div> +<p>This configuration trains the model on a single node. For other configuration choices, please refer to <a class="externalLink" href="http://singa.incubator.apache.org/docs/frameworks.html">Frameworks</a>.</p></div> +<div class="section"> +<h2><a name="Configure_Job"></a>Configure Job</h2> <p>Job configuration is written in “job.conf”.</p> <p>Note: Extended field names should be enclosed in square brackets [], e.g., [singa.rnnlmdata_conf].</p></div> <div class="section"> -<h3><a name="Run_Training"></a>Run Training</h3> +<h2><a name="Run_Training"></a>Run Training</h2> <p>Start training with the following commands,</p> <div class="source"> <div class="source"><pre class="prettyprint">cd SINGA_ROOT ./bin/singa-run.sh -workspace=examples/rnnlm -</pre></div></div></div></div> +</pre></div></div></div> </div> </div> </div>
Added: websites/staging/singa/trunk/content/docs/train-one-batch.html ============================================================================== --- websites/staging/singa/trunk/content/docs/train-one-batch.html (added) +++ websites/staging/singa/trunk/content/docs/train-one-batch.html Wed Sep 2 10:31:57 2015 @@ -0,0 +1,583 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at 2015-09-02 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Apache SINGA – Train-One-Batch</title> + <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="../css/site.css" /> + <link rel="stylesheet" href="../css/print.css" media="print" /> + <script type="text/javascript" src="../js/apache-maven-fluido-1.4.min.js"></script> + </head> + <body class="topBarEnabled"> + <div id="bodyColumn" class="span10" > + <h1>Train-One-Batch</h1> +<p>For each SGD iteration, every worker calls the <tt>TrainOneBatch</tt> function to compute gradients of parameters associated with local layers (i.e., layers dispatched to it).
SINGA has implemented two algorithms for the <tt>TrainOneBatch</tt> function. Users select the algorithm for their model in the configuration.</p> +<div class="section"> +<h2><a name="Basic_user_guide"></a>Basic user guide</h2> +<div class="section"> +<h3><a name="Back-propagation"></a>Back-propagation</h3> +<p>The <a class="externalLink" href="http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf">BP algorithm</a> is used for computing gradients of feed-forward models, e.g., <a class="externalLink" href="http://singa.incubator.apache.org/docs/cnn">CNN</a> and <a class="externalLink" href="http://singa.incubator.apache.org/docs/mlp">MLP</a>, and of <a class="externalLink" href="http://singa.incubator.apache.org/docs/rnn">RNN</a> models in SINGA.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"># in job.conf +alg: kBP +</pre></div></div> +<p>To use the BP algorithm for the <tt>TrainOneBatch</tt> function, users simply configure the <tt>alg</tt> field with <tt>kBP</tt>. If a neural net contains user-defined layers, these layers must be implemented properly to be consistent with the implementation of the BP algorithm in SINGA (see below).</p></div> +<div class="section"> +<h3><a name="Contrastive_Divergence"></a>Contrastive Divergence</h3> +<p>The <a class="externalLink" href="http://www.cs.toronto.edu/~fritz/absps/nccd.pdf">CD algorithm</a> is used for computing gradients of energy models like RBM.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"># job.conf +alg: kCD +cd_conf { + cd_k: 2 +} +</pre></div></div> +<p>To use the CD algorithm for the <tt>TrainOneBatch</tt> function, users simply configure the <tt>alg</tt> field to <tt>kCD</tt>. Users can also configure the number of Gibbs sampling steps in the CD algorithm through the <tt>cd_k</tt> field. By default, it is set to 1.</p></div></div> +<div class="section"> +<h2><a name="Advanced_user_guide"></a>Advanced user guide</h2> +<div class="section"> +<h3><a name="Implementation_of_BP"></a>Implementation of BP</h3> +<p>The BP algorithm is implemented in SINGA following the pseudo code below,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">BPTrainOnebatch(step, net) { + // forward propagate + foreach layer in net.local_layers() { + if IsBridgeDstLayer(layer) + recv data from the src layer (i.e., BridgeSrcLayer) + foreach param in layer.params() + Collect(param) // recv response from servers for last update + + layer.ComputeFeature(kForward) + + if IsBridgeSrcLayer(layer) + send layer.data_ to dst layer + } + // backward propagate + foreach layer in reverse(net.local_layers) { + if IsBridgeSrcLayer(layer) + recv gradient from the dst layer (i.e., BridgeDstLayer) + recv response from servers for last update + + layer.ComputeGradient() + foreach param in layer.params() + Update(step, param) // send param.grad_ to servers + + if IsBridgeDstLayer(layer) + send layer.grad_ to src layer + } +} +</pre></div></div> +<p>It forwards features through all local layers (which can be checked by layer partition ID and worker ID) and back-propagates gradients in the reverse order. <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer/#bridgesrclayer--bridgedstlayer">BridgeSrcLayer</a> (resp. <tt>BridgeDstLayer</tt>) will be blocked until the feature (resp. gradient) from the source (resp. destination) layer comes. Parameter gradients are sent to servers via the <tt>Update</tt> function.
Updated parameters are collected via the <tt>Collect</tt> function, which blocks until the parameter has been updated. <a class="externalLink" href="http://singa.incubator.apache.org/docs/param">Param</a> objects have versions, which can be used to check whether a <tt>Param</tt> object has been updated or not.</p> +<p>Since RNN models are unrolled into feed-forward models, users need to implement the forward propagation in the recurrent layer’s <tt>ComputeFeature</tt> function, and implement the backward propagation in the recurrent layer’s <tt>ComputeGradient</tt> function. As a result, the whole <tt>TrainOneBatch</tt> runs the <a class="externalLink" href="https://en.wikipedia.org/wiki/Backpropagation_through_time">back-propagation through time (BPTT)</a> algorithm.</p></div> +<div class="section"> +<h3><a name="Implementation_of_CD"></a>Implementation of CD</h3> +<p>The CD algorithm is implemented in SINGA following the pseudo code below,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">CDTrainOneBatch(step, net) { + # positive phase + foreach layer in net.local_layers() + if IsBridgeDstLayer(layer) + recv positive phase data from the src layer (i.e., BridgeSrcLayer) + foreach param in layer.params() + Collect(param) // recv response from servers for last update + layer.ComputeFeature(kPositive) + if IsBridgeSrcLayer(layer) + send positive phase data to dst layer + + # negative phase + foreach gibbs in [0...layer_proto_.cd_k] + foreach layer in net.local_layers() + if IsBridgeDstLayer(layer) + recv negative phase data from the src layer (i.e., BridgeSrcLayer) + layer.ComputeFeature(kNegative) + if IsBridgeSrcLayer(layer) + send negative phase data to dst layer + + foreach layer in net.local_layers() + layer.ComputeGradient() + foreach param in layer.params + Update(param) +} +</pre></div></div> +<p>Parameter gradients are computed after the positive phase and negative phase.</p></div> +<div class="section"> +<h3><a name="Implementing_a_new_algorithm"></a>Implementing a new algorithm</h3> +<p>SINGA implements BP and CD by creating two subclasses of the <a href="api/classsinga_1_1Worker.html">Worker</a> class: <a href="api/classsinga_1_1BPWorker.html">BPWorker</a>’s <tt>TrainOneBatch</tt> function implements the BP algorithm; <a href="api/classsinga_1_1CDWorker.html">CDWorker</a>’s <tt>TrainOneBatch</tt> function implements the CD algorithm. To implement a new algorithm for the <tt>TrainOneBatch</tt> function, users need to create a new subclass of the <tt>Worker</tt>, e.g.,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">class FooWorker : public Worker { + void TrainOneBatch(int step, shared_ptr<NeuralNet> net, Metric* perf) override; + void TestOneBatch(int step, Phase phase, shared_ptr<NeuralNet> net, Metric* perf) override; +}; +</pre></div></div> +<p>The <tt>FooWorker</tt> must implement the above two functions for training one mini-batch and testing one mini-batch. The <tt>perf</tt> argument is for collecting training or testing performance, e.g., the objective loss or accuracy. It is passed to the <tt>ComputeFeature</tt> function of each layer.</p> +<p>Users can define configuration fields for the new worker, e.g.,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"># in user.proto +message FooWorkerProto { + optional int32 b = 1; +} + +extend JobProto { + optional FooWorkerProto foo_conf = 101; +} + +# in job.proto +message JobProto { + ...
+ extensions 101 to max; +} +</pre></div></div> +<p>This is similar to <a class="externalLink" href="http://singa.incubator.apache.org/docs/layer/#implementing-a-new-layer-subclass">adding configuration fields for a new layer</a>.</p> +<p>To use <tt>FooWorker</tt>, users need to register it in <a class="externalLink" href="http://singa.incubator.apache.org/docs/programming-guide">main.cc</a> and configure the <tt>alg</tt> and <tt>foo_conf</tt> fields,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint"># in main.cc +const int kFoo = 3; // worker ID, must be different from those of CDWorker and BPWorker +driver.RegisterWorker<FooWorker>(kFoo); + +# in job.conf +... +alg: 3 +[foo_conf] { + b: 4 +} +</pre></div></div>
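+<p>A minimal sketch of what <tt>FooWorker::TrainOneBatch</tt> could look like for a BP-style pass is given below. It simply mirrors the pseudo code above; the accessor names (e.g., <tt>layers()</tt>, <tt>GetParams()</tt>) are assumptions for illustration, not the definitive SINGA API,</p> +
+<div class="source"> +<div class="source"><pre class="prettyprint">// hypothetical sketch; mirrors the BP pseudo code above
+void FooWorker::TrainOneBatch(int step, shared_ptr<NeuralNet> net, Metric* perf) {
+  for (auto layer : net->layers()) {            // forward pass
+    for (auto param : layer->GetParams())
+      Collect(param);                           // wait for parameters updated by servers
+    layer->ComputeFeature(kForward, perf);      // perf collects loss/accuracy
+  }
+  for (auto it = net->layers().rbegin(); it != net->layers().rend(); ++it) {  // backward pass
+    (*it)->ComputeGradient();
+    for (auto param : (*it)->GetParams())
+      Update(step, param);                      // send gradients to servers
+  }
+}
+</pre></div></div>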
+</div></div> + </div> + </div> + </div> + + <hr/> + + <footer> + <div class="container-fluid"> + <div class="row-fluid"> + +<p>Copyright © 2015 The Apache Software Foundation. All rights reserved. Apache Singa, Apache, the Apache feather logo, and the Apache Singa project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p> + </div> + </div> + </footer> + </body> +</html> Added: websites/staging/singa/trunk/content/docs/updater.html ============================================================================== --- websites/staging/singa/trunk/content/docs/updater.html (added) +++ websites/staging/singa/trunk/content/docs/updater.html Wed Sep 2 10:31:57 2015 @@ -0,0 +1,717 @@ +<!DOCTYPE html> +<!-- + | Generated by Apache Maven Doxia at 2015-09-02 + | Rendered using Apache Maven Fluido Skin 1.4 +--> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta charset="UTF-8" /> + <meta name="viewport" content="width=device-width, initial-scale=1.0" /> + <meta name="Date-Revision-yyyymmdd" content="20150902" /> + <meta http-equiv="Content-Language" content="en" /> + <title>Apache SINGA – Updater</title> + <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" /> + <link rel="stylesheet" href="../css/site.css" /> + <link rel="stylesheet" href="../css/print.css" media="print" /> + <script type="text/javascript" src="../js/apache-maven-fluido-1.4.min.js"></script> + </head> + <body class="topBarEnabled">
href="../community/team-list.html" title="SINGA Team"> + <span class="none"></span> + SINGA Team</a> + </li> + <li class="nav-header">External Links</li> + + <li> + + <a href="http://www.apache.org/" class="externalLink" title="Apache Software Foundation"> + <span class="none"></span> + Apache Software Foundation</a> + </li> + + <li> + + <a href="http://www.comp.nus.edu.sg/~dbsystem/singa/" class="externalLink" title="NUS School of Computing"> + <span class="none"></span> + NUS School of Computing</a> + </li> + </ul> + + + + <hr /> + + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <a href="http://incubator.apache.org" title="apache-incubator" class="builtBy"> + <img class="builtBy" alt="Apache Incubator" src="http://incubator.apache.org/images/egg-logo.png" /> + </a> + </div> + </div> + </div> + + + <div id="bodyColumn" class="span10" > + + <h1>Updater</h1> +<p>Every server in SINGA has an <a href="api/classsinga_1_1Updater.html">Updater</a> instance that updates parameters based on gradients. In this page, the <i>Basic user guide</i> describes the configuration of an updater. The <i>Advanced user guide</i> present details on how to implement a new updater and a new learning rate changing method.</p> +<div class="section"> +<h2><a name="Basic_user_guide"></a>Basic user guide</h2> +<p>There are many different parameter updating protocols (i.e., subclasses of <tt>Updater</tt>). They share some configuration fields like</p> + +<ul> + +<li><tt>type</tt>, an integer for identifying an updater;</li> + +<li><tt>learning_rate</tt>, configuration for the <a class="externalLink" href="http://singa.incubator.apache.org/api/classsinga_1_1LRGenerator.html">LRGenerator</a> which controls the learning rate.</li> + +<li><tt>weight_decay</tt>, the co-efficient for <a class="externalLink" href="http://deeplearning.net/tutorial/gettingstarted.html#regularization">L2 * regularization</a>.</li> + +<li><a class="externalLink" href="http://ufldl.stanford.edu/tutorial/supervised/OptimizationStochasticGradientDescent/">momentum</a>.</li> +</ul> +<p>If you are not familiar with the above terms, you can get their meanings in <a class="externalLink" href="http://cs231n.github.io/neural-networks-3/#update">this page provided by Karpathy</a>.</p> +<div class="section"> +<h3><a name="Configuration_of_built-in_updater_classes"></a>Configuration of built-in updater classes</h3> +<div class="section"> +<h4><a name="Updater"></a>Updater</h4> +<p>The base <tt>Updater</tt> implements the <a class="externalLink" href="http://cs231n.github.io/neural-networks-3/#sgd">vanilla SGD algorithm</a>. Its configuration type is <tt>kSGD</tt>. Users need to configure at least the <tt>learning_rate</tt> field. <tt>momentum</tt> and <tt>weight_decay</tt> are optional fields.</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">updater{ + type: kSGD + momentum: float + weight_decay: float + learning_rate { + + } +} +</pre></div></div></div> +<div class="section"> +<h4><a name="AdaGradUpdater"></a>AdaGradUpdater</h4> +<p>It inherits the base <tt>Updater</tt> to implement the <a class="externalLink" href="http://www.magicbroom.info/Papers/DuchiHaSi10.pdf">AdaGrad</a> algorithm. Its type is <tt>kAdaGrad</tt>. 
+</div>
+<div class="section">
+<h4><a name="AdaGradUpdater"></a>AdaGradUpdater</h4>
+<p>It inherits the base <tt>Updater</tt> to implement the <a class="externalLink" href="http://www.magicbroom.info/Papers/DuchiHaSi10.pdf">AdaGrad</a> algorithm. Its type is <tt>kAdaGrad</tt>. <tt>AdaGradUpdater</tt> is configured similarly to <tt>Updater</tt> except that <tt>momentum</tt> is not used.</p></div>
+<div class="section">
+<h4><a name="NesterovUpdater"></a>NesterovUpdater</h4>
+<p>It inherits the base <tt>Updater</tt> to implement the <a class="externalLink" href="http://arxiv.org/pdf/1212.0901v2.pdf">Nesterov</a> (section 3.5) updating protocol. Its type is <tt>kNesterov</tt>. <tt>learning_rate</tt> and <tt>momentum</tt> must be configured. <tt>weight_decay</tt> is an optional configuration field.</p></div>
+<div class="section">
+<h4><a name="RMSPropUpdater"></a>RMSPropUpdater</h4>
+<p>It inherits the base <tt>Updater</tt> to implement the <a class="externalLink" href="http://cs231n.github.io/neural-networks-3/#sgd">RMSProp algorithm</a> proposed by <a class="externalLink" href="http://www.cs.toronto.edu/%7Etijmen/csc321/slides/lecture_slides_lec6.pdf">Hinton</a> (slide 29). Its type is <tt>kRMSProp</tt>.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">updater {
+  type: kRMSProp
+  rmsprop_conf {
+    rho: float  # [0,1]
+  }
+}
+</pre></div></div></div></div>
+<div class="section">
+<h3><a name="Configuration_of_learning_rate"></a>Configuration of learning rate</h3>
+<p>The <tt>learning_rate</tt> field is configured as,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">learning_rate {
+  type: ChangeMethod
+  base_lr: float  # base/initial learning rate
+  ...             # fields for a specific changing method
+}
+</pre></div></div>
+<p>The common fields include <tt>type</tt> and <tt>base_lr</tt>. SINGA provides the following <tt>ChangeMethod</tt>s.</p>
+<div class="section">
+<h4><a name="kFixed"></a>kFixed</h4>
+<p>The <tt>base_lr</tt> is used for all steps.</p></div>
+<div class="section">
+<h4><a name="kLinear"></a>kLinear</h4>
+<p>The updater should be configured like</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">learning_rate {
+  base_lr: float
+  linear_conf {
+    freq: int
+    final_lr: float
+  }
+}
+</pre></div></div>
+<p>Linear interpolation is used to change the learning rate,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">lr = (1 - step / freq) * base_lr + (step / freq) * final_lr
+</pre></div></div></div>
+<div class="section">
+<h4><a name="kExponential"></a>kExponential</h4>
+<p>The updater should be configured like</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">learning_rate {
+  base_lr: float
+  exponential_conf {
+    freq: int
+  }
+}
+</pre></div></div>
+<p>The learning rate for <tt>step</tt> is</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">lr = base_lr / 2^(step / freq)
+</pre></div></div>
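+<p>As a worked example of this formula, assume <tt>base_lr</tt> is 0.1 and <tt>freq</tt> is 1000 (both values are made up for illustration); the learning rate is then halved every 1000 steps,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">step = 0     -&gt; lr = 0.1 / 2^0 = 0.1
+step = 1000  -&gt; lr = 0.1 / 2^1 = 0.05
+step = 2000  -&gt; lr = 0.1 / 2^2 = 0.025
+</pre></div></div>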
class="source"><pre class="prettyprint">lr = base_lr * (1 + gamma * setp)^(-pow) +</pre></div></div></div> +<div class="section"> +<h4><a name="kStep"></a>kStep</h4> +<p>The updater should be configured like</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">learning_rate { + base_lr : float + step_conf { + change_freq: int + gamma: float + } +} +</pre></div></div> +<p>The learning rate for <tt>step</tt> is</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">lr = base_lr * gamma^ (step / change_freq) +</pre></div></div></div> +<div class="section"> +<h4><a name="kFixedStep"></a>kFixedStep</h4> +<p>The updater should be configured like</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">learning_rate { + fixedstep_conf { + step: int + step_lr: float + + step: int + step_lr: float + + ... + } +} +</pre></div></div> +<p>Denote the i-th tuple as (step[i], step_lr[i]), then the learning rate for <tt>step</tt> is,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">step_lr[k] +</pre></div></div> +<p>where step[k] is the smallest number that is larger than <tt>step</tt>.</p></div></div></div> +<div class="section"> +<h2><a name="Advanced_user_guide"></a>Advanced user guide</h2> +<div class="section"> +<h3><a name="Implementing_a_new_Update_subclass"></a>Implementing a new Update subclass</h3> +<p>The base Updater class has one virtual function,</p> + +<div class="source"> +<div class="source"><pre class="prettyprint">class Updater{ + public: + virtual void Update(int step, Param* param, float grad_scale = 1.0f) = 0; + + protected: + UpdaterProto proto_; + LRGenerator lr_gen_; +}; +</pre></div></div> +<p>It updates the values of the <tt>param</tt> based on its gradients. The <tt>step</tt> argument is for deciding the learning rate which may change through time (step). <tt>grad_scale</tt> scales the original gradient values. 
+</div></div></div>
+<div class="section">
+<h2><a name="Advanced_user_guide"></a>Advanced user guide</h2>
+<div class="section">
+<h3><a name="Implementing_a_new_Update_subclass"></a>Implementing a new Updater subclass</h3>
+<p>The base Updater class has one virtual function,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">class Updater {
+ public:
+  virtual void Update(int step, Param* param, float grad_scale = 1.0f) = 0;
+
+ protected:
+  UpdaterProto proto_;
+  LRGenerator* lr_gen_;
+};
+</pre></div></div>
+<p>It updates the values of the <tt>param</tt> based on its gradients. The <tt>step</tt> argument is for deciding the learning rate, which may change through time (step). <tt>grad_scale</tt> scales the original gradient values. This function is called by a server once it has received all gradients for the same <tt>Param</tt> object.</p>
+<p>To implement a new Updater subclass, users must override the <tt>Update</tt> function.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">class FooUpdater : public Updater {
+  void Update(int step, Param* param, float grad_scale = 1.0f) override;
+};
+</pre></div></div>
+<p>Configuration of this new updater can be declared similarly to that of a new layer,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># in user.proto
+message FooUpdaterProto {
+  optional int32 c = 1;
+}
+
+extend UpdaterProto {
+  optional FooUpdaterProto fooupdater_conf = 101;
+}
+</pre></div></div>
+<p>The new updater should be registered in the <a class="externalLink" href="http://singa.incubator.apache.org/docs/programming-guide">main function</a>,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">driver.RegisterUpdater<FooUpdater>("FooUpdater");
+</pre></div></div>
+<p>Users can then configure the job as</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># in job.conf
+updater {
+  user_type: "FooUpdater"  # must match the string identifier used for registration
+  fooupdater_conf {
+    c: 20
+  }
+}
+</pre></div></div></div>
+<div class="section">
+<h3><a name="Implementing_a_new_LRGenerator_subclass"></a>Implementing a new LRGenerator subclass</h3>
+<p>The base <tt>LRGenerator</tt> has one virtual function,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">virtual float Get(int step);
+</pre></div></div>
+<p>To implement a subclass, e.g., <tt>FooLRGen</tt>, users should declare it like</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">class FooLRGen : public LRGenerator {
+ public:
+  float Get(int step) override;
+};
+</pre></div></div>
+<p>Configuration of <tt>FooLRGen</tt> can be defined using a protocol message,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># in user.proto
+message FooLRProto {
+  ...
+}
+
+extend LRGenProto {
+  optional FooLRProto foolr_conf = 101;
+}
+</pre></div></div>
+<p>The configuration is then like,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">learning_rate {
+  user_type: "FooLR"  # must match the string identifier used for registration
+  base_lr: float
+  foolr_conf {
+    ...
+  }
+}
+</pre></div></div>
+<p>Users have to register this subclass in the main function,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">driver.RegisterLRGenerator<FooLRGen>("FooLR");
+</pre></div></div>
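+<p>For illustration, a minimal <tt>Get</tt> implementation is sketched below. It assumes <tt>FooLRProto</tt> declares an <tt>int32 freq</tt> field and that the base class stores its <tt>LRGenProto</tt> configuration in a <tt>proto_</tt> member; both are assumptions of this sketch, not guarantees of the API, and the decay rule itself is made up,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">// A sketch only. Assumes FooLRProto has an 'int32 freq' field (freq > 0)
+// and that LRGenerator keeps its LRGenProto configuration in proto_.
+float FooLRGen::Get(int step) {
+  const FooLRProto& conf = proto_.GetExtension(foolr_conf);
+  // made-up rule: halve the base learning rate every conf.freq() steps
+  float lr = proto_.base_lr();
+  for (int i = 0; i < step / conf.freq(); ++i)
+    lr *= 0.5f;
+  return lr;
+}
+</pre></div></div>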
+</div></div>
+          </div>
+        </div>
+      </div>
+
+    <hr/>
+
+    <footer>
+      <div class="container-fluid">
+        <div class="row-fluid">
+<p>Copyright © 2015 The Apache Software Foundation. All rights reserved. Apache Singa, Apache, the Apache feather logo, and the Apache Singa project logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p>
+        </div>
+      </div>
+    </footer>
+  </body>
+</html>

Modified: websites/staging/singa/trunk/content/index.html
==============================================================================
--- websites/staging/singa/trunk/content/index.html (original)
+++ websites/staging/singa/trunk/content/index.html Wed Sep 2 10:31:57 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-08-17
+ | Generated by Apache Maven Doxia at 2015-09-02
 | Rendered using Apache Maven Fluido Skin 1.4
-->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150817" />
+    <meta name="Date-Revision-yyyymmdd" content="20150902" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA – Welcome to Apache SINGA</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />

Modified: websites/staging/singa/trunk/content/introduction.html
==============================================================================
--- websites/staging/singa/trunk/content/introduction.html (original)
+++ websites/staging/singa/trunk/content/introduction.html Wed Sep 2 10:31:57 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-08-17
+ | Generated by Apache Maven Doxia at 2015-09-02
 | Rendered using Apache Maven Fluido Skin 1.4
-->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150817" />
+    <meta name="Date-Revision-yyyymmdd" content="20150902" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA – Introduction</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />

Modified: websites/staging/singa/trunk/content/quick-start.html
==============================================================================
--- websites/staging/singa/trunk/content/quick-start.html (original)
+++ websites/staging/singa/trunk/content/quick-start.html Wed Sep 2 10:31:57 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-08-17
+ | Generated by Apache Maven Doxia at 2015-09-02
 | Rendered using Apache Maven Fluido Skin 1.4
-->
 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150817" />
+    <meta name="Date-Revision-yyyymmdd" content="20150902" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA – Quick Start</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />
