Modified: websites/staging/singa/trunk/content/docs/rnn.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/rnn.html (original)
+++ websites/staging/singa/trunk/content/docs/rnn.html Fri Sep 18 15:11:53 2015
@@ -1,15 +1,15 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
-    <title>Apache SINGA &#x2013; RNN Example</title>
+    <title>Apache SINGA &#x2013; Recurrent Neural Networks for Language Modelling</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" />
     <link rel="stylesheet" href="../css/site.css" />
     <link rel="stylesheet" href="../css/print.css" media="print" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="../js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           
@@ -204,7 +210,7 @@
         Apache SINGA</a>
                     <span class="divider">/</span>
       </li>
-        <li class="active ">RNN Example</li>
+        <li class="active "></li>
         
                 
                     
@@ -482,115 +488,97 @@
                         
         <div id="bodyColumn"  class="span10" >
                                   
-            <h1>RNN Example</h1>
-<p>Recurrent Neural Networks (RNN) are widely used for modeling sequential 
data, such as music, videos and sentences. In this example, we use SINGA to 
train a <a class="externalLink" 
href="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf";>RNN
 model</a> proposed by Tomas Mikolov for <a class="externalLink" 
href="https://en.wikipedia.org/wiki/Language_model";>language modeling</a>. The 
training objective (loss) is minimize the <a class="externalLink" 
href="https://en.wikipedia.org/wiki/Perplexity";>perplexity per word</a>, which 
is equivalent to maximize the probability of predicting the next word given the 
current word in a sentence.</p>
-<p>Different to the <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/cnn";>CNN</a>, <a 
class="externalLink" href="http://singa.incubator.apache.org/docs/mlp";>MLP</a> 
and <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/rbm";>RBM</a> examples which use 
built-in <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/layer";>Layer</a>s and <a 
class="externalLink" 
href="http://singa.incubator.apache.org/docs/data";>Record</a>s, none of the 
layers in this model is built-in. Hence users can get examples of implementing 
their own Layers and data Records in this page.</p>
+            <h1>Recurrent Neural Networks for Language Modelling</h1>
+<hr />
+<p>Recurrent Neural Networks (RNN) are widely used for modelling sequential 
data, such as music and sentences. In this example, we use SINGA to train an <a 
class="externalLink" 
href="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf";>RNN
 model</a> proposed by Tomas Mikolov for <a class="externalLink" 
href="https://en.wikipedia.org/wiki/Language_model";>language modelling</a>. The 
training objective (loss) is to minimize the <a class="externalLink" 
href="https://en.wikipedia.org/wiki/Perplexity";>perplexity per word</a>, which 
is equivalent to maximizing the probability of predicting the next word given 
the current word in a sentence.</p>
+<p>Different from the <a href="cnn.html">CNN</a>, <a href="mlp.html">MLP</a> 
and <a href="rbm.html">RBM</a> examples, which use built-in layers and data 
records, none of the layers in this example are built-in. Hence users can learn 
how to implement their own layers and data records through this example.</p>
 <div class="section">
 <h2><a name="Running_instructions"></a>Running instructions</h2>
-<p>In <i>SINGA_ROOT/examples/rnn/</i>, scripts are provided to run the 
training job. First, the data is prepared by</p>
+<p>In <i>SINGA_ROOT/examples/rnnlm/</i>, scripts are provided to run the 
training job. First, the data is prepared by</p>
 
 <div class="source">
 <div class="source"><pre class="prettyprint">$ cp Makefile.example Makefile
 $ make download
 $ make create
 </pre></div></div>
-<p>Second, the training is started by passing the job configuration as,</p>
+<p>Second, to compile the source code under <i>examples/rnnlm/</i>, run</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint"># in SINGA_ROOT
-$ ./bin/singa-run.sh -conf SINGA_ROOT/examples/rnn/job.conf
+<div class="source"><pre class="prettyprint">$ make rnnlm
+</pre></div></div>
+<p>An executable file <i>rnnlm.bin</i> will be generated.</p>
+<p>Third, the training is started by passing <i>rnnlm.bin</i> and the job 
configuration to <i>singa-run.sh</i>,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># at SINGA_ROOT/
+# export LD_LIBRARY_PATH=.libs:$LD_LIBRARY_PATH
+$ ./bin/singa-run.sh -exec examples/rnnlm/rnnlm.bin -conf 
examples/rnnlm/job.conf
 </pre></div></div></div>
 <div class="section">
 <h2><a name="Implementations"></a>Implementations</h2>
-<p><img src="http://singa.incubator.apache.org/images/rnn-refine.png"; 
align="center" width="300px" alt="" /> <span><b>Figure 1 - Net structure of the 
RNN model.</b></span></p>
-<p>The neural net structure is shown Figure 1. Word records are loaded by 
<tt>RnnlmDataLayer</tt> from <tt>WordShard</tt>. <tt>RnnlmWordparserLayer</tt> 
parses word records to get word indexes (in the vocabulary). For every 
iteration, <tt>window_size</tt> words are processed. 
<tt>RnnlmWordinputLayer</tt> looks up a word embedding matrix to extract 
feature vectors for words in the window. These features are transformed by 
<tt>RnnlmInnerproductLayer</tt> layer and <tt>RnnlmSigmoidLayer</tt>. 
<tt>RnnlmSigmoidLayer</tt> is a recurrent layer that forwards features from 
previous words to next words. Finally, <tt>RnnlmComputationLayer</tt> computes 
the perplexity loss with word class information from 
<tt>RnnlmClassparserLayer</tt>. The word class is a cluster ID. Words are 
clustered based on their frequency in the dataset, e.g., frequent words are 
clustered together and less frequent words are clustered together. Clustering 
is to improve the efficiency of the final prediction process.</p>
+<p><img src="../images/rnnlm.png" align="center" width="400px" alt="" /> 
<span><b>Figure 1 - Net structure of the RNN model.</b></span></p>
+<p>The neural net structure is shown in Figure 1. Word records are loaded by 
<tt>DataLayer</tt>. For every iteration, at most <tt>max_window</tt> word 
records are processed. If a sentence ending character is read, the 
<tt>DataLayer</tt> stops loading immediately. <tt>EmbeddingLayer</tt> looks up 
a word embedding matrix to extract feature vectors for words loaded by the 
<tt>DataLayer</tt>. These features are transformed by the <tt>HiddenLayer</tt>, 
which propagates the features from left to right. The output feature for the 
word at position k is influenced by words from position 0 to k-1. Finally, 
<tt>LossLayer</tt> computes the cross-entropy loss (see below) by predicting 
the next word of each word. <tt>LabelLayer</tt> reads the same number of word 
records as the embedding layer but starts from position 1. Consequently, the 
word record at position k in <tt>LabelLayer</tt> is the ground truth for the 
word at position k in <tt>LossLayer</tt>.</p>
+<p>The cross-entropy loss is computed as</p>
+<p><tt>$$L(w_t)=-\log P(w_{t+1}|w_t)$$</tt></p>
+<p>Given <tt>$w_t$</tt>, evaluating the above equation requires computing a 
probability over all words in the vocabulary, which is time consuming. The <a 
class="externalLink" 
href="https://f25ea9ccb7d3346ce6891573d543960492b92c30.googledrive.com/host/0ByxdPXuxLPS5RFM5dVNvWVhTd0U/rnnlm-0.4b.tgz";>RNNLM
 Toolkit</a> accelerates the computation by factorizing it as</p>
+<p><tt>$$P(w_{t+1}|w_t) = P(C_{w_{t+1}}|w_t) * 
P(w_{t+1}|C_{w_{t+1}})$$</tt></p>
+<p>Words from the vocabulary are partitioned into a user-defined number of 
classes. The first term on the right-hand side predicts the class of the next 
word; the second term predicts the next word given its class. Both the number 
of classes and the number of words in one class are much smaller than the 
vocabulary size, so the probabilities can be calculated much faster.</p>
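+<p>As a rough illustration, with the 3,720-word vocabulary and 100 classes 
used later in this example (and assuming, purely for the estimate, that words 
were spread evenly over classes), the factorized form evaluates two softmaxes 
over about <tt>$100 + 3720/100 \approx 137$</tt> entries in total, instead of 
one softmax over all 3,720 words, i.e., roughly a 27x reduction in the cost of 
the output layer.</p>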
+<p>The perplexity per word is computed by,</p>
+<p><tt>$$PPL = 10^{-\mathrm{avg}_t \log_{10} P(w_{t+1}|w_t)}$$</tt></p>
 <div class="section">
 <h3><a name="Data_preparation"></a>Data preparation</h3>
-<p>We use a small dataset in this example. In this dataset, [dataset 
description, e.g., format]. The subsequent steps follow the instructions in <a 
class="externalLink" href="http://singa.incubator.apache.org/docs/data";>Data 
Preparation</a> to convert the raw data into <tt>Record</tt>s and insert them 
into <tt>DataShard</tt>s.</p>
+<p>We use a small dataset provided by the <a class="externalLink" 
href="https://f25ea9ccb7d3346ce6891573d543960492b92c30.googledrive.com/host/0ByxdPXuxLPS5RFM5dVNvWVhTd0U/rnnlm-0.4b.tgz";>RNNLM
 Toolkit</a>. It has 10,000 training sentences, with 71350 words in total and 
3720 unique words. The subsequent steps follow the instructions in <a 
href="data.html">Data Preparation</a> to convert the raw data into records and 
insert them into <tt>DataShard</tt>s.</p>
 <div class="section">
 <h4><a name="Download_source_data"></a>Download source data</h4>
 
 <div class="source">
-<div class="source"><pre class="prettyprint"># in SINGA_ROOT/examples/rnn/
-wget http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz
-xxx
+<div class="source"><pre class="prettyprint"># in SINGA_ROOT/examples/rnnlm/
+cp Makefile.example Makefile
+make download
 </pre></div></div></div>
 <div class="section">
-<h4><a name="Define_your_own_record."></a>Define your own record.</h4>
-<p>Since this dataset has different format as the built-in 
<tt>SingleLabelImageRecord</tt>, we need to extend the base <tt>Record</tt> to 
add new fields,</p>
+<h4><a name="Define_your_own_record"></a>Define your own record</h4>
+<p>We define the word record as follows,</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint"># in 
SINGA_ROOT/examples/rnn/user.proto
-package singa;
-
-import &quot;common.proto&quot;;  // import SINGA Record
-
-extend Record {  // extend base Record to include users' records
-    optional WordClassRecord wordclass = 101;
-    optional SingleWordRecord singleword = 102;
-}
-
-message WordClassRecord {
-    optional int32 class_index = 1; // the index of this class
-    optional int32 start = 2; // the index of the start word in this class;
-    optional int32 end = 3; // the index of the end word in this class
+<div class="source"><pre class="prettyprint"># in 
SINGA_ROOT/examples/rnnlm/rnnlm.proto
+message WordRecord {
+  optional string word = 1;
+  optional int32 word_index = 2;
+  optional int32 class_index = 3;
+  optional int32 class_start = 4;
+  optional int32 class_end = 5;
 }
 
-message SingleWordRecord {
-    optional string word = 1;
-    optional int32 word_index = 2;   // the index of this word in the 
vocabulary
-    optional int32 class_index = 3;   // the index of the class corresponding 
to this word
+extend singa.Record {
+  optional WordRecord word = 101;
 }
-</pre></div></div></div>
-<div class="section">
-<h4><a name="Create_data_shard_for_training_and_testing"></a>Create data shard 
for training and testing</h4>
-<p>{% comment %} As the vocabulary size is very large, the original perplexity 
calculation method is time consuming. Because it has to calculate the 
probabilities of all possible words for</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">p(wt|w0, w1, ... wt-1).
 </pre></div></div>
-<p>Tomas proposed to divide all words into different classes according to the 
word frequency, and compute the perplexity according to</p>
+<p>It includes the word string and its index in the vocabulary. Words in the 
vocabulary are sorted based on their frequency in the training dataset. The 
sorted list is cut into 100 sublists such that each sublist accounts for 1/100 
of the total word frequency. Each sublist is called a class. Hence each word 
has a <tt>class_index</tt> (in [0,100)). The <tt>class_start</tt> is the index 
of the first word in the same class as <tt>word</tt>. The <tt>class_end</tt> is 
the index of the first word in the next class.</p>
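+<p>For illustration only, a single record might look like the following in 
protocol buffer text format (the word and all numbers below are hypothetical, 
not taken from the actual shards):</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># hypothetical WordRecord, for illustration only
+word: &quot;the&quot;
+word_index: 0     # position in the frequency-sorted vocabulary
+class_index: 0    # the class this word belongs to
+class_start: 0    # index of the first word in this class
+class_end: 37     # index of the first word in the next class
+</pre></div></div></div>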
+<div class="section">
+<h4><a name="Create_DataShards"></a>Create DataShards</h4>
+<p>We use code from the RNNLM Toolkit to read words and sort them into 
classes. The main function in <i>create_shard.cc</i> first creates word classes 
based on the training dataset. It then calls the following function to create 
data shards for the training, validation and test datasets.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">p(wt|w0, w1, ... wt-1) = 
p(c|w0,w1,..wt-1) p(w|c)
+<div class="source"><pre class="prettyprint">int create_shard(const char 
*input_file, const char *output_file);
 </pre></div></div>
-<p>where <tt>c</tt> is the word class, <tt>w0, w1...wt-1</tt> are the previous 
words before <tt>wt</tt>. The probabilities on the right side can be computed 
faster than</p>
-<p><a class="externalLink" 
href="https://github.com/kaiping/incubator-singa/blob/rnnlm/examples/rnnlm/Makefile";>Makefile</a>
 for creating the shards (see in  <a class="externalLink" 
href="https://github.com/kaiping/incubator-singa/blob/rnnlm/examples/rnnlm/create_shard.cc";>create_shard.cc</a>),
  we need to specify where to download the source data, number of classes we  
want to divide all occurring words into, and all the shards together with  
their names, directories we want to create. {% endcomment %}</p>
-<p><i>SINGA_ROOT/examples/rnn/create_shard.cc</i> defines the following 
function for creating data shards,</p>
+<p><tt>input_file</tt> is the path to the training/validation/test text file 
from the RNNLM Toolkit, and <tt>output_file</tt> is the output shard folder. 
This function starts with</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void create_shard(const char 
*input, int nclass) {
+<div class="source"><pre class="prettyprint">DataShard dataShard(output, 
DataShard::kCreate);
 </pre></div></div>
-<p><tt>input</tt> is the path to [the text file], <tt>nclass</tt> is user 
specified cluster size. This function starts with</p>
+<p>Then it reads the words one by one. For each word it fills in the 
<tt>WordRecord</tt> and inserts the record into the <tt>dataShard</tt>.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">  using StrIntMap = 
std::map&lt;std::string, int&gt;;
-  StrIntMap *wordIdxMapPtr; //  Mapping word string to a word index
-  StrIntMap *wordClassIdxMapPtr;    // Mapping word string to a word class 
index
-  if (-1 == nclass) {
-      loadClusterForNonTrainMode(input, nclass, &amp;wordIdxMap, 
&amp;wordClassIdxMap); // non-training phase
-  } else {
-      doClusterForTrainMode(input, nclass, &amp;wordIdxMap, 
&amp;wordClassIdxMap); // training phase
-  }
-</pre></div></div>
-
-<ul>
-  
-<li>If <tt>-1 == nclass</tt>, <tt>path</tt> points to the training data file. 
<tt>doClusterForTrainMode</tt>  reads all the words in the file to create the 
two maps. [The two maps are stored in xxx]</li>
-  
-<li>otherwise, <tt>path</tt> points to either test or validation data file. 
<tt>loadClusterForNonTrainMode</tt>  loads the two maps from [xxx].</li>
-</ul>
-<p>Words from training/text/validation files are converted into 
<tt>Record</tt>s by</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">  singa::SingleWordRecord 
*wordRecord = record.MutableExtension(singa::singleword);
-  while (in &gt;&gt; word) {
-    wordRecord-&gt;set_word(word);
-    wordRecord-&gt;set_word_index(wordIdxMap[word]);
-    wordRecord-&gt;set_class_index(wordClassIdxMap[word]);
-    snprintf(key, kMaxKeyLength, &quot;%08d&quot;, wordIdxMap[word]);
-    wordShard.Insert(std::string(key), record);
-  }
+<div class="source"><pre class="prettyprint">int wcnt = 0; // word count
+singa.Record record;
+WordRecord* wordRecord = record.MutableExtension(word);
+while(1) {
+  readWord(wordstr, fin);
+  if (feof(fin)) break;
+  ...// fill in the wordRecord;
+  int length = snprintf(key, BUFFER_LEN, &quot;%05d&quot;, wcnt++);
+  dataShard.Insert(string(key, length), record);
 }
 </pre></div></div>
 <p>Compilation and running commands are provided in the 
<i>Makefile.example</i>. After executing</p>
@@ -598,306 +586,261 @@ message SingleWordRecord {
 <div class="source">
 <div class="source"><pre class="prettyprint">make create
 </pre></div></div>
-<p>, three data shards will created using the <tt>create_shard.cc</tt>, 
namely, <i>rnnlm_word_shard_train</i>, <i>rnnlm_word_shard_test</i> and 
<i>rnnlm_word_shard_valid</i>.</p></div></div>
+<p>, three data shards will be created, namely <i>train_shard</i>, 
<i>test_shard</i> and <i>valid_shard</i>.</p></div></div>
 <div class="section">
 <h3><a name="Layer_implementation"></a>Layer implementation</h3>
-<p>7 layers (i.e., Layer subclasses) are implemented for this application, 
including 1 <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/layer#data-layers";>data layer</a> 
which fetches data records from data shards, 2 <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/layer#parser-layers";>parser 
layers</a> which parses the input records, 3 neuron layers which transforms the 
word features and 1 loss layer which computes the objective loss.</p>
-<p>First, we illustrate the data shard and how to create it for this 
application. Then, we discuss the configuration and functionality of layers. 
Finally, we introduce how to configure a job and then run the training for your 
own model.</p>
-<p>Following the guide for implementing <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/layer#implementing-a-new-layer-subclass";>new
 Layer subclasses</a>, we extend the <a class="externalLink" 
href="http://singa.incubator.apache.org/api/classsinga_1_1LayerProto.html";>LayerProto</a>
 to include the configuration message of each user-defined layer as shown below 
(5 out of the 7 layers have specific configurations),</p>
+<p>6 user-defined layers are implemented for this application. Following the 
guide for implementing <a href="layer#implementing-a-new-layer-subclass">new 
Layer subclasses</a>, we extend the <a 
href="../api/classsinga_1_1LayerProto.html">LayerProto</a> to include the 
configuration messages of user-defined layers as shown below (3 of the 6 
layers have specific configurations),</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">package singa;
-
-import &quot;common.proto&quot;;  // Record message for SINGA is defined
-import &quot;job.proto&quot;;     // Layer message for SINGA is defined
+<div class="source"><pre class="prettyprint">import &quot;job.proto&quot;;     
// Layer message for SINGA is defined
 
 //For implementation of RNNLM application
-extend LayerProto {
-    optional RnnlmComputationProto rnnlmcomputation_conf = 201;
-    optional RnnlmSigmoidProto rnnlmsigmoid_conf = 202;
-    optional RnnlmInnerproductProto rnnlminnerproduct_conf = 203;
-    optional RnnlmWordinputProto rnnlmwordinput_conf = 204;
-    optional RnnlmDataProto rnnlmdata_conf = 207;
+extend singa.LayerProto {
+  optional EmbeddingProto embedding_conf = 101;
+  optional LossProto loss_conf = 102;
+  optional InputProto input_conf = 103;
 }
 </pre></div></div>
-<p>In the subsequent sections, we describe the implementation of each layer, 
including it configuration message.</p></div>
-<div class="section">
-<h3><a name="RnnlmDataLayer"></a>RnnlmDataLayer</h3>
-<p>It inherits <a href="/api/classsinga_1_1DataLayer.html">DataLayer</a> for 
loading word and class <tt>Record</tt>s from <tt>DataShard</tt>s into 
memory.</p>
+<p>In the subsequent sections, we describe the implementation of each layer, 
including its configuration message.</p>
 <div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
+<h4><a name="RNNLayer"></a>RNNLayer</h4>
+<p>This is the base layer of all the other layers in this application. It is 
defined as follows,</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void RnnlmDataLayer::Setup() {
-  read records from ClassShard to construct mapping from word string to class 
index
-  Resize length of records_ as window_size + 1
-  Read 1st word record to the last position
-}
-
-
-void RnnlmDataLayer::ComputeFeature() {
-    records_[0] = records_[windowsize_];    //Copy the last record to 1st 
position in the record vector
-  Assign values to records_;    //Read window_size new word records from 
WordShard
-}
+<div class="source"><pre class="prettyprint">class RNNLayer : virtual public 
Layer {
+public:
+  inline int window() { return window_; }
+protected:
+  int window_;
+};
 </pre></div></div>
-<p>The <tt>Steup</tt> function load the mapping (from word string to class 
index) from <i>ClassShard</i>.</p>
-<p>Every time the <tt>ComputeFeature</tt> function is called, it loads 
<tt>windowsize_</tt> records from <tt>WordShard</tt>.</p>
-<p>[For the consistency of operations at each training iteration, it maintains 
a record vector (length of window_size + 1). It reads the 1st record from the 
WordShard and puts it in the last position of record vector].</p></div>
+<p>For this application, two iterations may process different numbers of words 
because sentences have different lengths. The <tt>DataLayer</tt> decides the 
effective window size. Every other layer queries its source layer for the 
effective window size and resets <tt>window_</tt> in its 
<tt>ComputeFeature</tt> function, as sketched below.</p>
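+<p>A minimal sketch of that pattern (the layer name and the cast are 
assumptions made here for illustration; the actual layers below simply call 
<tt>window()</tt> on their source layer):</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">// sketch: sync the effective window at the start of ComputeFeature
+void SomeLayer::ComputeFeature(int flag, Metric* perf) {
+  // srclayers_[0] is assumed to be (or derive from) an RNNLayer
+  window_ = dynamic_cast&lt;RNNLayer*&gt;(srclayers_[0])-&gt;window();
+  ... // process positions 0 .. window_ - 1
+}
+</pre></div></div></div>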
 <div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
+<h4><a name="DataLayer"></a>DataLayer</h4>
+<p>DataLayer is for loading Records.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">message RnnlmDataProto {
-    required string class_path = 1;   // path to the class data file/folder, 
absolute or relative to the workspace
-    required string word_path = 2;    // path to the word data file/folder, 
absolute or relative to the workspace
-    required int32 window_size = 3;   // window size.
-}
+<div class="source"><pre class="prettyprint">class DataLayer : public 
RNNLayer, singa::DataLayer {
+ public:
+  void Setup(const LayerProto&amp; proto, int npartitions) override;
+  void ComputeFeature(int flag, Metric *perf) override;
+  int max_window() const {
+    return max_window_;
+  }
+ private:
+  int max_window_;
+  singa::DataShard* shard_;
+};
 </pre></div></div>
-<p>[class_path to file or folder?]</p>
-<p>[There two paths, <tt>class_path</tt> for &#x2026;; <tt>word_path</tt> 
for.. The <tt>window_size</tt> is set to &#x2026;]</p></div></div>
-<div class="section">
-<h3><a name="RnnlmWordParserLayer"></a>RnnlmWordParserLayer</h3>
-<p>This layer gets <tt>window_size</tt> word strings from the 
<tt>RnnlmDataLayer</tt> and looks up the word string to word index map to get 
word indexes.</p>
-<div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
+<p>The <tt>Setup</tt> function gets the user-configured max window size. Since 
this application predicts the next word for each input word, the record vector 
is resized to hold max_window+1 records, where the k-th record is loaded as the 
ground truth label for the (k-1)-th record.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">max_window_ = 
proto.GetExtension(input_conf).max_window();
+records_.resize(max_window_ + 1);
+</pre></div></div>
+<p>The <tt>ComputeFeature</tt> function loads at most max_window records. It 
stops early if a sentence ending character is encountered.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmWordparserLayer::Setup(){
-    Obtain window size from src layer;
-    Obtain vocabulary size from src layer;
-    Reshape data_ as {window_size};
+<div class="source"><pre class="prettyprint">records_[0] = records_[window_]; 
// shift the last record to the first
+window_ = max_window_;
+for (int i = 1; i &lt;= max_window_; i++) {
+  // load record; break if it is the ending character
 }
+</pre></div></div>
+<p>The configuration of <tt>DataLayer</tt> is like</p>
 
-void RnnlmWordparserLayer::ParseRecords(Blob* blob){
-  for each word record in the window, get its word index and insert the index 
into blob
+<div class="source">
+<div class="source"><pre class="prettyprint">name: &quot;data&quot;
+user_type: &quot;kData&quot;
+[input_conf] {
+  path: &quot;examples/rnnlm/train_shard&quot;
+  max_window: 10
 }
 </pre></div></div></div>
 <div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>This layer does not have specific configuration fields.</p></div></div>
-<div class="section">
-<h3><a name="RnnlmClassParserLayer"></a>RnnlmClassParserLayer</h3>
-<p>It maps each word in the processing window into a class index.</p>
-<div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
+<h4><a name="EmbeddingLayer"></a>EmbeddingLayer</h4>
+<p>This layer gets records from <tt>DataLayer</tt>. For each record, the word 
index is parsed and used to get the corresponding word feature vector from the 
embedding matrix.</p>
+<p>The class is declared as follows,</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmClassparserLayer::Setup(){
-  Obtain window size from src layer;
-  Obtain vocaubulary size from src layer;
-  Obtain class size from src layer;
-  Reshape data_ as {windowsize_, 4};
-}
-
-void RnnlmClassparserLayer::ParseRecords(){
-  for(int i = 1; i &lt; records.size(); i++){
-      Copy starting word index in this class to data[i]'s 1st position;
-      Copy ending word index in this class to data[i]'s 2nd position;
-      Copy index of input word to data[i]'s 3rd position;
-      Copy class index of input word to data[i]'s 4th position;
+<div class="source"><pre class="prettyprint">class EmbeddingLayer : public 
RNNLayer {
+  ...
+  const std::vector&lt;Param*&gt; GetParams() const override {
+    std::vector&lt;Param*&gt; params{embed_};
+    return params;
   }
+ private:
+  int word_dim_, vocab_size_;
+  Param* embed_;
 }
 </pre></div></div>
-<p>The setup function read</p></div>
-<div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>This layer fetches the class information (the mapping information between 
classes and words) from RnnlmDataLayer and maintains this information as data 
in this layer.</p>
-<p>Next, this layer parses the last &#x201c;window_size&#x201d; number of word 
records from RnnlmDataLayer and stores them as data. Then, it retrieves the 
corresponding class for each input word. It stores the starting word index of 
this class, ending word index of this class, word index and class index 
respectively.</p></div></div>
-<div class="section">
-<h3><a name="RnnlmWordInputLayer"></a>RnnlmWordInputLayer</h3>
-<p>Using the input word records, this layer obtains corresponding word vectors 
as its data. Then, it passes the data to RnnlmInnerProductLayer above for 
further processing.</p>
-<div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>In this layer, the length of each word vector needs to be configured. 
Besides, whether to use bias term during the training process should also be 
configured (See more in <a class="externalLink" 
href="https://github.com/kaiping/incubator-singa/blob/rnnlm/src/proto/job.proto";>job.proto</a>).</p>
+<p>The <tt>embed_</tt> field is a matrix whose values are parameters to be 
learned. The matrix size is <tt>vocab_size_</tt> x <tt>word_dim_</tt>.</p>
+<p>The <tt>Setup</tt> function reads the configuration for <tt>word_dim_</tt> 
and <tt>vocab_size_</tt>. Then it allocates the feature Blob for 
<tt>max_window</tt> words and sets up <tt>embed_</tt>.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">message RnnlmWordinputProto {
-    required int32 word_length = 1;  // vector length for each input word
-    optional bool bias_term = 30 [default = true];  // use bias vector or not
-}
-</pre></div></div></div>
-<div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
-<p>In setup phase, this layer first reshapes its members such as 
&#x201c;data&#x201d;, &#x201c;grad&#x201d;, and &#x201c;weight&#x201d; matrix. 
Then, it obtains the vocabulary size from its source layer (i.e., 
RnnlmWordParserLayer).</p>
-<p>In the forward phase, using the &#x201c;window_size&#x201d; number of input 
word indices, the &#x201c;window_size&#x201d; number of word vectors are 
selected from this layer&#x2019;s weight matrix, each word index corresponding 
to one row.</p>
+<div class="source"><pre class="prettyprint">int max_window = 
srclayers_[0]-&gt;data(this).shape()[0];
+word_dim_ = proto.GetExtension(embedding_conf).word_dim();
+data_.Reshape(vector&lt;int&gt;{max_window, word_dim_});
+...
+embed_-&gt;Setup(vector&lt;int&gt;{vocab_size_, word_dim_});
+</pre></div></div>
+<p>The <tt>ComputeFeature</tt> function simply copies the feature vector from 
the <tt>embed_</tt> matrix into the feature Blob.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmWordinputLayer::ComputeFeature() {
-    for(int t = 0; t &lt; windowsize_; t++){
-        data[t] = weight[src[t]];
-    }
+<div class="source"><pre class="prettyprint"># reset effective window size
+window_ = datalayer-&gt;window();
+auto records = datalayer-&gt;records();
+...
+for (int t = 0; t &lt; window_; t++) {
+  int idx = static_cast&lt;int&gt;(records[t].GetExtension(word).word_index());
+  Copy(words[t], embed[idx]);
 }
 </pre></div></div>
-<p>In the backward phase, after computing this layer&#x2019;s gradient in its 
destination layer (i.e., RnnlmInnerProductLayer), here the gradient of the 
weight matrix in this layer is copied (by row corresponding to word indices) 
from this layer&#x2019;s gradient.</p>
+<p>The <tt>ComputeGradient</tt> function copies the gradients back to the 
<tt>embed_</tt> matrix, roughly reversing the copy loop above.</p>
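+<p>A minimal sketch of that backward copy, assuming the same <tt>records</tt> 
and <tt>window_</tt> variables as in the forward pass (<tt>gembed</tt> and 
<tt>grad</tt> are illustrative names, and a real implementation would 
accumulate if a word appears more than once in the window):</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">// sketch: route each word's gradient back to its embedding row
+for (int t = 0; t &lt; window_; t++) {
+  int idx = static_cast&lt;int&gt;(records[t].GetExtension(word).word_index());
+  gembed[idx] += grad[t];  // add gradient of position t to row idx
+}
+</pre></div></div>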
+<p>The configuration for <tt>EmbeddingLayer</tt> is like,</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmWordinputLayer::ComputeGradient() {
-    for(int t = 0; t &lt; windowsize_; t++){
-        gweight[src[t]] = grad[t];
-    }
+<div class="source"><pre class="prettyprint">user_type: &quot;kEmbedding&quot;
+[embedding_conf] {
+  word_dim: 15
+  vocab_size: 3720
 }
-</pre></div></div></div></div>
-<div class="section">
-<h3><a name="RnnlmInnerProductLayer"></a>RnnlmInnerProductLayer</h3>
-<p>This is a neuron layer which receives the data from RnnlmWordInputLayer and 
sends the computation results to RnnlmSigmoidLayer.</p>
-<div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>In this layer, the number of neurons needs to be specified. Besides, 
whether to use a bias term should also be configured.</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">message RnnlmInnerproductProto {
-    required int32 num_output = 1;  //Number of outputs for the layer
-    optional bool bias_term = 30 [default = true];  //Use bias vector or not
+srclayers: &quot;data&quot;
+param {
+  name: &quot;w1&quot;
+  init {
+    type: kUniform
+    low:-0.3
+    high:0.3
+  }
 }
 </pre></div></div></div>
 <div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
-<p>In the forward phase, this layer is in charge of executing the dot 
multiplication between its weight matrix and the data in its source layer 
(i.e., RnnlmWordInputLayer).</p>
+<h4><a name="LabelLayer"></a>LabelLayer</h4>
+<p>Since the label of records[i] is records[i+1], this layer fetches the 
effective window of records starting from position 1. It converts each record 
into a tuple (class_start, class_end, word_index, class_index).</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmInnerproductLayer::ComputeFeature() {
-    data = dot(src, weight);    //Dot multiplication operation
+<div class="source"><pre class="prettyprint">for (int i = 0; i &lt; window_; 
i++) {
+  WordRecord wordrecord = records[i + 1].GetExtension(word);
+  label[4 * i + 0] = wordrecord.class_start();
+  label[4 * i + 1] = wordrecord.class_end();
+  label[4 * i + 2] = wordrecord.word_index();
+  label[4 * i + 3] = wordrecord.class_index();
 }
 </pre></div></div>
-<p>In the backward phase, this layer needs to first compute the gradient of 
its source layer (i.e., RnnlmWordInputLayer). Then, it needs to compute the 
gradient of its weight matrix by aggregating computation results for each 
timestamp. The details can be seen as follows.</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmInnerproductLayer::ComputeGradient() {
-    for (int t = 0; t &lt; windowsize_; t++) {
-        Add the dot product of src[t] and grad[t] to gweight;
-    }
-    Copy the dot product of grad and weight to gsrc;
-}
-</pre></div></div></div></div>
-<div class="section">
-<h3><a name="RnnlmSigmoidLayer"></a>RnnlmSigmoidLayer</h3>
-<p>This is a neuron layer for computation. During the computation in this 
layer, each component of the member data specific to one timestamp uses its 
previous timestamp&#x2019;s data component as part of the input. This is how 
the time-order information is utilized in this language model application.</p>
-<p>Besides, if you want to implement a recurrent neural network following our 
design, this layer is of vital importance for you to refer to. Also, you can 
always think of other design methods to make use of information from past 
timestamps.</p>
+<p>There is no special configuration for this layer.</p></div>
 <div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>In this layer, whether to use a bias term needs to be specified.</p>
+<h4><a name="HiddenLayer"></a>HiddenLayer</h4>
+<p>This layer unrolls the recurrent connections for at most max_window steps. 
The feature at position k is computed based on the feature from the embedding 
layer (position k) and the feature at position k-1 of this layer. The formula 
is</p>
+<p><tt>$$f[k]=\sigma (f[k-1]*W+src[k])$$</tt></p>
+<p>where <tt>$W$</tt> is a matrix with <tt>word_dim_</tt> x <tt>word_dim_</tt> 
parameters.</p>
+<p>If you want to implement a recurrent neural network following our design, 
this layer is the most important one to refer to.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">message RnnlmSigmoidProto {
-    optional bool bias_term = 1 [default = true];  // use bias vector or not
-}
-</pre></div></div></div>
-<div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
-<p>In the forward phase, this layer first receives data from its source layer 
(i.e., RnnlmInnerProductLayer) which is used as one part input for computation. 
Then, for each timestampe this layer executes a dot multiplication between its 
previous timestamp information and its own weight matrix. The results are the 
other part for computation. This layer sums these two parts together and 
executes an activation operation. The detailed descriptions for this process 
are illustrated as follows.</p>
+<div class="source"><pre class="prettyprint">class HiddenLayer : public 
RNNLayer {
+  ...
+  const std::vector&lt;Param*&gt; GetParams() const override {
+    std::vector&lt;Param*&gt; params{weight_};
+    return params;
+  }
+private:
+  Param* weight_;
+};
+</pre></div></div>
+<p>The <tt>Setup</tt> function sets up the weight matrix as</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmSigmoidLayer::ComputeFeature() {
-    for(int t = 0; t &lt; window_size; t++){
-        if(t == 0) Copy the sigmoid results of src[t] to data[t];
-        else Compute the dot product of data[t - 1] and weight, and add 
sigmoid results of src[t] to be data[t];
-   }
-}
+<div class="source"><pre 
class="prettyprint">weight_-&gt;Setup(std::vector&lt;int&gt;{word_dim, 
word_dim});
 </pre></div></div>
-<p>In the backward phase, this RnnlmSigmoidLayer first updates this 
layer&#x2019;s member grad utilizing the information from current 
timestamp&#x2019;s next timestamp. Then respectively, this layer computes the 
gradient for its weight matrix and its source layer RnnlmInnerProductLayer by 
iterating different timestamps. The process can be seen below.</p>
+<p>The <tt>ComputeFeature</tt> function gets the effective window size 
(<tt>window_</tt>) from its source layer, i.e., the embedding layer. Then it 
propagates the features from position 0 to position <tt>window_</tt>-1, as 
illustrated below.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmSigmoidLayer::ComputeGradient(){
-    Update grad[t]; // Update the gradient for the current layer, add a new 
term from next timestamp
-    for (int t = 0; t &lt; windowsize_; t++) {
-            Update gweight; // Compute the gradient for the weight matrix
-            Compute gsrc[t];    // Compute the gradient for src layer
-    }
+<div class="source"><pre class="prettyprint">void 
HiddenLayer::ComputeFeature() {
+  for(int t = 0; t &lt; window_size; t++){
+    if(t == 0)
+      Copy(data[t], src[t]);
+    else
+      data[t]=sigmoid(data[t-1]*W + src[t]);
+  }
 }
-</pre></div></div></div></div>
-<div class="section">
-<h3><a name="RnnlmComputationLayer"></a>RnnlmComputationLayer</h3>
-<p>This layer is a loss layer in which the performance metrics, both the 
probability of predicting the next word correctly, and perplexity (PPL in 
short) are computed. To be specific, this layer is composed of the class 
information part and the word information part. Therefore, the computation can 
be essentially divided into two parts by slicing this layer&#x2019;s weight 
matrix.</p>
-<div class="section">
-<h4><a name="Configuration"></a>Configuration</h4>
-<p>In this layer, it is needed to specify whether to use a bias term during 
training.</p>
+</pre></div></div>
+<p>The <tt>ComputeGradient</tt> function computes the gradient of the loss 
w.r.t. W and the source layer. In particular, for each position k, since 
data[k] contributes to data[k+1] and to the feature at position k in its 
destination layer (the loss layer), grad[k] contains the gradient from two 
parts. The destination layer has already accumulated its part into grad[k]; in 
the <tt>ComputeGradient</tt> function, we need to add the gradient from 
position k+1.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">message RnnlmComputationProto {
-    optional bool bias_term = 1 [default = true];  // use bias vector or not
+<div class="source"><pre class="prettyprint">void 
HiddenLayer::ComputeGradient(){
+  ...
+  for (int k = window_ - 1; k &gt;= 0; k--) {
+    if (k &lt; window_ - 1) {
+      grad[k] += dot(grad[k + 1], weight.T()); // add gradient from position 
t+1.
+    }
+    grad[k] =... // compute gL/gy[t], y[t]=data[t-1]*W+src[t]
+  }
+  gweight = dot(data.Slice(0, window_-1).T(), grad.Slice(1, window_));
+  Copy(gsrc, grad);
 }
-</pre></div></div></div>
+</pre></div></div>
+<p>After the loop, we get the gradient of the loss w.r.t. y[k], which is used 
to compute the gradients of W and of src[k].</p>
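+<p>For completeness, the elided assignment <tt>grad[k] = ...</tt> inside the 
loop is just the sigmoid derivative (standard calculus, stated here for 
reference rather than copied from the example code): since 
<tt>$data[k]=\sigma(y[k])$</tt>,</p>
+<p><tt>$$\frac{\partial L}{\partial y[k]} = \frac{\partial L}{\partial 
data[k]} \odot data[k] \odot (1-data[k])$$</tt></p>
+<p>where the right-hand side is evaluated elementwise.</p>
+</div>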
 <div class="section">
-<h4><a name="Functionality"></a>Functionality</h4>
-<p>In the forward phase, by using the two sliced weight matrices (one is for 
class information, another is for the words in this class), this 
RnnlmComputationLayer calculates the dot product between the source 
layer&#x2019;s input and the sliced matrices. The results can be denoted as 
&#x201c;y1&#x201d; and &#x201c;y2&#x201d;. Then after a softmax function, for 
each input word, the probability distribution of classes and the words in this 
classes are computed. The activated results can be denoted as p1 and p2. Next, 
using the probability distribution, the PPL value is computed.</p>
+<h4><a name="LossLayer"></a>LossLayer</h4>
+<p>This layer computes the cross-entropy loss and 
<tt>$\log_{10}P(w_{t+1}|w_t)$</tt> (which can be averaged over all words by 
users to get the PPL value).</p>
+<p>There are two configuration fields to be specified by users.</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmComputationLayer::ComputeFeature() {
-    Compute y1 and y2;
-    p1 = Softmax(y1);
-    p2 = Softmax(y2);
-    Compute perplexity value PPL;
+<div class="source"><pre class="prettyprint">message LossProto {
+  optional int32 nclass = 1;
+  optional int32 vocab_size = 2;
 }
 </pre></div></div>
-<p>In the backward phase, this layer executes the following three computation 
operations. First, it computes the member gradient of the current layer by each 
timestamp. Second, this layer computes the gradient of its own weight matrix by 
aggregating calculated results from all timestamps. Third, it computes the 
gradient of its source layer, RnnlmSigmoidLayer, timestamp-wise.</p>
+<p>There are two weight matrices to be learned</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">void 
RnnlmComputationLayer::ComputeGradient(){
-    Compute grad[t] for all timestamps;
-    Compute gweight by aggregating results computed in different timestamps;
-    Compute gsrc[t] for all timestamps;
+<div class="source"><pre class="prettyprint">class LossLayer : public RNNLayer 
{
+  ...
+ private:
+  Param* word_weight_, *class_weight_;
 }
-</pre></div></div></div></div></div>
+</pre></div></div>
+<p>The <tt>ComputeFeature</tt> function computes the two probabilities as</p>
+<p><tt>$$P(C_{w_{t+1}}|w_t) = Softmax(w_t * class\_weight)$$</tt> 
<tt>$$P(w_{t+1}|C_{w_{t+1}}) = Softmax(w_t * 
word\_weight[class\_start:class\_end])$$</tt></p>
+<p><tt>$w_t$</tt> is the feature from the hidden layer for the t-th word, and 
its ground truth next word is <tt>$w_{t+1}$</tt>. The first equation computes 
the probability distribution over all classes for the next word. The second 
equation computes the probability distribution over the words in the ground 
truth class of the next word.</p>
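+<p>Putting this together with the factorization given earlier, the per-word 
cross-entropy loss decomposes into a class term and a word term (this simply 
restates the earlier equations):</p>
+<p><tt>$$L(w_t) = -\log P(C_{w_{t+1}}|w_t) - \log 
P(w_{t+1}|C_{w_{t+1}})$$</tt></p>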
+<p>The ComputeGradient function computes the gradient of the source layer 
(i.e., the hidden layer) and the two weight matrices.</p></div></div>
 <div class="section">
-<h2><a name="Updater_Configuration"></a>Updater Configuration</h2>
-<p>We employ kFixedStep type of the learning rate change method and the 
configuration is as follows. We use different learning rate values in different 
step ranges. <a class="externalLink" 
href="http://wangwei-pc.d1.comp.nus.edu.sg:4000/docs/updater/";>Here</a> is more 
information about choosing updaters.</p>
+<h3><a name="Updater_Configuration"></a>Updater Configuration</h3>
+<p>We employ the kFixedStep type of learning rate change method; the 
configuration is as follows. We decay the learning rate once the performance 
stops improving on the validation dataset.</p>
 
 <div class="source">
 <div class="source"><pre class="prettyprint">updater{
-    #weight_decay:0.0000001
-    lr_change: kFixedStep
-    type: kSGD
+  type: kSGD
+  learning_rate {
+    type: kFixedStep
     fixedstep_conf:{
       step:0
-      step:42810
-      step:49945
-      step:57080
-      step:64215
+      step:48810
+      step:56945
+      step:65080
+      step:73215
       step_lr:0.1
       step_lr:0.05
       step_lr:0.025
       step_lr:0.0125
       step_lr:0.00625
     }
+  }
 }
 </pre></div></div></div>
 <div class="section">
-<h2><a name="TrainOneBatch_Function"></a>TrainOneBatch() Function</h2>
+<h3><a name="TrainOneBatch_Function"></a>TrainOneBatch() Function</h3>
 <p>We use BP (BackPropagation) algorithm to train the RNN model here. The 
corresponding configuration can be seen below.</p>
 
 <div class="source">
 <div class="source"><pre class="prettyprint"># In job.conf file
-alg: kBackPropagation
-</pre></div></div>
-<p>Refer to <a class="externalLink" 
href="http://wangwei-pc.d1.comp.nus.edu.sg:4000/docs/train-one-batch/";>here</a> 
for more information on different TrainOneBatch() functions.</p></div>
-<div class="section">
-<h2><a name="Cluster_Configuration"></a>Cluster Configuration</h2>
-<p>In this RNN language model, we configure the cluster topology as 
follows.</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">cluster {
-  nworker_groups: 1
-  nserver_groups: 1
-  nservers_per_group: 1
-  nworkers_per_group: 1
-  nservers_per_procs: 1
-  nworkers_per_procs: 1
-  workspace: &quot;examples/rnnlm/&quot;
+train_one_batch {
+  alg: kBackPropagation
 }
-</pre></div></div>
-<p>This is to train the model in one node. For other configuration choices, 
please refer to <a class="externalLink" 
href="http://wangwei-pc.d1.comp.nus.edu.sg:4000/docs/frameworks/";>here</a>.</p></div>
-<div class="section">
-<h2><a name="Configure_Job"></a>Configure Job</h2>
-<p>Job configuration is written in &#x201c;job.conf&#x201d;.</p>
-<p>Note: Extended field names should be embraced with square-parenthesis [], 
e.g., [singa.rnnlmdata_conf].</p></div>
-<div class="section">
-<h2><a name="Run_Training"></a>Run Training</h2>
-<p>Start training by the following commands</p>
-
-<div class="source">
-<div class="source"><pre class="prettyprint">cd SINGA_ROOT
-./bin/singa-run.sh -workspace=examples/rnnlm
 </pre></div></div></div>
+<div class="section">
+<h3><a name="Cluster_Configuration"></a>Cluster Configuration</h3>
+<p>The default cluster configuration can be used, i.e., a single worker and a 
single server in a single process. An equivalent explicit configuration is 
shown below for reference.</p>
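+<p>This <tt>cluster</tt> block (taken from an earlier version of this example; 
note that it also sets the workspace to the example folder) spells out that 
single-node setup,</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">cluster {
+  nworker_groups: 1
+  nserver_groups: 1
+  nservers_per_group: 1
+  nworkers_per_group: 1
+  nservers_per_procs: 1
+  nworkers_per_procs: 1
+  workspace: &quot;examples/rnnlm/&quot;
+}
+</pre></div></div></div></div>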
                   </div>
             </div>
           </div>

Modified: websites/staging/singa/trunk/content/docs/train-one-batch.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/train-one-batch.html (original)
+++ websites/staging/singa/trunk/content/docs/train-one-batch.html Fri Sep 18 
15:11:53 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA &#x2013; Train-One-Batch</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="../js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           

Modified: websites/staging/singa/trunk/content/docs/updater.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/updater.html (original)
+++ websites/staging/singa/trunk/content/docs/updater.html Fri Sep 18 15:11:53 
2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA &#x2013; Updater</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="../js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           
@@ -483,6 +489,7 @@
         <div id="bodyColumn"  class="span10" >
                                   
             <h1>Updater</h1>
+<hr />
 <p>Every server in SINGA has an <a 
href="api/classsinga_1_1Updater.html">Updater</a> instance that updates 
parameters based on gradients. In this page, the <i>Basic user guide</i> 
describes the configuration of an updater. The <i>Advanced user guide</i> 
present details on how to implement a new updater and a new learning rate 
changing method.</p>
 <div class="section">
 <h2><a name="Basic_user_guide"></a>Basic user guide</h2>
@@ -511,7 +518,7 @@
   momentum: float
   weight_decay: float
   learning_rate {
-
+    ...
   }
 }
 </pre></div></div></div>
@@ -662,7 +669,7 @@
 <div class="section">
 <h2><a name="Advanced_user_guide"></a>Advanced user guide</h2>
 <div class="section">
-<h3><a name="Implementing_a_new_Update_subclass"></a>Implementing a new Update 
subclass</h3>
+<h3><a name="Implementing_a_new_Updater_subclass"></a>Implementing a new 
Updater subclass</h3>
 <p>The base Updater class has one virtual function,</p>
 
 <div class="source">
@@ -752,7 +759,7 @@ extend LRGenProto {
 <p>Users have to register this subclass in the main function,</p>
 
 <div class="source">
-<div class="source"><pre class="prettyprint">  
driver.RegisterLRGenerator&lt;FooLRGen&gt;(&quot;FooLR&quot;)
+<div class="source"><pre class="prettyprint">  
driver.RegisterLRGenerator&lt;FooLRGen, std::string&gt;(&quot;FooLR&quot;)
 </pre></div></div></div></div>
                   </div>
             </div>

Modified: websites/staging/singa/trunk/content/index.html
==============================================================================
--- websites/staging/singa/trunk/content/index.html (original)
+++ websites/staging/singa/trunk/content/index.html Fri Sep 18 15:11:53 2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA &#x2013; Getting Started</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="./js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           
@@ -489,24 +495,24 @@
 <ul>
   
 <li>
-<p>The <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/overview.html";>Introduction</a> 
page gives an overview of SINGA.</p></li>
+<p>The <a href="docs/overview.html">Introduction</a> page gives an overview of 
SINGA.</p></li>
   
 <li>
-<p>The <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/installation.html";>Installation</a>
 guide describes details on downloading and installing SINGA.</p></li>
+<p>The <a href="docs/installation.html">Installation</a> guide describes 
details on downloading and installing SINGA.</p></li>
   
 <li>
-<p>Please follow the <a class="externalLink" 
href="http://singa.incubator.apache.org/docs/quick-start.html";>Quick Start</a> 
guide to run simple applications on SINGA.</p></li>
+<p>Please follow the <a href="docs/quick-start.html">Quick Start</a> guide to 
run simple applications on SINGA.</p></li>
 </ul></div>
 <div class="section">
 <h3><a name="Documentation"></a>Documentation</h3>
 
 <ul>
   
-<li>Documentations are listed <a class="externalLink" 
href="http://singa.incubator.apache.org/docs.html";>here</a>.</li>
+<li>Documentation is listed <a href="docs.html">here</a>.</li>
   
-<li>Code API can be found <a class="externalLink" 
href="http://singa.incubator.apache.org/api/index.html";>here</a>.</li>
+<li>Code API can be found <a href="api/index.html">here</a>.</li>
   
-<li>Research publication list is available <a class="externalLink" 
href="http://singa.incubator.apache.org/research/publication";>here</a>.</li>
+<li>Research publication list is available <a class="externalLink" 
href="http://www.comp.nus.edu.sg/~dbsystem/singa//research/publication/";>here</a>.</li>
 </ul></div>
 <div class="section">
 <h3><a name="How_to_contribute"></a>How to contribute</h3>
@@ -517,7 +523,7 @@
   
 <li>If you find any issues using SINGA, please report it to the <a 
class="externalLink" href="https://issues.apache.org/jira/browse/singa";>Issue 
Tracker</a>.</li>
   
-<li>You can also contact with <a class="externalLink" 
href="http://singa.incubator.apache.org/dev/community";>SINGA committers</a> 
directly.</li>
+<li>You can also contact <a href="dev/community">SINGA committers</a> 
directly.</li>
 </ul>
 <p>More details on contributing to SINGA is described <a 
href="dev/contribute">here</a>.</p></div>
 <div class="section">
@@ -525,8 +531,9 @@
 
 <ul>
   
-<li>
-<p>SINGA will be presented at <a class="externalLink" 
href="http://boss.dima.tu-berlin.de/";>BOSS</a> of <a class="externalLink" 
href="http://www.vldb.org/2015/";>VLDB 2015</a> at Hawai&#x2019;i, 4 Sep, 2015. 
(slides: <a href="files/singa-vldb-boss.pptx">overview</a>, <a 
href="files/basic-user-guide.pptx">basic</a>, <a 
href="files/advanced-user-guide.pptx">advanced</a>)</p></li>
+<li>SINGA was presented in a <a class="externalLink" 
href="http://www.comp.nus.edu.sg/~dbsystem/singa/workshop";>workshop on deep 
learning</a> held on 16 Sep, 2015</li>
+  
+<li>SINGA will be presented at <a class="externalLink" 
href="http://boss.dima.tu-berlin.de/";>BOSS</a> of <a class="externalLink" 
href="http://www.vldb.org/2015/";>VLDB 2015</a> at Hawai&#x2019;i, 4 Sep, 2015. 
(slides: <a href="files/singa-vldb-boss.pptx">overview</a>, <a 
href="files/basic-user-guide.pptx">basic</a>, <a 
href="files/advanced-user-guide.pptx">advanced</a>)</li>
   
 <li>
 <p>We will present SINGA at <a class="externalLink" 
href="http://adsc.illinois.edu/contact-us";>ADSC/I2R Deep Learning Workshop</a>, 
25 Aug, 2015.</p></li>
@@ -550,10 +557,10 @@
 <ul>
   
 <li>
-<p>B. C. Ooi, K.-L. Tan, S. Wang, W. Wang, Q. Cai, G. Chen, J. Gao, Z. Luo, A. 
K. H. Tung, Y. Wang, Z. Xie, M. Zhang, and K. Zheng. <a class="externalLink" 
href="http://www.comp.nus.edu.sg/~ooibc/singaopen-mm15.pdf";>SINGA: A 
distributed deep learning platform</a>. ACM Multimedia  (Open Source Software 
Competition) 2015 (<a class="externalLink" 
href="http://singa.incubator.apache.org/assets/file/bib-oss.txt";>BibTex</a>).</p></li>
+<p>B. C. Ooi, K.-L. Tan, S. Wang, W. Wang, Q. Cai, G. Chen, J. Gao, Z. Luo, A. 
K. H. Tung, Y. Wang, Z. Xie, M. Zhang, and K. Zheng. <a class="externalLink" 
href="http://www.comp.nus.edu.sg/~ooibc/singaopen-mm15.pdf";>SINGA: A 
distributed deep learning platform</a>. ACM Multimedia  (Open Source Software 
Competition) 2015 (<a class="externalLink" 
href="http://www.comp.nus.edu.sg/~dbsystem/singa//assets/file/bib-oss.txt";>BibTex</a>).</p></li>
   
 <li>
-<p>W. Wang, G. Chen, T. T. A. Dinh, B. C. Ooi, K.-L.Tan, J. Gao, and S. Wang. 
<a class="externalLink" 
href="http://www.comp.nus.edu.sg/~ooibc/singa-mm15.pdf";>SINGA:putting deep 
learning in the hands of multimedia users</a>. ACM Multimedia 2015 (<a 
class="externalLink" 
href="http://singa.incubator.apache.org/assets/file/bib-singa.txt";>BibTex</a>).</p></li>
+<p>W. Wang, G. Chen, T. T. A. Dinh, B. C. Ooi, K.-L.Tan, J. Gao, and S. Wang. 
<a class="externalLink" 
href="http://www.comp.nus.edu.sg/~ooibc/singa-mm15.pdf";>SINGA:putting deep 
learning in the hands of multimedia users</a>. ACM Multimedia 2015 (<a 
class="externalLink" 
href="http://www.comp.nus.edu.sg/~dbsystem/singa//assets/file/bib-singa.txt";>BibTex</a>).</p></li>
 </ul></div>
 <div class="section">
 <h3><a name="License"></a>License</h3>

Modified: websites/staging/singa/trunk/content/introduction.html
==============================================================================
--- websites/staging/singa/trunk/content/introduction.html (original)
+++ websites/staging/singa/trunk/content/introduction.html Fri Sep 18 15:11:53 
2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA &#x2013; Introduction</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="./js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           

Modified: websites/staging/singa/trunk/content/quick-start.html
==============================================================================
--- websites/staging/singa/trunk/content/quick-start.html (original)
+++ websites/staging/singa/trunk/content/quick-start.html Fri Sep 18 15:11:53 
2015
@@ -1,13 +1,13 @@
 <!DOCTYPE html>
 <!--
- | Generated by Apache Maven Doxia at 2015-09-14 
+ | Generated by Apache Maven Doxia at 2015-09-18 
  | Rendered using Apache Maven Fluido Skin 1.4
 -->
 <html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
   <head>
     <meta charset="UTF-8" />
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-    <meta name="Date-Revision-yyyymmdd" content="20150914" />
+    <meta name="Date-Revision-yyyymmdd" content="20150918" />
     <meta http-equiv="Content-Language" content="en" />
     <title>Apache SINGA &#x2013; Quick Start</title>
     <link rel="stylesheet" href="./css/apache-maven-fluido-1.4.min.css" />
@@ -20,7 +20,13 @@
   
     <script type="text/javascript" 
src="./js/apache-maven-fluido-1.4.min.js"></script>
 
-    
+                          
+        
+<script 
src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
 type="text/javascript"></script>
+                      
+        
+<script type="text/x-mathjax-config">MathJax.Hub.Config({tex2jax: {inlineMath: 
[['$','$'], ['\\(','\\)']]}});</script>
+          
                   </head>
         <body class="topBarEnabled">
           

