Author: wangwei
Date: Mon Jun 15 08:26:43 2015
New Revision: 1685508
URL: http://svn.apache.org/r1685508
Log:
add instructions for running the shared-memory Hogwild training framework, i.e.,
training with multiple workers in a single node.
Modified:
incubator/singa/site/trunk/content/markdown/quick-start.md
Modified: incubator/singa/site/trunk/content/markdown/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/quick-start.md?rev=1685508&r1=1685507&r2=1685508&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/quick-start.md (original)
+++ incubator/singa/site/trunk/content/markdown/quick-start.md Mon Jun 15 08:26:43 2015
@@ -92,18 +92,24 @@ Start the training by running:
##### Training with data Partitioning
-There are two cases for data partition:
+ nworker_groups: 2
+ nserver_groups: 1
+ nservers_per_group: 1
+ nworkers_per_group: 1
+ nworkers_per_procs: 2
+ workspace: "examples/cifar10/"
-* partition the dataset among worker groups such that one worker group is
- assigned one partition. Groups run asynchronously.
+The above cluster configuration file specifies two worker groups and one server group.
+Worker groups run asynchronously but share the memory space for parameter values; in other
+words, they run the Hogwild algorithm. Since training runs in a single node, we can avoid
+partitioning the dataset explicitly. Specifically, a random start offset is assigned to each
+worker group so that the groups do not work on the same mini-batch in every iteration.
+Consequently, they effectively run on different data partitions. The running command is the same:
-* partition the neural network among workers within one group. Each layer is
-sliced such that every worker is assigned one sliced layer. The sliced layer is
-the same as the original layer except that it only has B/g feature instances,
-where B is the size of instances in a mini-batch, g is the number of workers in
-a group. All workers run synchronously.
+    ./bin/singa-run.sh -model=examples/cifar10/model.conf -cluster=examples/cifar10/cluster.conf
-To run the second case with 2 workers, just change the cluster.conf as:
+
+##### Training with model Partitioning
nworker_groups: 1
nserver_groups: 1
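The Hogwild setup described above can be illustrated with a small sketch (this is not SINGA code; the linear model, learning rate, and helper names are made up for illustration). Worker groups become threads that update one shared parameter without locks, each starting from a random data offset:

```python
# Illustrative sketch (not SINGA code): Hogwild-style training in one
# process. Worker groups are threads sharing one parameter list and
# updating it without locks; each thread starts at a random data offset
# so the groups rarely process the same mini-batch in the same iteration.
import random
import threading

def run_hogwild(data, params, n_groups=2, iters=100, lr=0.002):
    def worker(offset):
        idx = offset
        for _ in range(iters):
            x, y = data[idx % len(data)]
            # gradient of the squared error for a 1-D linear model y ~ w*x
            grad = 2 * (params[0] * x - y) * x
            params[0] -= lr * grad      # lock-free update of shared state
            idx += 1
    threads = [threading.Thread(target=worker,
                                args=(random.randrange(len(data)),))
               for _ in range(n_groups)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return params

# fit y = 3x from noise-free samples; w converges toward 3.0
data = [(x, 3.0 * x) for x in range(1, 11)]
print(run_hogwild(data, [0.0]))
```

Despite the unsynchronized updates, the sparse-conflict argument behind Hogwild lets such training converge in practice.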
@@ -112,11 +118,18 @@ To run the second case with 2 workers, j
nworkers_per_procs: 2
workspace: "examples/cifar10/"
+The above cluster configuration specifies one worker group with two workers.
+The workers run synchronously, i.e., they are synchronized after every iteration.
+The model is partitioned among the two workers. Specifically, each layer is
+sliced such that every worker is assigned one sliced layer. The sliced layer is
+the same as the original layer except that it only has B/g feature instances,
+where B is the number of instances in a mini-batch and g is the number of
+workers in a group.
+
All other settings are the same as running without partitioning
     ./bin/singa-run.sh -model=examples/cifar10/model.conf -cluster=examples/cifar10/cluster.conf
-##### Training with model Partitioning
#### Training in a cluster
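The B/g slicing used in model partitioning can be sketched as follows (an illustrative helper, not SINGA code; the function name is made up):

```python
# Illustrative sketch (not SINGA code): slicing a mini-batch of B
# instances among the g workers of one group, so that each worker's
# slice of a layer holds B/g instances.
def slice_batch(batch, g):
    """Split `batch` (length B, assumed divisible by g) into g slices."""
    B = len(batch)
    assert B % g == 0, "B must be divisible by the number of workers g"
    size = B // g
    return [batch[i * size:(i + 1) * size] for i in range(g)]

batch = list(range(8))          # B = 8 instances
print(slice_batch(batch, 2))    # g = 2 workers -> two slices of 4
```

With `nworkers_per_group: 2` and a mini-batch of 8 instances, each worker would process a slice of 4 instances per iteration, and the workers synchronize before the next iteration.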