quick-start.md

wangwei Wed, 22 Jul 2015 22:56:49 -0700

Author: wangwei
Date: Thu Jul 23 05:53:12 2015
New Revision: 1692349

URL: http://svn.apache.org/r1692349
Log:
update quick start minor change


Modified:
    incubator/singa/site/trunk/content/markdown/quick-start.md

Modified: incubator/singa/site/trunk/content/markdown/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/quick-start.md?rev=1692349&r1=1692348&r2=1692349&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/quick-start.md (original)
+++ incubator/singa/site/trunk/content/markdown/quick-start.md Thu Jul 23 05:53:12 2015
@@ -71,7 +71,7 @@ Start the training by running:
     cd ../..
     ./bin/singa-run.sh -workspace=examples/cifar10
 
-Note: we have changed the command line arguments from `-cluster... -model=...`
+Note: we have changed the command line arguments from `-cluster.. -model..`
 to `-workspace`. The `workspace` folder must have a job.conf file which
 specifies the cluster (number of workers, number of servers, etc) and model
 configuration.
@@ -79,13 +79,13 @@ configuration.
 Some training information will be shown on the screen like:
 
     Starting zookeeper ... already running as process 21660.
-    Generate host list to /home/singa/wangwei/incubator-singa/examples/cifar10/job.hosts
-    Generate job id to /home/singa/wangwei/incubator-singa/examples/cifar10/job.id [job_id = 1]
-    Executing : ./singa -workspace=/home/singa/wangwei/incubator-singa/examples/cifar10 -job=1
+    Generate host list to SINGA_ROOT/examples/cifar10/job.hosts
+    Generate job id to SINGA_ROOT/examples/cifar10/job.id [job_id = 1]
+    Executing : ./singa -workspace=SINGA_ROOT/examples/cifar10 -job=1
     proc #0 -> 10.10.10.14:49152 (pid = 26724)
     Server (group = 0, id = 0) start
     Worker (group = 0, id = 0) start
-    Generate pid list to /home/singa/wangwei/incubator-singa/examples/cifar10/job.pids
+    Generate pid list to SINGA_ROOT/examples/cifar10/job.pids
     Test step-0, loss : 2.302607, accuracy : 0.090100
     Train step-0, loss : 2.302614, accuracy : 0.062500
     Train step-30, loss : 2.302403, accuracy : 0.141129
@@ -107,6 +107,12 @@ The dumped file can be used for continui
 for other similar models. [Checkpoint and Resume](checkpoint.html) discusses
 more details.
 
+The job can be stopped by
+
+    ./bin/singa-stop.sh
+
+It will kill all singa processes.
+
 <!---
 To train the model without any partitioning, you just set the numbers
 in the cluster configuration file (*cluster.conf*) as :
@@ -148,14 +154,14 @@ To run SINGA in a cluster,
   1. A hostfile should be prepared under conf/ folder, e.g.,
 
         // hostfile
-        logbase-a04
-        logbase-a05
-        logbase-a06
+        singa-node1
+        singa-node2
+        singa-node3
         ...
 
   2. The zookeeper location must be configured in conf/singa.conf, e.g.,
 
-    zookeeper_host: "logbase-a04:2181"
+    zookeeper_host: "singa-node1:2181"
 
   3. Make your ssh command password-free
 
@@ -169,27 +175,18 @@ launch and will generate a job.hosts fil
 all nodes in conf/hostfile. Hence if there are few nodes in the hostfile, then
 multiple processes would be launched in one node.
 
-You can get some job information like job ID and running processes using the
-singa-console.sh script:
-
-    ./bin/singa-console.sh list
-    JOB ID    |NUM PROCS
-    ----------|-----------
-    job-4     |2
-
 Sample training output is
 
-    Generate job id to /home/singa/wangwei/incubator-singa/examples/cifar10/job.id [job_id = 4]
-    Executing @ logbase-a04 : cd /home/singa/wangwei/incubator-singa; ./singa -workspace=/home/singa/wangwei/incubator-singa/examples/cifar10 -job=4
-    Executing @ logbase-a05 : cd /home/singa/wangwei/incubator-singa; ./singa -workspace=/home/singa/wangwei/incubator-singa/examples/cifar10 -job=4
+    Generate job id to SINGA_ROOT/examples/cifar10/job.id [job_id = 4]
+    Executing @ singa-node1: cd SINGA_ROOT; ./singa -workspace=SINGA_ROOT/examples/cifar10 -job=4
+    Executing @ singa-node2: cd SINGA_ROOT; ./singa -workspace=SINGA_ROOT/examples/cifar10 -job=4
     proc #0 -> 10.10.10.15:49152 (pid = 3504)
     proc #1 -> 10.10.10.14:49152 (pid = 27119)
     Server (group = 0, id = 1) start
     Worker (group = 1, id = 0) start
     Server (group = 0, id = 0) start
     Worker (group = 0, id = 0) start
-    Generate pid list to
-    /home/singa/wangwei/incubator-singa/examples/cifar10/job.pids
+    Generate pid list to SINGA_ROOT/examples/cifar10/job.pids
     Test step-0, loss : 2.297355, accuracy : 0.101700
     Train step-0, loss : 2.274724, accuracy : 0.062500
     Train step-30, loss : 2.263850, accuracy : 0.131048
@@ -208,15 +205,19 @@ Sample training output is
 We can see that the accuracy (resp. loss) from distributed training increases (resp.
 decreases) faster than that for the single node training.
 
-You can stop the training by singa-stop.sh
 
-    ./bin/singa-stop.sh
-    Kill singa @ logbase-a04 ...
-    Kill singa @ logbase-a05 ...
-    bash: line 1: 27119 Killed                  ./singa -workspace=/home/singa/wangwei/incubator-singa/examples/cifar10 -job=4
-    Kill singa @ logbase-a06 ...
-    bash: line 1:  3504 Killed                  ./singa -workspace=/home/singa/wangwei/incubator-singa/examples/cifar10 -job=4
-    Cleanning metadata in zookeeper ...
+You can get some job information like job ID and running processes using the
+singa-console.sh script:
+
+    ./bin/singa-console.sh list
+    JOB ID    |NUM PROCS
+    ----------|-----------
+    job-4     |2
+
+To kill the job, just run
+
+    ./bin/singa-console.sh kill job-4
+
 
 
 <!---

svn commit: r1692349 - /incubator/singa/site/trunk/content/markdown/quick-start.md
