Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by PaulBaclace:
http://wiki.apache.org/nutch/OverviewDeploymentConfigs

The comment on the change is:
fixed up where rsync happens

------------------------------------------------------------------------------
   1. start-all.sh or stop-all.sh - start or stop the whole ensemble.
   2. nutch-daemons.sh - run a Nutch command on all slave hosts.
   3. slaves.sh - run a shell command on all slave hosts.
-  4. nutch_daemon.sh - run a Nutch command as a daemon with a start|stop 
argument like a regular Unix/Linux /etc/rc.local script; the process pid is 
stored during start and used during stop.  Runs rsync at start.
+  4. nutch-daemon.sh - run a Nutch command as a daemon with a start|stop 
argument like a regular Unix/Linux /etc/rc.local script; the process pid is 
stored during start and used during stop.  At start, the slave initiates an 
rsync from the master.
-  5. nutch - run a Nutch command using the JVM.
+  5. nutch - run a Nutch command, specified either as a command name or full 
path to a class, using the JVM.
  
  Depending upon the context of use, any level of these scripts can be handy on 
the command line.
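
The start|stop behaviour described for the daemon script in item 4 can be illustrated with a minimal sketch. This is not the actual Nutch script: the function name daemon_ctl, the PID_DIR variable, and the pid file naming are hypothetical, and the rsync step is only marked by a comment.

```shell
#!/bin/sh
# Hypothetical sketch of a start|stop wrapper that stores the pid at start
# and uses it at stop. Names and paths here are assumptions, not Nutch's own.
PID_DIR="${PID_DIR:-/tmp}"

daemon_ctl() {
    action="$1"; name="$2"; shift 2
    pid_file="$PID_DIR/sketch-$name.pid"
    case "$action" in
        start)
            # Refuse to start twice if the recorded process is still alive.
            if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
                echo "$name already running as pid $(cat "$pid_file")"
                return 1
            fi
            # The page says rsync from the master runs at this point; omitted here.
            "$@" >/dev/null 2>&1 &
            echo $! > "$pid_file"      # store the pid during start
            echo "started $name"
            ;;
        stop)
            if [ -f "$pid_file" ]; then
                kill "$(cat "$pid_file")" 2>/dev/null   # use the stored pid
                rm -f "$pid_file"
                echo "stopped $name"
            else
                echo "no pid file for $name"
                return 1
            fi
            ;;
        *)
            echo "usage: daemon_ctl start|stop name [command...]"
            return 2
            ;;
    esac
}
```

With these assumptions, `daemon_ctl start demo sleep 30` records the pid of the background process, and `daemon_ctl stop demo` kills it and removes the pid file.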
  
@@ -56, +56 @@

  
   A. Cluster deployment with too many machines to customize (probably more 
than 4; 1000 machines should be possible):
  
-   6. bin/slaves.sh rsync-command is used as needed to update jars and conf 
files from master.
-   7. the ensemble starts by running bin/start-all.sh on the master.
+   6. the ensemble starts by running bin/start-all.sh on the master.
-   8. start-all.sh uses bin/nutch-daemons.sh run one datanode process on each 
slave (in the background without waiting, one daemon thread is started per 
comma-separated storage device, non-existent storage devices in the list are 
ignored).
+   7. start-all.sh uses bin/nutch-daemons.sh, which uses nutch-daemon.sh to 
run rsync (to update jars and conf files from master) and then to run one 
datanode process on each slave (in the background, without waiting; one daemon 
thread is started per comma-separated storage device, and non-existent storage 
devices in the list are ignored).
-   9. start-all.sh runs one namenode and one jobtracker on the master.
+   8. start-all.sh runs one namenode and one jobtracker on the master.
-   10. start-all.sh uses bin/nutch-daemons.sh run one tasktracker process on 
each slave (in the background without waiting).
+   9. start-all.sh uses bin/nutch-daemons.sh to run one tasktracker process on 
each slave (in the background without waiting).
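
The start order in the new steps 6 to 9 can be sketched as a dry run that only prints what would be executed. Everything specific here is an assumption for illustration: the SLAVES_FILE variable and its default path, the one-host-per-line file format, the ssh command lines, and the argument order passed to nutch-daemon.sh. The function names on_each_slave and start_all_plan are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch of the start order; it prints commands and runs nothing.
# SLAVES_FILE and the command lines below are assumptions, not Nutch's own.
SLAVES_FILE="${SLAVES_FILE:-conf/slaves}"

# Print one backgrounded ssh line per slave, skipping blanks and comments.
on_each_slave() {
    while read -r host; do
        case "$host" in
            ''|\#*) continue ;;
        esac
        echo "ssh $host $* &"
    done < "$SLAVES_FILE"
}

start_all_plan() {
    # Step 7: each slave rsyncs from the master, then starts a datanode.
    on_each_slave bin/nutch-daemon.sh start datanode
    # Step 8: the namenode and jobtracker run on the master itself.
    echo "bin/nutch-daemon.sh start namenode"
    echo "bin/nutch-daemon.sh start jobtracker"
    # Step 9: one tasktracker per slave.
    on_each_slave bin/nutch-daemon.sh start tasktracker
}
```

Calling `start_all_plan` with a slaves file listing two hosts prints two datanode lines, the two master lines, and then two tasktracker lines, in that order.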
  
  
   B. Cluster of a few machines:
