Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "Crawl" page has been changed by cmd.
The comment on this change is: Can anybody help me translate it to a Windows 
bat script? Thanks.
http://wiki.apache.org/nutch/Crawl?action=diff&rev1=10&rev2=11

--------------------------------------------------

  The complete job of this script has been divided broadly into 8 steps.
  
   1. Inject URLs
-  2. Generate, Fetch, Parse, Update Loop
+  1. Generate, Fetch, Parse, Update Loop
-  3. Merge Segments
+  1. Merge Segments
-  4. Invert Links
+  1. Invert Links
-  5. Index
+  1. Index
-  6. Dedup
+  1. Dedup
-  7. Merge Indexes
+  1. Merge Indexes
-  8. Load new indexes
+  1. Load new indexes
  
  == Modes of Execution ==
  The script can be executed in two modes:
+ 
   * Normal Mode
   * Safe Mode
  
@@ -42, +43 @@

  then
    NUTCH_HOME=.
  }}}
- 
  Set 'NUTCH_HOME' to the path of the Nutch directory if you are not setting it 
as an environment variable. If the environment variable is set, the above 
assignment is ignored.
  
  === CATALINA_HOME ===
@@ -53, +53 @@

  then
    CATALINA_HOME=/opt/apache-tomcat-6.0.10
  }}}
- 
  Similar to the previous section, if this variable is set in the environment, 
then the above assignment is ignored.
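
  The fallback behaviour described for 'NUTCH_HOME' and 'CATALINA_HOME' can be sketched as a small helper. This is only an illustration of the pattern, not part of the runbot script: the function name `resolve_home` is hypothetical, and the default paths are simply the ones shown in the snippets above.

```shell
#!/bin/sh
# Sketch only: resolve_home is a hypothetical helper, not part of runbot.
# It prints the value of the named variable if that variable is set and
# non-empty; otherwise it prints the supplied fallback.
resolve_home() {
  name=$1
  fallback=$2
  # Indirect lookup of the variable whose name is stored in $name.
  eval "current=\${$name}"
  if [ -z "$current" ]
  then
    echo "$fallback"
  else
    echo "$current"
  fi
}

# A value from the environment wins; the script default applies only
# when the variable is unset or empty.
NUTCH_HOME=$(resolve_home NUTCH_HOME .)
CATALINA_HOME=$(resolve_home CATALINA_HOME /opt/apache-tomcat-6.0.10)

echo "NUTCH_HOME=$NUTCH_HOME"
echo "CATALINA_HOME=$CATALINA_HOME"
```

  For a single variable, the POSIX parameter expansion `${NUTCH_HOME:-.}` gives the same result without a helper. The explicit `if [ -z ... ]` form is used here because it mirrors the style of the script.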
  
  == Can it re-crawl? ==
  The author has used this script to re-crawl a couple of times, but re-crawling 
has not been tested in real-world use. You are welcome to try the script for 
re-crawling; whether it works or not, please let us know.
  
  == Script ==
- {{{
- #!/bin/sh
+ {{{#!/bin/sh
  
  # runbot script to run the Nutch bot for crawling and re-crawling.
  # Usage: bin/runbot [safe]
@@ -90, +88 @@

  then
    NUTCH_HOME=.
    echo runbot: $0 could not find environment variable NUTCH_HOME
-   echo runbot: NUTCH_HOME=$NUTCH_HOME has been set by the script 
+   echo runbot: NUTCH_HOME=$NUTCH_HOME has been set by the script
  else
-   echo runbot: $0 found environment variable NUTCH_HOME=$NUTCH_HOME 
+   echo runbot: $0 found environment variable NUTCH_HOME=$NUTCH_HOME
  fi
  
  if [ -z "$CATALINA_HOME" ]
  then
    CATALINA_HOME=/opt/apache-tomcat-6.0.10
    echo runbot: $0 could not find environment variable CATALINA_HOME
-   echo runbot: CATALINA_HOME=$CATALINA_HOME has been set by the script 
+   echo runbot: CATALINA_HOME=$CATALINA_HOME has been set by the script
  else
-   echo runbot: $0 found environment variable CATALINA_HOME=$CATALINA_HOME 
+   echo runbot: $0 found environment variable CATALINA_HOME=$CATALINA_HOME
  fi
  
  if [ -n "$topN" ]
