Update of /cvsroot/nutch/nutch/bin
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2281/bin

Modified Files:
        nutch 
Log Message:
Added a new command, crawl, that constructs a database, injects a URL
file, and performs a few rounds of generate/fetch/updatedb.  This
simplifies use for intranet sites.  Changed some defaults to be
more intranet-friendly.

Also fixed a bug where Fetcher.java didn't construct correct relative links
when a page was redirected.
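
The sequence the log message describes for the crawl command can be sketched roughly as follows. This is only an illustration of the control flow, not the CrawlTool implementation: every function name here (create_db, inject_urls, generate, fetch, updatedb, crawl_sketch) is a hypothetical stub that just prints the step it stands for, and the number of rounds is passed as an argument because the log message does not say how many "a few" is.

```shell
# Hypothetical stubs: each one only prints the step name.
# No real nutch command or flag is invoked here.
create_db()   { echo "create db"; }
inject_urls() { echo "inject $1"; }
generate()    { echo "generate round $1"; }
fetch()       { echo "fetch round $1"; }
updatedb()    { echo "updatedb round $1"; }

# crawl_sketch URL_FILE ROUNDS
# Build the database, inject the URL file, then repeat
# generate/fetch/updatedb for the requested number of rounds.
crawl_sketch() {
  url_file=$1
  rounds=$2
  create_db
  inject_urls "$url_file"
  i=1
  while [ "$i" -le "$rounds" ]; do
    generate "$i"
    fetch "$i"
    updatedb "$i"
    i=$((i + 1))
  done
}
```

Calling `crawl_sketch urls.txt 3` would print the two setup steps followed by three generate/fetch/updatedb rounds, which is the work a user previously had to drive by hand with separate commands.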


Index: nutch
===================================================================
RCS file: /cvsroot/nutch/nutch/bin/nutch,v
retrieving revision 1.28
retrieving revision 1.29
diff -C2 -d -r1.28 -r1.29
*** nutch       18 Sep 2003 20:02:55 -0000      1.28
--- nutch       21 Apr 2004 22:51:50 -0000      1.29
***************
*** 30,33 ****
--- 30,34 ----
    echo "Usage: nutch COMMAND"
    echo "where COMMAND is one of:"
+   echo "  crawl             one-step crawler for intranets"
    echo "  admin             database administration, including creation"
    echo "  inject            inject new urls into the database"
***************
*** 99,103 ****
  
  # figure out which class to run
! if [ "$COMMAND" = "admin" ] ; then
    CLASS=net.nutch.tools.WebDBAdminTool
  elif [ "$COMMAND" = "inject" ] ; then
--- 100,106 ----
  
  # figure out which class to run
! if [ "$COMMAND" = "crawl" ] ; then
!   CLASS=net.nutch.tools.CrawlTool
! elif [ "$COMMAND" = "admin" ] ; then
    CLASS=net.nutch.tools.WebDBAdminTool
  elif [ "$COMMAND" = "inject" ] ; then
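
The dispatch change in the second hunk can be tried in isolation with a small sketch. The function name resolve_class is hypothetical, and the sketch uses a case statement rather than the script's if/elif chain; only the two command-to-class mappings visible in this diff are included, since the remaining branches are not shown here.

```shell
# Hypothetical helper mirroring the patched dispatch logic.
# Only the mappings visible in the diff are reproduced.
resolve_class() {
  case "$1" in
    crawl) echo "net.nutch.tools.CrawlTool" ;;
    admin) echo "net.nutch.tools.WebDBAdminTool" ;;
    *)     echo "unknown command: $1" >&2
           return 1 ;;
  esac
}
```

With this sketch, `resolve_class crawl` prints the CrawlTool class name, and an unrecognized command returns a non-zero status so a caller can fall back to printing the usage text.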


