Hello folks,

I did the following steps:

1. bin/nutch admin db -create

2. bin/nutch inject db -urlfile urls.txt

   My urls.txt contains 2 links:

       http://www.termindoc.de
       http://www.heise.de

3. bin/nutch generate db segments

4. s1=`ls -d segments/2* | tail -1`
   bin/nutch fetch $s1

5. bin/nutch updatedb db $s1

6. bin/nutch generate db segments
   s2=`ls -d segments/2* | tail -1`
   bin/nutch fetch $s2
   bin/nutch updatedb db $s2

7. bin/nutch generate db segments
   s3=`ls -d segments/2* | tail -1`
   bin/nutch fetch $s3
   bin/nutch updatedb db $s3

8. bin/nutch index $s1
   bin/nutch index $s2
   bin/nutch index $s3

My problem: the first run works fine, and the sites termindoc.de and heise.de are crawled. But when I add new websites/links to urls.txt, these new domains are not crawled. What am I doing wrong?

Thanks,
Benedikt Schackenberg

-- 
S&P data GmbH
T 06131 218111
F 06131 218112
E [EMAIL PROTECTED]
W www.termindoc.de
PGP-Key-ID: 0x0D2E4AE4
Our legal notice (Impressum) is available at http://www.termindoc.de/Impressum.htm
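Steps 3 to 8 repeat the same generate / fetch / updatedb round three times and then index each segment. As a rough sketch only, the rounds could be written as a loop. The names NUTCH, DB, SEGMENTS, ROUNDS, latest_segment and crawl_rounds are made up for this illustration and are not Nutch settings; the sketch also assumes that segment directory names sort chronologically, as the `segments/2*` pattern in step 4 implies. It does not include the inject step and does not by itself address the problem described above.

```shell
#!/bin/sh
# Sketch of steps 3-8: N rounds of generate / fetch / updatedb, then index.
# NUTCH, DB, SEGMENTS and ROUNDS are illustrative names (assumptions).
NUTCH=${NUTCH:-bin/nutch}
DB=${DB:-db}
SEGMENTS=${SEGMENTS:-segments}
ROUNDS=${ROUNDS:-3}

# Print the newest segment directory, same pipeline as in step 4.
latest_segment() {
    ls -d "$SEGMENTS"/2* | tail -1
}

# Run ROUNDS rounds, remembering each segment, then index them all.
crawl_rounds() {
    segs=""
    i=1
    while [ "$i" -le "$ROUNDS" ]; do
        $NUTCH generate "$DB" "$SEGMENTS" || return 1
        seg=$(latest_segment)
        $NUTCH fetch "$seg" || return 1
        $NUTCH updatedb "$DB" "$seg" || return 1
        segs="$segs $seg"
        i=$((i + 1))
    done
    # Assumes segment paths contain no whitespace.
    for s in $segs; do
        $NUTCH index "$s" || return 1
    done
}
```

Calling `crawl_rounds` from the Nutch installation directory would run the three rounds with the defaults; setting `NUTCH=echo` first prints the commands instead of running them.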
_______________________________________________ Nutch-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nutch-general
