Hi,

Make sure to inject the new URLs into the db. Just adding them to your urls.txt is not enough.
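As a rough sketch of what that means in practice, one round after editing urls.txt could look like the following. It only reuses the commands from the steps quoted below and assumes the same db and segments directories. The NUTCH variable and the recrawl_round function name are hypothetical helpers added for illustration, not part of Nutch.

```shell
#!/bin/sh
# Sketch only: pick up URLs that were newly added to urls.txt.
# NUTCH is a hypothetical override so the round can be dry-run (e.g. NUTCH=echo).
NUTCH="${NUTCH:-bin/nutch}"

recrawl_round() {
    # Inject again: the db only learns about URLs when they are injected.
    $NUTCH inject db -urlfile urls.txt || return 1

    # Generate a new fetch list, which can now include the new URLs.
    $NUTCH generate db segments || return 1

    # Pick the newest segment directory, as in the quoted steps.
    seg=$(ls -d segments/2* | tail -1)

    # Fetch the segment, update the db with the results, then index it.
    $NUTCH fetch "$seg" || return 1
    $NUTCH updatedb db "$seg" || return 1
    $NUTCH index "$seg" || return 1
}
```

Calling recrawl_round from the Nutch directory would run one such round. Calling it again should go one level deeper, like steps 6 and 7 in the quoted message.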

-----Original Message-----
From: Schackenberg, Benedikt [mailto:[EMAIL PROTECTED] 
Sent: Sunday, 16 July 2006 20:01
To: [email protected]
Subject: probl. big help me

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello folks,
I did the following steps:

1.
 bin/nutch admin db -create

2.
 bin/nutch inject db -urlfile urls.txt
  --> my urls.txt contains 2 links: http://www.termindoc.de
                                    http://www.heise.de
3.
 bin/nutch generate db segments

4.
 s1=`ls -d segments/2* | tail -1`

 bin/nutch fetch $s1

5.
 bin/nutch updatedb db $s1

6.

 bin/nutch generate db segments

 s2=`ls -d segments/2* | tail -1`

 bin/nutch fetch $s2

 bin/nutch updatedb db $s2


7.
 bin/nutch generate db segments

 s3=`ls -d segments/2* | tail -1`

 bin/nutch fetch $s3

 bin/nutch updatedb db $s3

8.
 bin/nutch index $s1

 bin/nutch index $s2

 bin/nutch index $s3


************************************

My problem: the first run works fine, and the sites termindoc.de and
heise.de are crawled.

But when I add new websites/links to urls.txt, these new domains are
not crawled. What am I doing wrong?

thx
benedikt schackenberg


- --
- - --
S&P data GmbH
T 06131 218111
F 06131 218112
E [EMAIL PROTECTED]
W www.termindoc.de

PGP-Key-ID: 0x0D2E4AE4

Our legal notice (Impressum) can be found at http://www.termindoc.de/Impressum.htm

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEun7odUpiAQ0uSuQRAvfyAJ4wqgoPLwvAWvI+Bh3RCA3kkm3XQQCgpcNH
95saMsdISgoogRvuD29E+YM=
=Fnw9
-----END PGP SIGNATURE-----


