Hi,

I'm trying to run Nutch at our hospital's computing center and I have a small problem. We have a few intranet servers, and I want Nutch to skip a few directories, for example:

http://sapdoku.ukl.uni-freiburg.de/abteilung/pvs/dokus/

I added these URLs to crawl-urlfilter.txt, for example:

-^http://([a-z0-9]*\.)*sapdoku.ukl.uni-freiburg.de/abteilung/pvs/dokus

But nothing happens: Nutch doesn't skip these URLs, and I don't know why. Can anyone help me?

I'm crawling with this command:

bin/nutch crawl urls -dir crawl060621 -depth 15 &> crawl060621.log &

I'm using release 0.7.1.

Greetings,
David

==========================================================
David Wojciechowski
Universitätsklinikum Freiburg
Klinikrechenzentrum
Agnesenstrasse 6-8
D-79106 Freiburg
Phone  : 0761 / 270 - 1842
Fax    : 0761 / 270 - 2276
E-Mail : [EMAIL PROTECTED]
==========================================================
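
For reference, the exclusion rule above can be checked outside Nutch with a small sketch. It assumes that the filter file is read top to bottom, that the first matching rule decides, and that a URL matching no rule is dropped, which is how the comments in the stock crawl-urlfilter.txt describe it. Python's re module stands in for the regex engine Nutch actually uses, the accept rule for ukl.uni-freiburg.de is a hypothetical example that is not taken from the real configuration, and first_match_decision is an illustrative helper, not a Nutch function.

```python
import re

# URL and exclusion rule taken from the post above.
URL = "http://sapdoku.ukl.uni-freiburg.de/abteilung/pvs/dokus/"
EXCLUDE = r"-^http://([a-z0-9]*\.)*sapdoku.ukl.uni-freiburg.de/abteilung/pvs/dokus"

# Hypothetical accept rule (an assumption, not from the real filter file).
ACCEPT = r"+^http://([a-z0-9]*\.)*ukl.uni-freiburg.de/"


def first_match_decision(rules, url):
    """Return True if the url is kept, False if it is skipped.

    Rules are tried in order; the first pattern found in the url decides.
    A url that matches no rule is skipped.
    """
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    return False


# The exclusion pattern on its own does match the directory URL.
pattern_matches = re.search(EXCLUDE[1:], URL) is not None

# Same two rules, two different orders.
exclude_first = first_match_decision([EXCLUDE, ACCEPT], URL)
accept_first = first_match_decision([ACCEPT, EXCLUDE], URL)

print("pattern matches URL:", pattern_matches)
print("kept when exclude rule is first:", exclude_first)
print("kept when accept rule is first:", accept_first)
```

Under these assumptions the pattern itself matches the directory, so if the rule has no effect, its position relative to any broader accept rule in the same file, and whether the crawl command really reads that file, would be worth checking.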
