Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by MatthiasGuenter:
http://wiki.apache.org/nutch/FAQ

------------------------------------------------------------------------------
  ==== What happens if I inject urls several times? ====
  Urls, which are already in the database, won't be injected.
+ 
+ ==== Java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml ====
+ 
+ This really is a crawl tool issue, but is covered here as well: The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seeds -dir /user/nutchuser...
  
  === Fetching ===
  
@@ -361, +365 @@

  ==== Java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml ====
  
- The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seed -dir /user/nutchuser...
+ The crawl tool expects as its first parameter the folder name where the seeding urls file is located so for example if your urls.txt is located in /nutch/seeds the crawl command would look like: crawl seeds -dir /user/nutchuser...
  
  === Discussion ===
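
For readers of this notification, the point of the FAQ entry (the first argument to the crawl tool is the folder that contains urls.txt) could be sketched roughly as follows. This is only an illustration under stated assumptions: the relative folder name "seeds", the output folder "crawl-output", the placeholder seed URL, and running from a directory that contains bin/nutch are all hypothetical and not taken from the wiki page, whose own example path is truncated.

```shell
#!/bin/sh
# Sketch only: SEED_DIR, CRAWL_DIR and the seed URL are hypothetical names.
SEED_DIR="seeds"           # folder that holds urls.txt
CRAWL_DIR="crawl-output"   # assumption: any output folder name

# Create the seed folder and a one-line urls.txt inside it.
mkdir -p "$SEED_DIR"
printf '%s\n' "http://www.example.org/" > "$SEED_DIR/urls.txt"

# The first argument is the seed folder; -dir names the crawl output folder.
if [ -x bin/nutch ]; then
    bin/nutch crawl "$SEED_DIR" -dir "$CRAWL_DIR"
else
    echo "bin/nutch not found; would run: bin/nutch crawl $SEED_DIR -dir $CRAWL_DIR"
fi
```

The guard around bin/nutch only keeps the sketch harmless outside a Nutch installation; in a real setup the crawl line alone is what matters.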
