I had this same problem - I just used depth 1 to fetch and index only the injected URLs. As for refetching, you might use "updatedb" with "-noAdditions" so that no new outlinks get added to the crawldb. Another solution (not as efficient, but it works) is to restart the crawl with the same injected URLs and discard the old dir/segments.
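Roughly, it would look something like this (just a sketch - the "urls" seed dir, the "crawl" dir and the segment name are placeholders, adjust for your setup):

  # one-shot crawl, depth 1 so only the injected URLs are fetched and indexed
  bin/nutch crawl urls -dir crawl -depth 1

  # later, a refetch cycle; -noAdditions keeps outlinks out of the crawldb
  bin/nutch generate crawl/crawldb crawl/segments
  bin/nutch fetch crawl/segments/<newest-segment>
  bin/nutch updatedb crawl/crawldb crawl/segments/<newest-segment> -noAdditions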
Nicolás Lichtmaier wrote:
> I'd like to limit nutch to fetch, refetch and index just the injected
> URLs. Will setting db.max.outlinks.per.page to 0 enable me to do that?
> If not... how could I achieve what I'm looking for?
>
> Thanks!
