Hi Zabini, I'm a little unclear whether the problem is with Nutch following the links or with indexing the pages. Have you tried both of these to verify the links and the index data?
https://wiki.apache.org/nutch/bin/nutch%20parsechecker
https://wiki.apache.org/nutch/bin/nutch%20indexchecker

The second link above seems wrong to me: it shows *IndexingFiltersChecker*, but I think it should be *indexchecker*. That works for me.

On Wed, Apr 16, 2014 at 11:48 AM, Zabini <[email protected]> wrote:

> Hi,
>
> I am facing a problem with the URLs Nutch fetches.
>
> I have a page with several URLs in it, but Nutch does not fetch them.
> They are allowed in the regex-urlfilter, and those URLs work fine if I put
> them in my urls seed list.
>
> Does anyone have any hint on what to do?
>
> Best Regards,
> Zabini
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Don-t-fetch-all-urls-in-a-page-tp4131531.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
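
For reference, here is a rough sketch of how the two checks mentioned above could be run from the command line. It assumes a Nutch 1.x install with the commands run from the runtime directory (so that bin/nutch exists); the NUTCH and PAGE_URL variables, the example URL, and the check_page helper are made up for illustration. Point PAGE_URL at the page whose links are not being fetched.

```shell
#!/bin/sh
# Assumption: run from the Nutch 1.x runtime directory, where bin/nutch lives.
NUTCH="${NUTCH:-bin/nutch}"
# Hypothetical URL: replace with the page whose links are not fetched.
PAGE_URL="${PAGE_URL:-http://example.com/page-with-links.html}"

# Hypothetical helper: runs both checker commands against one URL.
check_page() {
  # Fetch and parse the page; the output includes the outlinks Nutch found.
  "$NUTCH" parsechecker "$1"
  # Run the indexing filters on the page and print the resulting fields.
  "$NUTCH" indexchecker "$1"
}

# Only run the checks if the nutch script can actually be found.
if command -v "$NUTCH" >/dev/null 2>&1; then
  check_page "$PAGE_URL"
else
  echo "nutch script not found at: $NUTCH" >&2
fi
```

If the missing links do not show up among the outlinks printed by parsechecker, the problem is probably on the parsing side rather than the indexing side.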

