Hi all,

I started experimenting with Nutch by following the NutchTutorial. I got a
successful crawl using the command 'bin/nutch crawl urls -dir
crawl' (no limits on depth or number of documents). I noticed
that Nutch finishes quite quickly. When I looked at the HTML source of
the main page being crawled, I saw that Nutch never followed links
that look like these:

<a href="content.jsp?objectid=22619">Route</a>
<br/>
<a href="content.jsp?objectid=5931">Openingstijden</a>
<br/>

Surely these links look ordinary enough to be seen and followed by
Nutch? Could someone please tell me what could be preventing these
links from being followed?

Thanks for any help,

Jeroen
