Raphael A. Bauer wrote:
i am currently doing a

"nutch crawl urls -dir crawl -depth 10"

- pretty much what is described in the tutorial. and in fact everything works.

the only problem is that relative links - say <a href="../XYZ"> -
are not crawled and cannot be searched, which is quite a problem for me.
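
to make clear what i mean by "relative link": below is a minimal sketch of how i would expect a link like ../XYZ to be resolved against the url of the page it appears on. it uses python's standard urllib.parse.urljoin and says nothing about how nutch does this internally. the function name resolve_link and the example.com urls are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str) -> str:
    """Resolve an href found on a page against that page's own URL.

    Hypothetical helper for illustration only; not part of Nutch.
    """
    return urljoin(page_url, href)


if __name__ == "__main__":
    # Made-up page URL; "../XYZ" goes one directory up from /a/b/.
    page = "http://example.com/a/b/page.html"
    print(resolve_link(page, "../XYZ"))
    # An absolute href is returned unchanged.
    print(resolve_link(page, "http://example.com/other.html"))
```

so for a page at /a/b/page.html i would expect the crawler to fetch /a/XYZ.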

is there an option i am missing - or any suggestions on how i can fix this issue?
hi,

just to bring the question up again: i am still searching for a solution to my problem that the nutch crawl tool does not crawl relative links.

the log states:
fetching http://url/+escape(document.referrer)+
and the crawler does not look into those html pages any further.

so - maybe my question is way too stupid (RTFM - arg.. i read it ;) ), or the solution is too simple to tell - in either case i really would appreciate any statement regarding my problem. is there a switch to enable this? something i've missed?

reimplementing the fetch code myself would be no problem - but i don't want to write code that already exists a second time.

thanks again!

ra




