Raphael A. Bauer wrote:
i am currently doing a
"nutch crawl urls -dir crawl -depth 10"
- pretty much what is described in the tutorial. and in fact everything
works.
the only problem is that relative links - say <a href="../XYZ">
are not crawled and cannot be searched, which is quite a problem for me.
is there an option i am missing - or any suggestions on how i can fix
this issue?
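
for reference, this is how i would expect a relative link like the one above to be resolved against the page it appears on. it is only a minimal sketch in python using the standard library, not nutch's own code; the page url and the helper name resolve_link are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str) -> str:
    """Resolve an href found on page_url into an absolute URL.

    resolve_link is a hypothetical helper, not part of nutch.
    """
    return urljoin(page_url, href)


if __name__ == "__main__":
    # hypothetical page url; "../XYZ" is the relative link from the question
    print(resolve_link("http://example.com/docs/index.html", "../XYZ"))
    # prints http://example.com/XYZ
```

a crawler that handles relative links would queue the resolved absolute url, not the raw "../XYZ" string.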
hi,
just to bring the question up again. i am still searching for a solution
to my problem that the nutch crawl tool does not crawl relative links.
it logs:
fetching http://url/+escape(document.referrer)+
and does not look any further into those html pages.
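
one possible reading of that log line, and this is only a guess on my side: the text "+escape(document.referrer)+" looks like a piece of javascript that got treated as a relative link and resolved against http://url/. the python sketch below only shows that such a resolution gives exactly the logged url; it says nothing about how nutch extracts links, and the helper name resolve_link is again made up.

```python
from urllib.parse import urljoin


def resolve_link(page_url: str, href: str) -> str:
    """Resolve an href against page_url (hypothetical helper, not nutch code)."""
    return urljoin(page_url, href)


if __name__ == "__main__":
    # "http://url/" is the placeholder host from the log line above;
    # the second argument is the javascript fragment, taken as if it were a link
    print(resolve_link("http://url/", "+escape(document.referrer)+"))
```

if that guess is right, the fetched url is not a real page, which would explain why nothing behind it gets crawled.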
so - maybe my question is way too stupid (RTFM - arg.. i read it ;) ),
or the solution is too simple to tell - in either case i really would
appreciate any statement regarding my problem. is there a switch to
enable this? something i've missed?
there is no problem reimplementing the fetch code - but i don't want to
write the code twice.
thanks again!
ra