Re: nutch scrawls only relative links

Denis Pimenov Wed, 24 Jan 2007 07:36:01 -0800

Denis Pimenov wrote:

I used +^.* in crawl-urlfilter.txt, but it doesn't work: Nutch still doesn't crawl relative links, only absolute ones.

Hello,
I am a newbie with Nutch. It seems to me that crawling does not work for relative URLs by default. How can I fix it?
For example, a relative link on the start page, <a href="/test/my.jsp">, is not crawled (although browsers open it with the proper prefix), but a link like <a href="http://mydomain.com:8080/test/my.jsp"> is crawled fine. Is there a configuration file or something else that fixes this? I have seen this question in the mail archive, but it wasn't answered.
Denis Pimenov
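
For reference, the browser behaviour described above (opening the relative link "with the proper prefix") amounts to resolving the href against the URL of the page it was found on. Below is a minimal Python sketch of that resolution. It is not Nutch code, and the page URL http://mydomain.com:8080/index.html is an assumption, since the post only gives the host and port. It also checks that the pattern part of the +^.* rule matches the resolved URL.

```python
import re
from urllib.parse import urljoin

# Assumed URL of the start page (hypothetical; only host and port appear in the post).
BASE_URL = "http://mydomain.com:8080/index.html"


def resolve(base: str, href: str) -> str:
    """Resolve an href against the URL of the page it was found on."""
    return urljoin(base, href)


# The two link forms from the post.
relative = resolve(BASE_URL, "/test/my.jsp")
absolute = resolve(BASE_URL, "http://mydomain.com:8080/test/my.jsp")

# Once resolved, both forms should name the same resource.
print(relative)
print(absolute)

# Pattern part of the "+^.*" rule, without the leading "+".
pattern = re.compile(r"^.*")
print(pattern.search(relative) is not None)
```

If links are resolved this way, both forms yield the same URL and a ^.* pattern matches it, so a filter rule like this would not by itself distinguish relative from absolute links.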

