Hi,

I am using apache-nutch-1.9. My configuration ignores external links.

I have some URLs in my seed file, but the Nutch crawler doesn't
find the links in those pages because the site populates its content using
AJAX calls. I've already removed every regex filter I could from Nutch's
conf folder.

How can I collect those links? Any advice?
Thanks in advance.
