Adding a set number of inner pages to the fetch list

jjmendes Fri, 21 Oct 2016 12:52:12 -0700

To gather data for a study, I am currently using Nutch to go through a
list of web pages and download their HTML. The list consists solely of
main (home) pages. However, it would be beneficial to also download at
least one other page from the same domain that is linked from each
home page. Is there an easy way of achieving this?


Thanks,

JJAM
