Hi, thanks for the response.
My understanding of the outlink config is that these settings affect how Nutch deals with links embedded in individual crawled pages, rather than multiple pages/files under a single URL/directory. Have I got that wrong?

P

On 5 May 2014 11:07, Tree ser <[email protected]> wrote:

> Hi, Paul
>
> Maybe you need to check your nutch-site.xml settings.
> In nutch-default.xml, Nutch fetches only one outlink from each page; you
> need to change the value from true to false. Then you can fetch them.
> Furthermore, you can set the outlink limit if you need to.
>
> Sent from Yahoo Mail for iPad
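
For reference, this is roughly what I take the suggested change to look like in conf/nutch-site.xml. The reply does not name the properties, so `db.ignore.internal.links` and `db.max.outlinks.per.page` are my assumption about what was meant, and the values are only illustrative (100 is, as far as I know, the shipped default for the outlink limit):

```xml
<!-- Sketch of overrides for conf/nutch-site.xml.
     Assumption: these are the two properties the reply refers to;
     the reply itself does not name them. -->
<configuration>
  <property>
    <!-- the true/false switch: when true, outlinks to the same host are ignored -->
    <name>db.ignore.internal.links</name>
    <value>false</value>
  </property>
  <property>
    <!-- the outlink limit: maximum number of outlinks processed per page -->
    <name>db.max.outlinks.per.page</name>
    <value>100</value>
  </property>
</configuration>
```

As far as I can tell, both settings act on links parsed out of a fetched page, which is why I am not sure they help with several files sitting under one URL/directory.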

