[ https://issues.apache.org/jira/browse/NUTCH-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma closed NUTCH-182.
-------------------------------

    Resolution: Won't Fix

Bulk close of legacy issues:
http://www.lucidimagination.com/search/document/2738eeb014805854/clean_up_open_legacy_issues_in_jira

> Log when db.max configuration limits reached
> --------------------------------------------
>
>                 Key: NUTCH-182
>                 URL: https://issues.apache.org/jira/browse/NUTCH-182
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher
>    Affects Versions: 0.8
>            Reporter: Matt Kangas
>            Priority: Trivial
>         Attachments: LinkDb.java.patch, ParseData.java.patch
>
>
> Followup to http://www.nabble.com/Re%3A-Can%27t-index-some-pages-p2480833.html
> There are three "db.max" parameters currently in nutch-default.xml:
>  * db.max.outlinks.per.page
>  * db.max.anchor.length
>  * db.max.inlinks
> Having values that are too low can result in a site being under-crawled.
> However, currently there is nothing written to the log when these limits are
> hit, so users have to guess when they need to raise these values.
> I suggest that we add three new log messages at the appropriate points:
>  * "Exceeded db.max.outlinks.per.page for URL "
>  * "Exceeded db.max.anchor.length for URL "
>  * "Exceeded db.max.inlinks for URL "

-- 
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
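
For illustration only, the three log messages proposed in the issue could be emitted along these lines. This is a minimal sketch, not the content of the attached LinkDb.java.patch or ParseData.java.patch: the class `DbMaxLimitLogging` and its methods are hypothetical, it uses `java.util.logging` rather than whatever logging Nutch itself uses, and it assumes that each limit is enforced by truncating the value to the configured maximum.

```java
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper, not part of Nutch: applies a limit and logs when it is hit.
final class DbMaxLimitLogging {
    private static final Logger LOG =
        Logger.getLogger(DbMaxLimitLogging.class.getName());

    // Keeps at most maxOutlinks outlinks for a page (assumed truncation behaviour).
    static List<String> limitOutlinks(String url, List<String> outlinks, int maxOutlinks) {
        if (outlinks.size() > maxOutlinks) {
            LOG.info("Exceeded db.max.outlinks.per.page for URL " + url);
            return outlinks.subList(0, maxOutlinks);
        }
        return outlinks;
    }

    // Cuts anchor text down to maxAnchorLength characters (assumed truncation behaviour).
    static String limitAnchor(String url, String anchor, int maxAnchorLength) {
        if (anchor.length() > maxAnchorLength) {
            LOG.info("Exceeded db.max.anchor.length for URL " + url);
            return anchor.substring(0, maxAnchorLength);
        }
        return anchor;
    }

    // Keeps at most maxInlinks inlinks for a URL (assumed truncation behaviour).
    static List<String> limitInlinks(String url, List<String> inlinks, int maxInlinks) {
        if (inlinks.size() > maxInlinks) {
            LOG.info("Exceeded db.max.inlinks for URL " + url);
            return inlinks.subList(0, maxInlinks);
        }
        return inlinks;
    }

    public static void main(String[] args) {
        // Example values only; real limits would come from nutch-default.xml.
        List<String> links = java.util.Arrays.asList("a", "b", "c");
        limitOutlinks("http://example.com/", links, 2);
        limitAnchor("http://example.com/", "a fairly long anchor text", 10);
        limitInlinks("http://example.com/", links, 5);
    }
}
```

With messages like these in the log, a user could search for "Exceeded db.max" to see which limit was reached and for which URL before deciding whether to raise the corresponding value.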