[ https://issues.apache.org/jira/browse/NUTCH-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney resolved NUTCH-1007.
-----------------------------------------
    Resolution: Won't Fix

This is not a problem, and as Markus mentioned, the DomainStatistics tool already does a pretty good job of this.

> Add readdb -host output
> -----------------------
>
>                 Key: NUTCH-1007
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1007
>             Project: Nutch
>          Issue Type: Improvement
>          Components: generator
>    Affects Versions: 1.4
>            Reporter: MilleBii
>            Priority: Minor
>
> I have created an enhancement for the readdb feature, which computes a list
> of <host> <number of URLs for that host>.
> I think it could be valuable for many people. It lets you see what is in the
> crawldb.
> Like -dump or -topN, the proposed syntax would be: readdb -host output
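
For readers unfamiliar with the request, the per-host count described in the issue can be illustrated with a minimal standalone sketch. This is not the reporter's patch, and it is not how readdb or DomainStatistics are implemented; the class name `HostCounts` and the method `countByHost` are hypothetical, and the input is assumed to be a plain in-memory list of URL strings rather than a crawldb.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Nutch.
class HostCounts {

    // Returns a map of host -> number of URLs on that host, sorted by host name.
    // URLs that cannot be parsed, or that have no host component, are skipped.
    static Map<String, Integer> countByHost(List<String> urls) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String url : urls) {
            String host;
            try {
                host = new URI(url).getHost();
            } catch (URISyntaxException e) {
                continue; // malformed URL, ignore it
            }
            if (host == null) {
                continue; // e.g. a relative or opaque URI
            }
            // Host names are case-insensitive, so normalize before counting.
            counts.merge(host.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.org/");
        // Prints one "<host> <count>" pair per line.
        countByHost(urls).forEach((host, n) -> System.out.println(host + " " + n));
    }
}
```

A real crawldb is far too large to hold in memory this way, which is why a MapReduce-based tool such as DomainStatistics is the suggested route.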