[ 
https://issues.apache.org/jira/browse/NUTCH-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12530755
 ] 

Chris Schneider commented on NUTCH-558:
---------------------------------------

The reason that DomainStats does not use URLUtils is that (as mentioned above) 
we are currently using a relatively old Nutch source base (last integrated at 
revision 417928). There are probably other tools/resources we could use as well 
if we reworked the code to better fit the current Nutch/Hadoop source 
environment. Sorry for being so out of date.

> Need tool to retrieve domain statistics
> ---------------------------------------
>
>                 Key: NUTCH-558
>                 URL: https://issues.apache.org/jira/browse/NUTCH-558
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 0.9.0
>            Reporter: Chris Schneider
>            Assignee: Chris Schneider
>         Attachments: DomainStats.patch
>
>
> Several developers have expressed interest in a tool to retrieve statistics 
> from a crawl on a domain basis (e.g., how many pages were successfully 
> fetched from www.apache.org vs. apache.org, where the latter total would 
> include the former).
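
To make the "latter total would include the former" rule concrete, here is a
minimal sketch of the counting idea in plain Java. It is not the code in
DomainStats.patch and uses no Nutch or Hadoop APIs; the class name
DomainCounts and the method countByDomain are hypothetical, and it assumes
the input is simply a list of successfully fetched URLs. Each URL is counted
once for its host and once for every parent suffix of that host (including
the bare top-level domain).

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from DomainStats.patch.
class DomainCounts {

    // Counts each URL once for its host and once for every parent domain
    // suffix, so the total for apache.org includes pages from www.apache.org.
    static Map<String, Integer> countByDomain(List<String> urls) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String url : urls) {
            String host;
            try {
                host = URI.create(url).getHost();
            } catch (IllegalArgumentException e) {
                continue; // skip strings that are not valid URIs
            }
            if (host == null) {
                continue; // skip URIs without a host component
            }
            String domain = host.toLowerCase(Locale.ROOT);
            while (!domain.isEmpty()) {
                counts.merge(domain, 1, Integer::sum);
                int dot = domain.indexOf('.');
                if (dot < 0) {
                    break; // reached the top-level label
                }
                domain = domain.substring(dot + 1); // move up to the parent suffix
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample input; a real tool would read URLs from crawl data.
        List<String> fetched = List.of(
            "http://www.apache.org/a",
            "http://www.apache.org/b",
            "http://apache.org/");
        countByDomain(fetched).forEach((d, n) -> System.out.println(d + "\t" + n));
    }
}
```

A sorted map is used here only so that the output order is stable. A real
tool would probably also need to decide where to stop climbing (for example,
at a registered domain rather than at the top-level domain).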

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
