Michael Joyce created NUTCH-2115:
------------------------------------
Summary: Add total counts to dump stats
Key: NUTCH-2115
URL: https://issues.apache.org/jira/browse/NUTCH-2115
Project: Nutch
Issue Type: Improvement
Components: dumpers, util
Affects Versions: 1.10
Reporter: Michael Joyce
Priority: Minor
Fix For: 1.11
It would be nice if the "dump" tool printed a total count alongside the
per-mimetype stats that it already reports. Something along the lines of the
following would be great for larger crawls, where adding up the individual
counts by hand gets tedious.
{code}
Dumper File Stats:
TOTAL Stats:
[
{"mimeType":"application/xhtml+xml","count":"2"}
{"mimeType":"application/octet-stream","count":"1"}
{"mimeType":"text/html","count":"23"}
]
Total count: 26
FILTERED Stats:
[
{"mimeType":"text/html","count":"23"}
]
Total filtered count: 23
{code}
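For illustration, the totals could be computed by summing the per-mimetype counts before printing. This is a minimal sketch under stated assumptions, not the dump tool's actual code: the class `DumpStatsSketch` and the methods `totalCount` and `formatStats` are hypothetical names, and the counts are assumed to be held in a `Map<String, Integer>` keyed by mimetype.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of Nutch. Names and data layout are assumptions.
public class DumpStatsSketch {

    // Sum every per-mimetype count into a single total.
    static long totalCount(Map<String, Integer> typeCounts) {
        long total = 0;
        for (int count : typeCounts.values()) {
            total += count;
        }
        return total;
    }

    // Render one stats section in the layout proposed in this issue,
    // followed by a line carrying the summed total.
    static String formatStats(String label, String totalLabel,
                              Map<String, Integer> typeCounts) {
        StringBuilder sb = new StringBuilder();
        sb.append(label).append(" Stats:\n[\n");
        for (Map.Entry<String, Integer> e : typeCounts.entrySet()) {
            sb.append("{\"mimeType\":\"").append(e.getKey())
              .append("\",\"count\":\"").append(e.getValue()).append("\"}\n");
        }
        sb.append("]\n");
        sb.append(totalLabel).append(": ").append(totalCount(typeCounts)).append("\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample counts copied from the example output in this issue.
        Map<String, Integer> all = new LinkedHashMap<>();
        all.put("application/xhtml+xml", 2);
        all.put("application/octet-stream", 1);
        all.put("text/html", 23);

        Map<String, Integer> filtered = new LinkedHashMap<>();
        filtered.put("text/html", 23);

        System.out.print("Dumper File Stats:\n");
        System.out.print(formatStats("TOTAL", "Total count", all));
        System.out.print(formatStats("FILTERED", "Total filtered count", filtered));
    }
}
```

A `long` is used for the total so that very large crawls do not overflow when many `int` counts are added together.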