Output from ParserChecker Url missing a newline
-----------------------------------------------
Key: NUTCH-1209
URL: https://issues.apache.org/jira/browse/NUTCH-1209
Project: Nutch
Issue Type: Bug
Components: parser
Affects Versions: 1.4
Environment: While testing this:
http://www.mail-archive.com/[email protected]/msg04688.html
Reporter: Chris A. Mattmann
Assignee: Chris A. Mattmann
Priority: Trivial
Fix For: 1.5
While working on:
http://www.mail-archive.com/[email protected]/msg04688.html
I found out that the ParserChecker is missing a newline in its report.
For example, running:
{noformat}
./bin/nutch org.apache.nutch.parse.ParserChecker http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
{noformat}
produces:
{noformat}
fetching: http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
parsing: http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
contentType: application/xhtml+xml
---------
Url
---------------
http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view---------
ParseData
---------
Version: 5
...snip
{noformat}
Note that there is no newline between *view* and the ---------  that follows it, so the URL runs straight into the ParseData separator.
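The symptom can be reproduced outside Nutch with a minimal sketch. This is not the actual ParserChecker source: the class ReportSketch and its section helper are hypothetical names, and the separator strings are copied from the output above. It only assumes that the report is built by concatenating sections, and shows how omitting the trailing newline after the URL value glues it to the next separator.

```java
public class ReportSketch {

    // Hypothetical helper (not Nutch code): renders one report section.
    // When newlineAfterBody is false, the body is left unterminated,
    // which is the behaviour described in this issue.
    static String section(String title, String body, boolean newlineAfterBody) {
        StringBuilder sb = new StringBuilder();
        sb.append("---------\n");
        sb.append(title).append("\n");
        sb.append("---------------\n");
        sb.append(body);
        if (newlineAfterBody) {
            sb.append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view";
        String nextSection = "---------\nParseData\n---------\n";

        // Without the newline: the URL runs into the next separator.
        System.out.print(section("Url", url, false));
        System.out.print(nextSection);

        System.out.println();

        // With the newline: each section starts on its own line.
        System.out.print(section("Url", url, true));
        System.out.print(nextSection);
    }
}
```

If the real code prints the URL with a print-style call, switching to a println-style call (or appending "\n") would be the equivalent change, but that is an assumption about the implementation and not something stated in this report.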