[
https://issues.apache.org/jira/browse/NUTCH-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156982#comment-13156982
]
Hudson commented on NUTCH-1209:
-------------------------------
Integrated in Nutch-trunk #1674 (See
[https://builds.apache.org/job/Nutch-trunk/1674/])
Fix for NUTCH-1209: Output from ParserChecker Url missing a newline
mattmann :
http://svn.apache.org/viewvc/nutch/trunk/viewvc/?view=rev&root=&revision=1205729
Files :
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/java/org/apache/nutch/parse/ParserChecker.java
> Output from ParserChecker Url missing a newline
> -----------------------------------------------
>
> Key: NUTCH-1209
> URL: https://issues.apache.org/jira/browse/NUTCH-1209
> Project: Nutch
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.4
> Environment: While testing this:
> http://www.mail-archive.com/[email protected]/msg04688.html
> Reporter: Chris A. Mattmann
> Assignee: Chris A. Mattmann
> Priority: Trivial
> Fix For: 1.5
>
>
> While working on:
> http://www.mail-archive.com/[email protected]/msg04688.html
> I found out that the ParserChecker is missing a newline in its report.
> E.g., note:
> {noformat}
> ./bin/nutch org.apache.nutch.parse.ParserChecker http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
> {noformat}
> produces:
> {noformat}
> fetching: http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
> parsing: http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view
> contentType: application/xhtml+xml
> ---------
> Url
> ---------------
> http://vault.fbi.gov/watergate/watergate-summary-part-01-of-02/view---------
> ParseData
> ---------
> Version: 5
> ...snip
> {noformat}
> Note that there is no newline between *view* and the --------- that follows it, so the URL and the next section header run together on one line.
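
For readers skimming the report above, here is a minimal Java sketch of the symptom and the kind of fix it calls for. It assumes the report is built by appending section headers and values to a StringBuilder. The class ReportSketch, the method buildReport, and the newlineAfterUrl flag are hypothetical names used only for illustration. This is not the actual ParserChecker source, and the committed change in revision 1205729 may differ.

```java
// Hypothetical illustration only; not the actual ParserChecker code.
class ReportSketch {

    // Builds a two-section report similar in shape to the output quoted above.
    // When newlineAfterUrl is false, the URL is appended without a line
    // terminator, so the next section header is glued onto the same line.
    static String buildReport(String url, boolean newlineAfterUrl) {
        StringBuilder sb = new StringBuilder();
        sb.append("---------\nUrl\n---------------\n");
        sb.append(url);
        if (newlineAfterUrl) {
            sb.append("\n"); // the missing newline
        }
        sb.append("---------\nParseData\n---------\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://example.com/view"; // placeholder URL (assumption)
        System.out.println("Without the newline:");
        System.out.print(buildReport(url, false));
        System.out.println();
        System.out.println("With the newline:");
        System.out.print(buildReport(url, true));
    }
}
```

In this sketch, passing false reproduces the fused "view---------" line, and passing true puts the separator on its own line.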