[ https://issues.apache.org/jira/browse/NUTCH-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015888#comment-13015888 ]
Markus Jelsma commented on NUTCH-974:
-------------------------------------
Hi,
Nutch 1.1 and 1.2 don't ship with the same configuration. Crawling and parsing
work as expected when using the configuration shipped with each release and
the bin/nutch shell commands.
> Parsing Error in Nutch 1.2 on Windows7
> --------------------------------------
>
> Key: NUTCH-974
> URL: https://issues.apache.org/jira/browse/NUTCH-974
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.2
> Environment: Windows7 64-bit, Cygwin 1.7.9-1
> Reporter: Niksa Jakovljevic
> Assignee: Markus Jelsma
>
> The Hello World crawling example does not work with the Nutch 1.2 libs, but
> works fine with the Nutch 1.1 libs. Note that the same configuration is used
> with both Nutch 1.2 and Nutch 1.1.
> Nutch 1.2 always throws the following exception:
> 2011-04-01 16:33:45,177 WARN parse.ParseUtil - Unable to successfully parse
> content http://www.test.com/ of type text/html
> 2011-04-01 16:33:45,177 WARN fetcher.Fetcher - Error parsing:
> http://www.test.com/: failed(2,200): org.apache.nutch.parse.ParseException:
> Unable to successfully parse content
> Thanks,
> Niksa Jakovljevic