[ https://issues.apache.org/jira/browse/NUTCH-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418999#comment-13418999 ]

Markus Jelsma commented on NUTCH-1433:
--------------------------------------

{code}
2012-07-20 10:15:49,402 WARN  parse.ParserFactory - ParserFactory: Plugin: org.apache.nutch.parse.html.HtmlParser mapped to contentType application/xhtml+xml via parse-plugins.xml, but not enabled via plugin.includes in nutch-default.xml
2012-07-20 10:15:51,065 WARN  parse.ParseUtil - Error parsing http://zh.wikipedia.org/wiki/日语 with org.apache.nutch.parse.tika.TikaParser@501ba94d
java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.tika.mime.MediaType.set([Lorg/apache/tika/mime/MediaType;)Ljava/util/Set;
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
        at java.util.concurrent.FutureTask.get(FutureTask.java:91)
        at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:162)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:93)
        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:102)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:139)
Caused by: java.lang.NoSuchMethodError: org.apache.tika.mime.MediaType.set([Lorg/apache/tika/mime/MediaType;)Ljava/util/Set;
        at org.apache.tika.parser.crypto.Pkcs7Parser.getSupportedTypes(Pkcs7Parser.java:52)
        at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
        at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:148)
        at org.apache.tika.config.TikaConfig.getParser(TikaConfig.java:230)
        at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:79)
        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:35)
        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:24)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
2012-07-20 10:15:51,067 WARN  parse.ParseUtil - Unable to successfully parse content http://zh.wikipedia.org/wiki/日语 of type application/xhtml+xml
{code}
                
> Upgrade to Tika 1.2
> -------------------
>
>                 Key: NUTCH-1433
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1433
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Julien Nioche
>            Assignee: Julien Nioche
>             Fix For: 1.6, 2.1
>
>         Attachments: NUTCH-1433-trunk.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

