[ https://issues.apache.org/jira/browse/NUTCH-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993481#comment-13993481 ]

Julien Nioche commented on NUTCH-1714:
--------------------------------------

We are getting 
{code}
java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
        at java.util.concurrent.FutureTask.get(FutureTask.java:91)
        at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:147)
        at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:128)
        at org.apache.nutch.parse.ParserChecker.run(ParserChecker.java:142)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.parse.ParserChecker.main(ParserChecker.java:199)
Caused by: java.lang.NullPointerException
        at org.apache.nutch.parse.ParseStatusUtils.getEmptyParse(ParseStatusUtils.java:91)
        at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:92)
        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:36)
        at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:23)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)
{code}

when the parser fails. This is due to status.getArgs() returning null.

./nutch parsechecker -D parse.timeout=-1 "http://api.addthis.com/oexchange/0.8/forward/delicious/offer?username=addthiseere&url=www1.eere.energy.gov/buildings/commercial/news_detail.html%253Fnews_id=18485&title=Building%2520Technologies%2520Program:%2520News";

will illustrate the issue.

We should not get this NPE when a parser fails.
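
One way to guard against this could look roughly like the sketch below. It is only an illustration of a null-safe read of the status arguments, not the actual Nutch code: the Status class, the firstArgOrDefault helper and the fallback text are hypothetical stand-ins, and the real ParseStatus type and the exact fix in ParseStatusUtils.getEmptyParse may well differ.

```java
import java.util.Arrays;
import java.util.List;

public class NullSafeArgs {

    /** Hypothetical stand-in for a parse status whose args list may be null. */
    static final class Status {
        private final List<String> args;

        Status(List<String> args) {
            this.args = args;
        }

        List<String> getArgs() {
            return args;
        }
    }

    /**
     * Returns the first status argument, or the fallback when the status,
     * its args list, or the first element is missing.
     */
    static String firstArgOrDefault(Status status, String fallback) {
        if (status == null) {
            return fallback;
        }
        List<String> args = status.getArgs();
        if (args == null || args.isEmpty() || args.get(0) == null) {
            return fallback;
        }
        return args.get(0);
    }

    public static void main(String[] argv) {
        // A failed parse whose status carries no args no longer throws.
        System.out.println(firstArgOrDefault(new Status(null), "(no message)"));
        // A status that does carry a message is passed through unchanged.
        System.out.println(firstArgOrDefault(new Status(Arrays.asList("timeout")), "(no message)"));
    }
}
```

The point of the sketch is only that a failing parser should yield an empty parse with some message, rather than a second failure while building that message.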



> Nutch 2.x upgrade to Gora 0.4
> -----------------------------
>
>                 Key: NUTCH-1714
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1714
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Alparslan Avcı
>            Assignee: Alparslan Avcı
>             Fix For: 2.3
>
>         Attachments: NUTCH-1714.patch, NUTCH-1714_NUTCH-1714_v2_v3.patch, 
> NUTCH-1714v2.patch, NUTCH-1714v4.patch, NUTCH-1714v5.patch
>
>
> Nutch upgrade for GORA_94 branch has to be implemented. We can discuss the 
> details in this issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
