[ 
https://issues.apache.org/jira/browse/NUTCH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473064
 ] 

Armel Nene commented on NUTCH-437:
----------------------------------

I was wondering whether this patch could fix my problem, which is very similar 
to this one, if not the same. I am using Nutch 0.8.2-dev, which I checked out 
from SVN a while ago and have not updated since. Before this, I was able to 
crawl 10000 xml files with no errors at all. These are the errors I get when 
fetching:

INFO parser.custom: Custom-parse: Parsing content file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf
07/02/12 22:09:16 INFO fetcher.Fetcher: fetch of file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf failed with: java.lang.NullPointerException
07/02/12 22:09:17 INFO mapred.LocalJobRunner: 0 pages, 0 errors, 0.0 pages/s, 0 kb/s,
07/02/12 22:09:17 FATAL fetcher.Fetcher: java.lang.NullPointerException
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:198)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:189)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.mapred.MapTask$2.collect(MapTask.java:91)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:314)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:232)
07/02/12 22:09:17 FATAL fetcher.Fetcher: fetcher caught:java.lang.NullPointerException

One complication is that my Hadoop version is reported as 
hadoop-0.4.0-patched. I don't know whether that means I am running version 
0.4.0, which is a little confusing. Once you can clarify that for me, I will 
be able to apply the patch to my version. 

Best Regards,

Armel


> MapFile in Hadoop Trunk has changed, must update references
> -----------------------------------------------------------
>
>                 Key: NUTCH-437
>                 URL: https://issues.apache.org/jira/browse/NUTCH-437
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 0.8.2, 0.9.0
>         Environment: windows xp and java
>            Reporter: Dennis Kubes
>         Assigned To: Andrzej Bialecki 
>             Fix For: 0.8.2, 0.9.0
>
>         Attachments: nutch-hadoop-0.10.2-mapfile.patch
>
>
> The MapFile.Writer signature has changed in Hadoop trunk (version 0.10.x+) to 
> include a Configuration object.  Objects in the Nutch codebase that reference 
> MapFile.Writer will need to be updated.  
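
The kind of call-site update the description refers to can be sketched roughly 
as follows. This is only an illustration under stated assumptions: the 
Configuration, FileSystem and Writer classes below are simplified stand-ins 
written for this example, not the real Hadoop classes, and openWriter is a 
hypothetical helper. The exact constructor parameters of the real 
org.apache.hadoop.io.MapFile.Writer should be checked against the Hadoop 
version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class MapFileWriterMigration {
    // Simplified stand-in for Hadoop's Configuration (assumption, not the real class).
    static class Configuration {}

    // Simplified stand-in for Hadoop's FileSystem (assumption, not the real class).
    static class FileSystem {}

    // Stand-in for a writer whose constructor now requires a Configuration.
    static class Writer {
        final Configuration conf;
        final FileSystem fs;
        final String dirName;
        final List<String> entries = new ArrayList<>();

        Writer(Configuration conf, FileSystem fs, String dirName) {
            if (conf == null) {
                throw new IllegalArgumentException("conf must not be null");
            }
            this.conf = conf;
            this.fs = fs;
            this.dirName = dirName;
        }

        // Records one key/value pair in memory; the real writer persists to disk.
        void append(String key, String value) {
            entries.add(key + "=" + value);
        }
    }

    // Hypothetical call site. Before the change it would have built the writer
    // from (fs, dirName) alone; after the change the Configuration is passed too.
    static Writer openWriter(Configuration conf, FileSystem fs, String dirName) {
        return new Writer(conf, fs, dirName);
    }

    public static void main(String[] args) {
        Writer w = openWriter(new Configuration(), new FileSystem(), "example-dir");
        w.append("file:/example/a.xml", "fetched");
        System.out.println("entries written: " + w.entries.size());
    }
}
```

In practice this means every caller that constructs a writer needs a 
Configuration in scope, so the change tends to ripple out to the methods that 
create those callers.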

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

