[ https://issues.apache.org/jira/browse/NUTCH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473064 ]
Armel Nene commented on NUTCH-437:
----------------------------------

I was wondering whether this patch could fix my problem, which is, if not the same, very similar to this one. I am using Nutch 0.8.2-dev. I checked it out from SVN a while ago and have not updated since. I was previously able to crawl 10000 XML files with no errors whatsoever. These are the errors I get when fetching:

INFO parser.custom: Custom-parse: Parsing content file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf
07/02/12 22:09:16 INFO fetcher.Fetcher: fetch of file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf failed with: java.lang.NullPointerException
07/02/12 22:09:17 INFO mapred.LocalJobRunner: 0 pages, 0 errors, 0.0 pages/s, 0 kb/s,
07/02/12 22:09:17 FATAL fetcher.Fetcher: java.lang.NullPointerException
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:198)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:189)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.mapred.MapTask$2.collect(MapTask.java:91)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:314)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:232)
07/02/12 22:09:17 FATAL fetcher.Fetcher: fetcher caught:java.lang.NullPointerException

One problem is that my Hadoop jar is named hadoop-0.4.0-patched. I don't know whether that means I am running version 0.4.0, which is a little confusing. Once you can clarify that for me, I will be able to apply the patch to my version.

Best Regards,
Armel

> MapFile in Hadoop Trunk has changed, must update references
> -----------------------------------------------------------
>
>                 Key: NUTCH-437
>                 URL: https://issues.apache.org/jira/browse/NUTCH-437
>             Project: Nutch
>          Issue Type: Bug
>   Affects Versions: 0.8.2, 0.9.0
>         Environment: windows xp and java
>            Reporter: Dennis Kubes
>         Assigned To: Andrzej Bialecki
>             Fix For: 0.8.2, 0.9.0
>
>         Attachments: nutch-hadoop-0.10.2-mapfile.patch
>
>
> The MapFile.Writer signature has changed in hadoop trunk (version 0.10.x+) to
> include a Configuration object. Objects in the Nutch codebase that reference
> MapFile.Writer will need to be updated.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers