[ https://issues.apache.org/jira/browse/NUTCH-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484148#comment-13484148 ]
Julien Nioche commented on NUTCH-1477:
--------------------------------------

I found in http://mail-archives.apache.org/mod_mbox/avro-user/200910.mbox/%3c4ae78503.50...@apache.org%3E that we probably need to explicitly allow for null values in the schema (see attachment).

I tried recompiling the schemas with {{ant compile-avro-schema}}, but the classes generated do not compile and are nowhere near as complete as the original ones. More worryingly, the same is true with the original schema. I assumed that the code in org.apache.nutch.storage could be generated from the schemas. Any idea?

> NPE when injecting with DataFileAvroStore
> -----------------------------------------
>
>                 Key: NUTCH-1477
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1477
>             Project: Nutch
>          Issue Type: Bug
>          Components: storage
>    Affects Versions: 2.1
>         Environment: Java 1.6.0_35
>            Reporter: Mike Baranczak
>            Assignee: Julien Nioche
>             Fix For: 2.2
>
>         Attachments: webpage.avsc
>
>
> Fresh installation of Nutch 2.1, configured to use DataFileAvroStore.
> Injection job throws NullPointerException, see below. No error when I switch
> to MemStore.
> java.lang.NullPointerException
>     at org.apache.avro.io.BinaryEncoder.writeString(BinaryEncoder.java:133)
>     at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:176)
>     at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:171)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
>     at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
>     at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
>     at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>     at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
>     at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
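
As a rough illustration of the null-value point in the comment: in an Avro schema, a field is made nullable by declaring its type as a union that includes "null". The fragment below is only a sketch. The record name Example and the field name exampleField are hypothetical and are not taken from the attached webpage.avsc.

```json
{
  "type": "record",
  "name": "Example",
  "doc": "Hypothetical record, for illustration only; not the Nutch WebPage schema.",
  "fields": [
    {
      "name": "exampleField",
      "doc": "Hypothetical nullable string field.",
      "type": ["null", "string"],
      "default": null
    }
  ]
}
```

The NPE in BinaryEncoder.writeString is consistent with a null value being written for a field declared as a plain "string". With a union, the writer can encode the "null" branch instead. Because the default is null, "null" is listed first in the union, since Avro matches a union's default value against the first branch.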