[ https://issues.apache.org/jira/browse/NUTCH-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484169#comment-13484169 ]

Julien Nioche commented on NUTCH-1477:
--------------------------------------

Found a clue in https://issues.apache.org/jira/browse/NUTCH-842. Not sure what the point of compile-avro-schema is but we need to compile the schemas with gora and not just avro. The generated classes now compile fine. Using the modified schema fails at compilation as the generated objects don't have accessors e.g. getContentType()

> NPE when injecting with DataFileAvroStore
> -----------------------------------------
>
>                 Key: NUTCH-1477
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1477
>             Project: Nutch
>          Issue Type: Bug
>          Components: storage
>    Affects Versions: 2.1
>         Environment: Java 1.6.0_35
>            Reporter: Mike Baranczak
>            Assignee: Julien Nioche
>            Priority: Critical
>             Fix For: 2.2
>
>         Attachments: webpage.avsc
>
>
> Fresh installation of Nutch 2.1, configured to use DataFileAvroStore.
> Injection job throws NullPointerException, see below. No error when I switch
> to MemStore.
> java.lang.NullPointerException
>     at org.apache.avro.io.BinaryEncoder.writeString(BinaryEncoder.java:133)
>     at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:176)
>     at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:171)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
>     at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
>     at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
>     at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
>     at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
>     at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
>     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>     at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
>     at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira