Guys,

Any suggestions for the issue below? It would be great to be able to use
the DataFileAvroStore from Nutch.

Thanks

Julien

---------- Forwarded message ----------
From: Julien Nioche (JIRA) <[email protected]>
Date: 25 October 2012 15:45
Subject: [jira] [Commented] (NUTCH-1477) NPE when injecting with
DataFileAvroStore
To: [email protected]

    [ https://issues.apache.org/jira/browse/NUTCH-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484169#comment-13484169 ]

Julien Nioche commented on NUTCH-1477:
--------------------------------------

Found a clue in https://issues.apache.org/jira/browse/NUTCH-842. Not sure
what the point of compile-avro-schema is, but we need to compile the schemas
with Gora and not just Avro. The generated classes now compile fine.

Using the modified schema fails at compilation, as the generated objects
don't have accessors, e.g. getContentType().



> NPE when injecting with DataFileAvroStore
> -----------------------------------------
>
>                 Key: NUTCH-1477
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1477
>             Project: Nutch
>          Issue Type: Bug
>          Components: storage
>    Affects Versions: 2.1
>         Environment: Java 1.6.0_35
>            Reporter: Mike Baranczak
>            Assignee: Julien Nioche
>            Priority: Critical
>             Fix For: 2.2
>
>         Attachments: webpage.avsc
>
>
> Fresh installation of Nutch 2.1, configured to use DataFileAvroStore.
> Injection job throws NullPointerException, see below. No error when I
> switch to MemStore.
> java.lang.NullPointerException
>       at org.apache.avro.io.BinaryEncoder.writeString(BinaryEncoder.java:133)
>       at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:176)
>       at org.apache.avro.generic.GenericDatumWriter.writeString(GenericDatumWriter.java:171)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:89)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:55)
>       at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:245)
>       at org.apache.gora.avro.store.DataFileAvroStore.put(DataFileAvroStore.java:54)
>       at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:60)
>       at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639)
>       at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>       at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:185)
>       at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:85)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA
administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
