Hi,
Thanks for the suggestions! I tried changing the default to 0.0 instead of
"0.0", with no success. Please note that I am on Avro 1.2, as there is an
incompatibility between Hadoop 0.20 and newer versions of Avro.
How I (de-)serialize the object may point to the answer. I read the Avro
instance directly from an InputStream. The data in the stream was serialized
using the following code:
public static void store(final SpecificRecord m, final OutputStream out)
    throws IOException {
  final SpecificDatumWriter datumWriter =
      new SpecificDatumWriter(m.getSchema());
  final BinaryEncoder enc = new BinaryEncoder(out);
  datumWriter.write(m, enc);
  enc.flush();
}
I read from the stream using:
public static SpecificRecord load(final InputStream in) throws IOException {
  final SpecificDatumReader reader =
      new SpecificDatumReader(THECLASS._SCHEMA);
  final BinaryDecoder decoder = new BinaryDecoder(in);
  return (SpecificRecord) reader.read(null, decoder);
}
Presumably, this does not serialize the schema with the data, correct? That
would explain the problem: the reader only knows the new schema and has no way
to tell that the old data lacks the new field. I know that Avro data files do
serialize the schema at the beginning. Is there a similar tool for writing to
streams?
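
To illustrate why a missing schema would explain the EOFException, here is a
minimal sketch using only java.io. It is an analogy, not Avro's actual binary
encoding; the class name, the field layout (an int and a double, plus the new
"bias" double) and the values are made up for illustration. The bytes carry no
field tags, so a reader that expects one more double simply runs off the end
of the data:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class UntaggedStreamDemo {

    // Writes a record with the OLD layout: one int followed by one double.
    // No schema and no field tags are written, only the raw values.
    static byte[] writeOldRecord(int id, double weight) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(id);
        out.writeDouble(weight);
        out.flush();
        return buf.toByteArray();
    }

    // Reads with the NEW layout, which expects an extra trailing double
    // (the hypothetical "bias" field). Returns true if the reader hit the
    // end of the data before it found that field.
    static boolean readWithNewLayoutHitsEof(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            in.readInt();
            in.readDouble();
            in.readDouble(); // "bias": never written by the old code
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] oldBytes = writeOldRecord(7, 1.5);
        System.out.println("EOF with new layout: "
                + readWithNewLayoutHitsEof(oldBytes));
    }
}
```

The reader cannot apply a default here because nothing in the bytes tells it
that the field is absent.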
Thanks,
Markus
On 8/2/10 6:01 PM, "Scott Carey" wrote:
How was this GenericDatumReader constructed? Is it used to read from an Avro
file or from something else?
Note that you may have to set the "expected" schema separately from the actual
schema. Avro needs to know what the schema was when the data was written; in
an Avro data file that schema is persisted with the data and set automatically
when the file is read.
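
As a rough sketch of that idea (plain java.io, not Avro's file format or API;
the class name, the header layout and the all-double fields are assumptions
made for illustration): the writer puts its own field list in front of the
values, and the reader uses that list to decide which values are present and
which expected fields fall back to their defaults.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SchemaHeaderDemo {

    // Writes the writer's field names first (the "actual" schema),
    // then one double per field in the same order.
    static byte[] write(LinkedHashMap<String, Double> record) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(record.size());
        for (String name : record.keySet()) {
            out.writeUTF(name);
        }
        for (double value : record.values()) {
            out.writeDouble(value);
        }
        out.flush();
        return buf.toByteArray();
    }

    // Reads the header to learn what was written, then resolves it against
    // the "expected" fields: known fields take the stored value, fields the
    // writer did not know keep their default.
    static Map<String, Double> read(byte[] data, Map<String, Double> expectedDefaults)
            throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int fieldCount = in.readInt();
        List<String> writerFields = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            writerFields.add(in.readUTF());
        }
        Map<String, Double> result = new LinkedHashMap<>(expectedDefaults);
        for (String name : writerFields) {
            double value = in.readDouble();
            if (result.containsKey(name)) {
                result.put(name, value); // values for unexpected fields are dropped
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        LinkedHashMap<String, Double> oldRecord = new LinkedHashMap<>();
        oldRecord.put("id", 7.0);
        oldRecord.put("weight", 1.5);

        Map<String, Double> expected = new LinkedHashMap<>();
        expected.put("id", 0.0);
        expected.put("weight", 0.0);
        expected.put("bias", 0.0); // new field with its default

        System.out.println(read(write(oldRecord), expected));
    }
}
```

Without the header, the reader in this sketch would have no basis for
deciding that "bias" is missing, which is the situation with a bare binary
stream.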
On Aug 2, 2010, at 4:28 PM, Markus Weimer wrote:
> Hi,
>
> I added the following line to a schema, recreated the static java classes
> for it and compiled my code:
>
> {"name": "bias", "type":"double", "default":"0.0"}
>
> When I now try to read a file written before the change, I get an error:
>
> Exception in thread "main" java.io.EOFException
>   at org.apache.avro.io.BinaryDecoder.readDouble(BinaryDecoder.java:154)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:82)
>   at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:273)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:74)
>   at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:154)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:72)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:61)
>
>
> I assumed that it would just return 0.0 for the fields not present in the
> file. Is this a bug on my end?
>
> Thanks,
>
> Markus
>