[
https://issues.apache.org/jira/browse/AVRO-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sachin Goyal updated AVRO-1554:
-------------------------------
Attachment: AVRO-1554_2.patch
{quote}
That works if the values are not null, but if they're null it fails.
The bug is with AllowNull, since it changes the schema to be a union, which has
a different encoding. Custom encodings are associated with the field's type and
know nothing of the union that AllowNull has inserted. So allowNull() should
perhaps override getFieldAccessor() and wrap the value of
super.getFieldAccessor() with an implementation that handles unions with null.
{quote}
Great catch! I have fixed this in the new patch.
However, I could not override getFieldAccessor() because the corresponding
*Field* object is not available there. Instead, I added the methods
ReflectDatumReader#readFieldWithAccessor() and
ReflectDatumWriter#writeFieldWithAccessor(). Please let me know what you think
of this approach.
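To illustrate the union problem described in the quote, here is a minimal,
self-contained sketch of wrapping a custom encoding so that it also handles a
union with null. It does not use the Avro API and is not the code in the patch:
ValueWriter, UUID_WRITER and nullable() are hypothetical names, and it assumes
that the union is ["null", T], so null is branch 0 and the value is branch 1,
each written as a zigzag varint index before the payload.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class NullableEncodingSketch {

    /** Hypothetical stand-in for a custom encoding; not an Avro type. */
    interface ValueWriter<T> {
        void write(T value, ByteArrayOutputStream out) throws IOException;
    }

    /** Writes a UUID as a length-prefixed UTF-8 string. Knows nothing of unions. */
    static final ValueWriter<UUID> UUID_WRITER = (value, out) -> {
        byte[] utf8 = value.toString().getBytes(StandardCharsets.UTF_8);
        // A UUID string is 36 chars, so the zigzag length (72) fits in one varint byte.
        out.write(utf8.length << 1);
        out.write(utf8, 0, utf8.length);
    };

    /** Wraps an encoding so it first writes the union branch index (assumed ["null", T]). */
    static <T> ValueWriter<T> nullable(ValueWriter<T> inner) {
        return (value, out) -> {
            if (value == null) {
                out.write(0);            // branch 0 = null, no payload follows
            } else {
                out.write(2);            // branch 1, zigzag-encoded as 2
                inner.write(value, out); // delegate to the wrapped custom encoding
            }
        };
    }

    public static void main(String[] args) throws IOException {
        ValueWriter<UUID> writer = nullable(UUID_WRITER);

        ByteArrayOutputStream nullOut = new ByteArrayOutputStream();
        writer.write(null, nullOut);
        System.out.println("null value bytes: " + nullOut.size());

        ByteArrayOutputStream valueOut = new ByteArrayOutputStream();
        writer.write(UUID.randomUUID(), valueOut);
        System.out.println("non-null value bytes: " + valueOut.size());
    }
}
```

The wrapped encoding stays unaware of the union; only the wrapper decides which
branch index to write, which is the separation the quoted comment asks for.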
\\
\\
{quote}
Also, if the DatumWriter is passed an AllowNull, shouldn't the DatumReader be
passed one too?
And, yes, I don't think we should add AvroConfiguration in this patch.
{quote}
AvroConfiguration has been removed.
\\
\\
I also parameterized the test so that it runs with ReflectData.AllowNull as
well as with plain ReflectData.
> Avro should have support for common constructs like UUID and Date
> -----------------------------------------------------------------
>
> Key: AVRO-1554
> URL: https://issues.apache.org/jira/browse/AVRO-1554
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.6
> Reporter: Sachin Goyal
> Attachments: AVRO-1554.patch, AVRO-1554_2.patch,
> CustomEncodingUnionBug.zip
>
>
> Consider the following code:
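> {code}
> import java.io.ByteArrayOutputStream;
> import java.math.BigInteger;
> import java.util.Date;
> import java.util.UUID;
> import org.apache.avro.Schema;
> import org.apache.avro.file.DataFileReader;
> import org.apache.avro.file.DataFileWriter;
> import org.apache.avro.file.SeekableByteArrayInput;
> import org.apache.avro.generic.GenericDatumReader;
> import org.apache.avro.generic.GenericRecord;
> import org.apache.avro.reflect.ReflectData;
> import org.apache.avro.reflect.ReflectDatumWriter;
>
> public class AvroExample
> {
>   public static void main (String [] args) throws Exception
>   {
>     // Derive the schema by reflection, allowing every field to be null.
>     ReflectData rdata = ReflectData.AllowNull.get();
>     Schema schema = rdata.getSchema(Temp.class);
>
>     // Write a single Temp record to an in-memory Avro data file.
>     ReflectDatumWriter<Temp> datumWriter =
>         new ReflectDatumWriter<Temp> (Temp.class, rdata);
>     DataFileWriter<Temp> fileWriter =
>         new DataFileWriter<Temp> (datumWriter);
>     ByteArrayOutputStream baos = new ByteArrayOutputStream();
>     fileWriter.create(schema, baos);
>     fileWriter.append(new Temp());
>     fileWriter.close();
>     byte[] bytes = baos.toByteArray();
>
>     // Read the record back generically, using the schema stored in the file.
>     GenericDatumReader<GenericRecord> datumReader =
>         new GenericDatumReader<GenericRecord> ();
>     SeekableByteArrayInput avroInputStream =
>         new SeekableByteArrayInput(bytes);
>     DataFileReader<GenericRecord> fileReader =
>         new DataFileReader<GenericRecord>(avroInputStream,
>                                           datumReader);
>     schema = fileReader.getSchema();
>     GenericRecord record = null;
>     record = fileReader.next(record);
>     System.out.println (record);
>     System.out.println (record.get("id"));
>   }
> }
>
> class Temp
> {
>   UUID id = UUID.randomUUID();
>   Date date = new Date();
>   BigInteger bi = BigInteger.TEN;
> }
> {code}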
> Output from this code is:
> {code:javascript}
> {"id": {}, "date": {}, "bi": "10"}
> {code}
> Fields of type UUID and Date are very common in Java and appear frequently
> in third-party code as well (where adding annotations may be difficult).
> So Avro should include default serialization/deserialization support for
> such fields.
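As a rough illustration of what such default support could map these types to,
the sketch below round-trips a UUID through its string form and a Date through
its epoch milliseconds. This mapping is an assumption made for the example, not
necessarily what the attached patches implement, and the class and method names
are hypothetical.

```java
import java.util.Date;
import java.util.UUID;

public class DefaultMappingSketch {

    // Assumed mapping: UUID <-> canonical string form.
    static String encodeUuid(UUID id) {
        return id.toString();
    }

    static UUID decodeUuid(String text) {
        return UUID.fromString(text);
    }

    // Assumed mapping: Date <-> milliseconds since the epoch.
    static long encodeDate(Date date) {
        return date.getTime();
    }

    static Date decodeDate(long millis) {
        return new Date(millis);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        Date date = new Date();

        // Both round trips preserve the original value.
        System.out.println(id.equals(decodeUuid(encodeUuid(id))));
        System.out.println(date.equals(decodeDate(encodeDate(date))));
    }
}
```

Both target representations (string and long) are ones that a generic reader
can print directly, unlike the empty {} records shown in the output above.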