[
https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463687#comment-15463687
]
Ryan Blue edited comment on AVRO-1891 at 9/5/16 4:42 PM:
---------------------------------------------------------
I think all this requires is keeping a set of conversions that should be
applied when reading or writing a specific class. Unlike generic, where the
conversions are determined by the data model at runtime, the conversions that
should be used for a specific class are determined at compile time. We have the
benefit of knowing that the compiler either added conversions for all instances
of a logical type, or for none of them. So we only need to know the set of
conversions the compiler had set up when a class was compiled.
Rather than relying on the set of conversions the SpecificData instance has
configured, I think we should keep the set of conversions with the class being
written or read. Then we don't need to change how SpecificData looks up
conversions, just how SpecificDatumReader/Writer look them up, so that they
avoid going through the data model. (I agree with Doug and don't see an
advantage in adding a conversion resolver.)
What about this:
* Maintain a thread-local reference to the current specific record class in
SpecificDatumReader
* Add a static conversion map to each specific class with its conversions
(generated code)
* Add conversion lookup methods to GenericDatumReader that delegate to
GenericData
* Override the conversion lookup methods in SpecificDatumReader to use the
current record class's set of conversions instead
This way, there are no changes to how the data model lookups work, little
generated code (just the per-class conversion map), and few changes to the
datum readers and writers. What do you guys think? I think this would be a bit
smaller patch. I'll try to put it together tomorrow if I have time.
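To make the mechanism concrete, here is a minimal, self-contained model of the lookup flow described above. The types (`Conversion`, `RecordV1`, `SpecificReader`) are plain-Java stand-ins for illustration, not Avro's actual classes: a thread-local tracks the record class currently being read, each generated class carries a static map of the conversions fixed at compile time, and the reader's lookup consults that map instead of the data model.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Avro's Conversion: turns a raw encoded value into a logical value.
interface Conversion { Object fromLong(long raw); }

// What the compiler would generate into each specific class: a static map from
// logical-type name to the conversion that was configured at compile time.
class RecordV1 {
    static final Map<String, Conversion> CONVERSIONS = new HashMap<>();
    static {
        CONVERSIONS.put("timestamp-millis", raw -> "ts:" + raw);
    }
}

class SpecificReader {
    // Thread-local reference to the specific record class currently being read.
    private static final ThreadLocal<Class<?>> CURRENT = new ThreadLocal<>();

    // The overridden lookup: consult the current class's static conversion map
    // rather than the data model.
    static Conversion getConversion(String logicalType) {
        if (CURRENT.get() == RecordV1.class) {
            return RecordV1.CONVERSIONS.get(logicalType);
        }
        return null; // real code would fall back to the data-model lookup here
    }

    static Object read(Class<?> recordClass, String logicalType, long raw) {
        CURRENT.set(recordClass);
        try {
            Conversion c = getConversion(logicalType);
            return c != null ? c.fromLong(raw) : raw;
        } finally {
            CURRENT.remove(); // always clear so the thread-local never leaks
        }
    }
}
```

The point of the thread-local is that the conversion lookup methods sit deep in GenericDatumReader's call tree, so the current record class has to be available without threading it through every method signature.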
> Generated Java code fails with union containing logical type
> ------------------------------------------------------------
>
> Key: AVRO-1891
> URL: https://issues.apache.org/jira/browse/AVRO-1891
> Project: Avro
> Issue Type: Bug
> Components: java, logical types
> Affects Versions: 1.8.1
> Reporter: Ross Black
> Priority: Blocker
> Fix For: 1.8.3
>
> Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch,
> AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch
>
>
> Example schema:
> {code}
> {
>   "type": "record",
>   "name": "RecordV1",
>   "namespace": "org.brasslock.event",
>   "fields": [
>     { "name": "first", "type": ["null", {"type": "long", "logicalType": "timestamp-millis"}]}
>   ]
> }
> {code}
> The avro compiler generates a field using the relevant joda class:
> {code}
> public org.joda.time.DateTime first
> {code}
> Running the following code to perform encoding:
> {code}
> final RecordV1 record = new RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z"));
> final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema());
> final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192);
> final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
> datumWriter.write(record, encoder);
> encoder.flush();
> final byte[] bytes = stream.toByteArray();
> {code}
> fails with the exception stacktrace:
> {code}
> org.apache.avro.AvroRuntimeException: Unknown datum type org.joda.time.DateTime: 2016-07-29T10:15:30.000Z
> at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741)
> at org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293)
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706)
> at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
> at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
> at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87)
> at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> at org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
> at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:253)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> {code}
> The failure can be fixed by explicitly adding the relevant conversion(s) to DatumWriter / SpecificData:
> {code}
> final RecordV1 record = new RecordV1(DateTime.parse("2007-12-03T10:15:30.00Z"));
> final SpecificData specificData = new SpecificData();
> specificData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
> final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema(), specificData);
> final ByteArrayOutputStream stream = new ByteArrayOutputStream(AvroUtil.DEFAULT_BUFFER_SIZE);
> final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
> datumWriter.write(record, encoder);
> encoder.flush();
> final byte[] bytes = stream.toByteArray();
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)