[
https://issues.apache.org/jira/browse/AVRO-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463687#comment-15463687
]
Ryan Blue edited comment on AVRO-1891 at 9/5/16 4:42 PM:
---------------------------------------------------------
I think all this requires is keeping a set of conversions that should be
applied when reading or writing a specific class. Unlike generic, where the
conversions are determined by the data model at runtime, the conversions that
should be used for a specific class are determined at compile time. We have the
benefit of knowing that the compiler either added conversions for all instances
of a logical type, or for none of them. So we only need to know the set of
conversions the compiler had set up when a class was compiled.
Rather than relying on the set of conversions the SpecificData instance has
configured, I think we should keep the set of conversions with the class being
written or read. Then we don't need to change how SpecificData looks up
conversions, just how SpecificDatumReader/Writer look them up, so that they
avoid going through the data model. (I agree with Doug and don't see an
advantage in adding a conversion resolver.)
What about this:
* Maintain a thread-local reference to the current specific record class in
SpecificDatumReader
* Add a static conversion map to each specific class with its conversions
(generated code)
* Add conversion lookup methods to GenericDatumReader that delegate to
GenericData
* Override the conversion lookup methods in SpecificDatumReader to use the
current record class's set of conversions instead
This way, there are no changes to how the data model lookups work, little
generated code (just the per-class conversion map), and few changes to the
datum readers and writers. What do you guys think? I think this would be a bit
smaller patch. I'll try to put it together tomorrow if I have time.
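To make the mechanism concrete, here is a minimal, self-contained model of the lookup flow described above. The types (`Conversion`, `RecordV1`, `SpecificReader`) are plain-Java stand-ins for illustration, not Avro's actual classes: a thread-local tracks the record class currently being read, each generated class carries a static map of the conversions fixed at compile time, and the reader's lookup consults that map instead of the data model.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Avro's Conversion: turns a raw encoded value into a logical value.
interface Conversion { Object fromLong(long raw); }

// What the compiler would generate into each specific class: a static map from
// logical-type name to the conversion that was configured at compile time.
class RecordV1 {
    static final Map<String, Conversion> CONVERSIONS = new HashMap<>();
    static {
        CONVERSIONS.put("timestamp-millis", raw -> "ts:" + raw);
    }
}

class SpecificReader {
    // Thread-local reference to the specific record class currently being read.
    private static final ThreadLocal<Class<?>> CURRENT = new ThreadLocal<>();

    // The overridden lookup: consult the current class's static conversion map
    // rather than the data model.
    static Conversion getConversion(String logicalType) {
        if (CURRENT.get() == RecordV1.class) {
            return RecordV1.CONVERSIONS.get(logicalType);
        }
        return null; // real code would fall back to the data-model lookup here
    }

    static Object read(Class<?> recordClass, String logicalType, long raw) {
        CURRENT.set(recordClass);
        try {
            Conversion c = getConversion(logicalType);
            return c != null ? c.fromLong(raw) : raw;
        } finally {
            CURRENT.remove(); // always clear so the thread-local never leaks
        }
    }
}
```

The point of the thread-local is that the conversion lookup methods sit deep in GenericDatumReader's call tree, so the current record class has to be available without threading it through every method signature.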
> Generated Java code fails with union containing logical type
> ------------------------------------------------------------
>
> Key: AVRO-1891
> URL: https://issues.apache.org/jira/browse/AVRO-1891
> Project: Avro
> Issue Type: Bug
> Components: java, logical types
> Affects Versions: 1.8.1
> Reporter: Ross Black
> Priority: Blocker
> Fix For: 1.8.3
>
> Attachments: AVRO-1891.patch, AVRO-1891.yshi.1.patch,
> AVRO-1891.yshi.2.patch, AVRO-1891.yshi.3.patch, AVRO-1891.yshi.4.patch
>
>
> Example schema:
> {code}
> {
>   "type": "record",
>   "name": "RecordV1",
>   "namespace": "org.brasslock.event",
>   "fields": [
>     { "name": "first", "type": ["null", {"type": "long", "logicalType": "timestamp-millis"}]}
>   ]
> }
> {code}
> The avro compiler generates a field using the relevant joda class:
> {code}
> public org.joda.time.DateTime first
> {code}
> Running the following code to perform encoding:
> {code}
> final RecordV1 record = new RecordV1(DateTime.parse("2016-07-29T10:15:30.00Z"));
> final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema());
> final ByteArrayOutputStream stream = new ByteArrayOutputStream(8192);
> final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
> datumWriter.write(record, encoder);
> encoder.flush();
> final byte[] bytes = stream.toByteArray();
> {code}
> fails with the exception stacktrace:
> {code}
> org.apache.avro.AvroRuntimeException: Unknown datum type org.joda.time.DateTime: 2016-07-29T10:15:30.000Z
> at org.apache.avro.generic.GenericData.getSchemaName(GenericData.java:741)
> at org.apache.avro.specific.SpecificData.getSchemaName(SpecificData.java:293)
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:706)
> at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
> at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
> at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:87)
> at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> at org.brasslock.avro.compiler.GeneratedRecordTest.shouldEncodeLogicalTypeInUnion(GeneratedRecordTest.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:117)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
> at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:253)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:84)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> {code}
> The failure can be fixed by explicitly adding the relevant conversion(s) to DatumWriter / SpecificData:
> {code}
> final RecordV1 record = new RecordV1(DateTime.parse("2007-12-03T10:15:30.00Z"));
> final SpecificData specificData = new SpecificData();
> specificData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
> final DatumWriter<RecordV1> datumWriter = new SpecificDatumWriter<>(record.getSchema(), specificData);
> final ByteArrayOutputStream stream = new ByteArrayOutputStream(AvroUtil.DEFAULT_BUFFER_SIZE);
> final BinaryEncoder encoder = EncoderFactory.get().directBinaryEncoder(stream, null);
> datumWriter.write(record, encoder);
> encoder.flush();
> final byte[] bytes = stream.toByteArray();
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)