[
https://issues.apache.org/jira/browse/AVRO-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dan Lipofsky updated AVRO-2904:
-------------------------------
Description:
In Avro 1.10.0 the setter for logical-types {{timestamp-millis}} and
{{time-millis}} will truncate the timestamp to millis in the simplest condition
but not when it is in a union with null.
It looks like something similar was addressed in AVRO-2360 but that was
supposedly fixed in 1.9.0 and this is still broken in 1.10.0.
Note: it would appear that in Java8 the {{Instant.now()}} is already truncated
to millis so it usually isn't a problem, but in Java11 you get microsecond
precision (at least on macOS 10.14.6 with recent hardware). However you can see
the bug even in Java8 if you force the microseconds the way I did in my test.
I uploaded by code to [https://github.com/dlipofsky/AVRO-2904]
Here are my schemas
{noformat}
{
"type" : "record",
"name" : "TimestampTest1",
"fields" : [ {
"name" : "time",
"type" : [ "null", { "type" : "long", "logicalType" : "timestamp-millis" } ]
} ]
}
{noformat}
{noformat}
{
"type" : "record",
"name" : "TimestampTest2",
"fields" : [ {
"name" : "time",
"type" : { "type" : "long", "logicalType" : "timestamp-millis" }
} ]
}
{noformat}
Here is my test, in which {{test2}} passes but {{test1}} fails.
{noformat}
public class Avro2904Test {
// If you just used Instant.now() test would randomly pass 1 times out of a
1000.
// If you use Instant.now().truncatedTo(ChronoUnit.MILLIS) it would always
pass.
// By making sure micros is non-zero we ensure it always fails and
demonstrates AVRO-2904.
private Instant now =
Instant.now().truncatedTo(ChronoUnit.MILLIS).plus(999, ChronoUnit.MICROS);
@Test
public void test1() throws IOException {
final TimestampTest1 t =
TimestampTest1.newBuilder().setTime(now).build();
byte[] bytes = serialize(t);
TimestampTest1 other = deserialize(t.getSchema(), bytes);
// fails because of AVRO-2904
assertEquals(t, other);
}
@Test
public void test2() throws IOException {
final TimestampTest2 t =
TimestampTest2.newBuilder().setTime(Instant.now()).build();
byte[] bytes = serialize(t);
TimestampTest2 other = deserialize(t.getSchema(), bytes);
// if logicalType timestamp-millis in not in union with null it works
fine
assertEquals(t, other);
}
private byte[] serialize(SpecificRecord record) throws IOException {
try (final ByteArrayOutputStream out = new ByteArrayOutputStream()) {
SpecificDatumWriter<SpecificRecord> writer = new
SpecificDatumWriter<>(
record.getSchema());
Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
writer.write(record, encoder);
encoder.flush();
final byte[] bytes = out.toByteArray();
return bytes;
}
}
private <T extends SpecificRecord> T deserialize(Schema schema, byte[]
bytes)
throws IOException {
Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
SpecificDatumReader<T> reader = new SpecificDatumReader<>(schema);
return reader.read(null, decoder);
}
}
{noformat}
{{test1}} fails with
{noformat}
java.lang.AssertionError:
Expected :{"time": 2020-07-30T23:14:08.294999Z}
Actual :{"time": 2020-07-30T23:14:08.294Z}
{noformat}
If I change my tests to use {{Instant.now().truncatedTo(ChronoUnit.MILLIS)}}
they'll both pass, but it seems like that should not be necessary, particularly
given the code below.
Looking at the generated Java DTO, I see a few differences, but the most
important would seem to be that the working DTO has
{noformat}
public void setTime(java.time.Instant value) {
this.time = value.truncatedTo(java.time.temporal.ChronoUnit.MILLIS);
}
{noformat}
while the broken one has
{noformat}
public void setTime(java.time.Instant value) {
this.time = value;
}
{noformat}
The {{Builder}} is similar (only the non-union version has {{truncatedTo}} in
the setter).
The working one also has this code, which is missing from the broken one:
{noformat}
private static final org.apache.avro.Conversion<?>[] conversions =
new org.apache.avro.Conversion<?>[] {
new org.apache.avro.data.TimeConversions.TimestampMillisConversion(),
null
};
@Override
public org.apache.avro.Conversion<?> getConversion(int field) {
return conversions[field];
}
{noformat}
was:
In Avro 1.10.0 the setter for logical-type timestamp-millis will truncate the
timestamp to millis in the simplest condition but not when it is in a union
with null.
It looks like something similar was addressed in AVRO-2360 but that was
supposedly fixed in 1.9.0 and this is still broken in 1.10.0.
Note: it would appear that in Java8 the {{Instant.now()}} is already truncated
to millis so it usually isn't a problem, but in Java11 you get microsecond
precision (at least on macOS 10.14.6 with recent hardware). However you can see
the bug even in Java8 if you force the microseconds the way I did in my test.
I uploaded by code to [https://github.com/dlipofsky/AVRO-2904]
Here are my schemas
{noformat}
{
"type" : "record",
"name" : "TimestampTest1",
"fields" : [ {
"name" : "time",
"type" : [ "null", { "type" : "long", "logicalType" : "timestamp-millis" } ]
} ]
}
{noformat}
{noformat}
{
"type" : "record",
"name" : "TimestampTest2",
"fields" : [ {
"name" : "time",
"type" : { "type" : "long", "logicalType" : "timestamp-millis" }
} ]
}
{noformat}
Here is my test, in which {{test2}} passes but {{test1}} fails.
{noformat}
public class Avro2904Test {
// If you just used Instant.now() test would randomly pass 1 times out of a
1000.
// If you use Instant.now().truncatedTo(ChronoUnit.MILLIS) it would always
pass.
// By making sure micros is non-zero we ensure it always fails and
demonstrates AVRO-2904.
private Instant now =
Instant.now().truncatedTo(ChronoUnit.MILLIS).plus(999, ChronoUnit.MICROS);
@Test
public void test1() throws IOException {
final TimestampTest1 t =
TimestampTest1.newBuilder().setTime(now).build();
byte[] bytes = serialize(t);
TimestampTest1 other = deserialize(t.getSchema(), bytes);
// fails because of AVRO-2904
assertEquals(t, other);
}
@Test
public void test2() throws IOException {
final TimestampTest2 t =
TimestampTest2.newBuilder().setTime(Instant.now()).build();
byte[] bytes = serialize(t);
TimestampTest2 other = deserialize(t.getSchema(), bytes);
// if logicalType timestamp-millis in not in union with null it works
fine
assertEquals(t, other);
}
private byte[] serialize(SpecificRecord record) throws IOException {
try (final ByteArrayOutputStream out = new ByteArrayOutputStream()) {
SpecificDatumWriter<SpecificRecord> writer = new
SpecificDatumWriter<>(
record.getSchema());
Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
writer.write(record, encoder);
encoder.flush();
final byte[] bytes = out.toByteArray();
return bytes;
}
}
private <T extends SpecificRecord> T deserialize(Schema schema, byte[]
bytes)
throws IOException {
Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
SpecificDatumReader<T> reader = new SpecificDatumReader<>(schema);
return reader.read(null, decoder);
}
}
{noformat}
{{test1}} fails with
{noformat}
java.lang.AssertionError:
Expected :{"time": 2020-07-30T23:14:08.294999Z}
Actual :{"time": 2020-07-30T23:14:08.294Z}
{noformat}
If I change my tests to use {{Instant.now().truncatedTo(ChronoUnit.MILLIS)}}
they'll both pass, but it seems like that should not be necessary, particularly
given the code below.
Looking at the generated Java DTO, I see a few differences, but the most
important would seem to be that the working DTO has
{noformat}
public void setTime(java.time.Instant value) {
this.time = value.truncatedTo(java.time.temporal.ChronoUnit.MILLIS);
}
{noformat}
while the broken one has
{noformat}
public void setTime(java.time.Instant value) {
this.time = value;
}
{noformat}
The {{Builder}} is similar (only the non-union version has {{truncatedTo}} in
the setter).
The working one also has this code, which is missing from the broken one:
{noformat}
private static final org.apache.avro.Conversion<?>[] conversions =
new org.apache.avro.Conversion<?>[] {
new org.apache.avro.data.TimeConversions.TimestampMillisConversion(),
null
};
@Override
public org.apache.avro.Conversion<?> getConversion(int field) {
return conversions[field];
}
{noformat}
> timestamp-millis doesn't truncate when in a union with null
> -----------------------------------------------------------
>
> Key: AVRO-2904
> URL: https://issues.apache.org/jira/browse/AVRO-2904
> Project: Apache Avro
> Issue Type: Bug
> Components: java, logical types
> Affects Versions: 1.10.0
> Reporter: Dan Lipofsky
> Priority: Major
>
> In Avro 1.10.0 the setter for logical-types {{timestamp-millis}} and
> {{time-millis}} will truncate the timestamp to millis in the simplest
> condition but not when it is in a union with null.
> It looks like something similar was addressed in AVRO-2360 but that was
> supposedly fixed in 1.9.0 and this is still broken in 1.10.0.
> Note: it would appear that in Java8 the {{Instant.now()}} is already
> truncated to millis so it usually isn't a problem, but in Java11 you get
> microsecond precision (at least on macOS 10.14.6 with recent hardware).
> However you can see the bug even in Java8 if you force the microseconds the
> way I did in my test.
> I uploaded by code to [https://github.com/dlipofsky/AVRO-2904]
> Here are my schemas
> {noformat}
> {
> "type" : "record",
> "name" : "TimestampTest1",
> "fields" : [ {
> "name" : "time",
> "type" : [ "null", { "type" : "long", "logicalType" : "timestamp-millis"
> } ]
> } ]
> }
> {noformat}
> {noformat}
> {
> "type" : "record",
> "name" : "TimestampTest2",
> "fields" : [ {
> "name" : "time",
> "type" : { "type" : "long", "logicalType" : "timestamp-millis" }
> } ]
> }
> {noformat}
> Here is my test, in which {{test2}} passes but {{test1}} fails.
> {noformat}
> public class Avro2904Test {
> // If you just used Instant.now() test would randomly pass 1 times out of
> a 1000.
> // If you use Instant.now().truncatedTo(ChronoUnit.MILLIS) it would
> always pass.
> // By making sure micros is non-zero we ensure it always fails and
> demonstrates AVRO-2904.
> private Instant now =
> Instant.now().truncatedTo(ChronoUnit.MILLIS).plus(999, ChronoUnit.MICROS);
> @Test
> public void test1() throws IOException {
> final TimestampTest1 t =
> TimestampTest1.newBuilder().setTime(now).build();
> byte[] bytes = serialize(t);
> TimestampTest1 other = deserialize(t.getSchema(), bytes);
> // fails because of AVRO-2904
> assertEquals(t, other);
> }
> @Test
> public void test2() throws IOException {
> final TimestampTest2 t =
> TimestampTest2.newBuilder().setTime(Instant.now()).build();
> byte[] bytes = serialize(t);
> TimestampTest2 other = deserialize(t.getSchema(), bytes);
> // if logicalType timestamp-millis in not in union with null it works
> fine
> assertEquals(t, other);
> }
> private byte[] serialize(SpecificRecord record) throws IOException {
> try (final ByteArrayOutputStream out = new ByteArrayOutputStream()) {
> SpecificDatumWriter<SpecificRecord> writer = new
> SpecificDatumWriter<>(
> record.getSchema());
> Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
> writer.write(record, encoder);
> encoder.flush();
> final byte[] bytes = out.toByteArray();
> return bytes;
> }
> }
> private <T extends SpecificRecord> T deserialize(Schema schema, byte[]
> bytes)
> throws IOException {
> Decoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
> SpecificDatumReader<T> reader = new SpecificDatumReader<>(schema);
> return reader.read(null, decoder);
> }
> }
> {noformat}
> {{test1}} fails with
> {noformat}
> java.lang.AssertionError:
> Expected :{"time": 2020-07-30T23:14:08.294999Z}
> Actual :{"time": 2020-07-30T23:14:08.294Z}
> {noformat}
> If I change my tests to use {{Instant.now().truncatedTo(ChronoUnit.MILLIS)}}
> they'll both pass, but it seems like that should not be necessary,
> particularly given the code below.
> Looking at the generated Java DTO, I see a few differences, but the most
> important would seem to be that the working DTO has
> {noformat}
> public void setTime(java.time.Instant value) {
> this.time = value.truncatedTo(java.time.temporal.ChronoUnit.MILLIS);
> }
> {noformat}
> while the broken one has
> {noformat}
> public void setTime(java.time.Instant value) {
> this.time = value;
> }
> {noformat}
> The {{Builder}} is similar (only the non-union version has {{truncatedTo}} in
> the setter).
> The working one also has this code, which is missing from the broken one:
> {noformat}
> private static final org.apache.avro.Conversion<?>[] conversions =
> new org.apache.avro.Conversion<?>[] {
> new org.apache.avro.data.TimeConversions.TimestampMillisConversion(),
> null
> };
> @Override
> public org.apache.avro.Conversion<?> getConversion(int field) {
> return conversions[field];
> }
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)