[
https://issues.apache.org/jira/browse/AVRO-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Federico Ragona updated AVRO-2885:
----------------------------------
Description:
I defined a schema with an {{int}} field and am using the Java
{{GenericDatumReader}} to read records. I have noticed that the reader also
accepts numbers with decimal digits, and that the resulting record contains
those numbers coerced to integer values. I have observed the same behaviour
both when decoding from JSON and from binary. Here is a runnable example:
{code:java}
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;

public class AvroIntTest {
    // Decodes a JSON-encoded record, using the same schema as writer and reader.
    public static GenericRecord readFromJson(Schema schema, String record) throws Exception {
        GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(schema, schema);
        Decoder decoder = DecoderFactory.get().jsonDecoder(schema, record);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws Exception {
        Schema schema = SchemaBuilder
            .builder("test")
            .record("example")
            .fields()
            .requiredInt("id")
            .endRecord();
        // The value is not a valid int, yet decoding succeeds.
        String record = "{ \"id\": -1.2 }";
        System.out.println(readFromJson(schema, record)); // prints: { "id": -1 }
    }
}
{code}
The schema generated by the builder looks like this:
{code}
{
"type" : "record",
"name" : "example",
"namespace" : "test",
"fields" : [ {
"name" : "id",
"type" : "int"
} ]
}
{code}
I would expect the reader to fail because the type of the value doesn't match
the type of the field, but instead it succeeds "silently", converting {{-1.2}}
to {{-1}}. Is this behaviour intended, or am I doing something wrong?
Edit: Digging into it further, I think that the observed behaviour is an effect
of this line:
https://github.com/apache/avro/blob/release-1.8.2/lang/java/avro/src/main/java/org/apache/avro/io/JsonDecoder.java#L168
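To illustrate the difference between the observed and the expected behaviour without depending on Avro itself, here is a minimal sketch using only the JDK. It assumes that the lenient path amounts to a narrowing conversion of the parsed number, which is consistent with {{-1.2}} becoming {{-1}} but is not taken from the Avro source. The class and method names ({{StrictIntCheck}}, {{lenientInt}}, {{strictInt}}) are hypothetical.

```java
import java.math.BigDecimal;

public class StrictIntCheck {
    // Lenient (assumed model of the observed behaviour): parse the number
    // and narrow it to int, which silently drops any fractional part.
    static int lenientInt(String jsonNumber) {
        return (int) Double.parseDouble(jsonNumber);
    }

    // Strict (the behaviour the report expects): intValueExact() throws
    // ArithmeticException if the value has a nonzero fractional part
    // or does not fit in an int.
    static int strictInt(String jsonNumber) {
        return new BigDecimal(jsonNumber).intValueExact();
    }

    public static void main(String[] args) {
        System.out.println(lenientInt("-1.2")); // prints: -1
        System.out.println(strictInt("-1"));    // prints: -1
        try {
            strictInt("-1.2");
        } catch (ArithmeticException e) {
            System.out.println("rejected: -1.2");
        }
    }
}
```

A check along the lines of {{strictInt}} could also be applied by callers to their input before decoding, as a workaround on affected versions.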
A similar test with the JS implementation seems to confirm that it's not
intended behavior:
{code}
var avro = require('avro-js');
// The schema string must stay on one line: a single-quoted JS string cannot span lines.
var schema = avro.parse('{ "type" : "record", "name" : "example", "namespace" : "test", "fields" : [ { "name" : "id", "type" : "int" } ] }');
var data = {id: -1.2};
schema.toBuffer(data);
{code}
When run, this fails with:
{code}
Error: invalid "int": -1.2
{code}
> It is possible to provide a number with decimal digits in an int field
> ----------------------------------------------------------------------
>
> Key: AVRO-2885
> URL: https://issues.apache.org/jira/browse/AVRO-2885
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.2, 1.10.0, 1.9.2
> Reporter: Federico Ragona
> Priority: Major