[jira] [Comment Edited] (AVRO-2137) avro JsonDecoding additional field in array type

2019-02-28 Thread donald cestnik (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781101#comment-16781101
 ] 

donald cestnik edited comment on AVRO-2137 at 2/28/19 11:42 PM:


[~zolyfarkas] I've run a few tests locally with your fork. For the above 
examples it does correct the issue mentioned, but it still errors if the 
additional field is of union type. A common scenario affected would be adding 
a new version of a schema containing an additional field unioned with null, 
with a default value of null. 

 
{code:java}
{
  "items": [
    {
      "name": "dallas",
      "state": "TX",
      "country": {
        "string": "USA"
      }
    }
  ],
  "firstname": "fname",
  "lastname": "lname"
}
{code}
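
To make the union scenario concrete, the newer version of the inner items 
record would look roughly like the sketch below. This is an assumption based 
on the description above, not a schema taken from the issue: country is 
declared as a union of null and string with a default of null. Avro's JSON 
encoding wraps a non-null union value in an object keyed by the branch type, 
which is why the data above carries "country": {"string": "USA"}. A reader 
still on the older schema therefore has to skip a nested object rather than a 
plain string value.

{code:json}
{
  "type": "record",
  "name": "items",
  "fields": [
    { "name": "name", "type": "string" },
    { "name": "state", "type": "string" },
    { "name": "country", "type": ["null", "string"], "default": null }
  ]
}
{code}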


was (Author: dcestnik):
[~zolyfarkas] I've ran a few tests locally with your fork and for the above 
examples it does correct the issue mentioned but still errors if the additional 
field is of union type. Common scenario touched would be adding a new version 
with an additional field, unioned with null and default value of null. 

 
{code:java}
{
  "items": [
    {
      "name": "dallas",
      "state": "TX",
      "country": {
        "string": "USA"
      }
    }
  ],
  "firstname": "fname",
  "lastname": "lname"
}
{code}

> avro JsonDecoding additional field in array type
> 
>
> Key: AVRO-2137
> URL: https://issues.apache.org/jira/browse/AVRO-2137
> Project: Apache Avro
>  Issue Type: Bug
>  Components: java
>Affects Versions: 1.8.1
>Reporter: Arun sethia
>Priority: Major
>
> I have the following Avro schema:
> {code:json}
> {
>   "type": "record",
>   "name": "test",
>   "namespace": "test.name",
>   "fields": [
>     {
>       "name": "items",
>       "type": {
>         "type": "array",
>         "items": {
>           "type": "record",
>           "name": "items",
>           "fields": [
>             {
>               "name": "name",
>               "type": "string"
>             },
>             {
>               "name": "state",
>               "type": "string"
>             }
>           ]
>         }
>       }
>     },
>     {
>       "name": "firstname",
>       "type": "string"
>     }
>   ]
> }
> {code}
> When I use the JSON decoder and the Avro binary encoder to encode JSON data 
> (Scala code):
>  {code:scala}
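> import java.io.ByteArrayOutputStream
> import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter, GenericRecord}
> import org.apache.avro.io.{DecoderFactory, EncoderFactory, JsonDecoder}
>
> // schema (org.apache.avro.Schema) and json (String) are assumed to be defined elsewhere
> val writer = new GenericDatumWriter[GenericRecord](schema)
> val reader = new GenericDatumReader[GenericRecord](schema)
> val baos = new ByteArrayOutputStream
> val decoder: JsonDecoder = DecoderFactory.get.jsonDecoder(schema, json)
> val encoder = EncoderFactory.get.binaryEncoder(baos, null)
> // Decode the JSON into a generic record, then re-encode it as Avro binary
> val datum = reader.read(null, decoder)
> writer.write(datum, encoder)
> encoder.flush()
> val avroByteArray = baos.toByteArray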
>  {code}
> *scenario1:* when I pass the following JSON to encode, it works fine:
> {code:json}
> {
>   "items": [
>     {
>       "name": "dallas",
>       "state": "TX"
>     }
>   ],
>   "firstname": "arun"
> }
> {code}
> *scenario2:* when I pass an additional attribute in the JSON at root level 
> (lastname), it is able to encode and works fine:
> {code:json}
> {
>   "items": [
>     {
>       "name": "dallas",
>       "state": "TX"
>     }
>   ],
>   "firstname": "fname",
>   "lastname": "lname"
> }
> {code}
> *scenario3:* when I add an additional attribute in the array record 
> (country), it throws the following exception:
> {code:scala}
> Expected record-end. Got FIELD_NAME
> org.apache.avro.AvroTypeException: Expected record-end. Got FIELD_NAME
>   at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:698)
>
> {
>   "items": [
>     { "name": "dallas", "state": "TX", "country": "USA" }
>   ],
>   "firstname": "fname",
>   "lastname": "lname"
> }
> {code}
> If there is any additional element in a record inside an array type, it 
> should behave the same way as for a normal record: the decoder should just 
> discard the extra fields and decode the JSON data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (AVRO-2137) avro JsonDecoding additional field in array type

2019-02-28 Thread Zoltan Farkas (JIRA)


[ 
https://issues.apache.org/jira/browse/AVRO-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780913#comment-16780913
 ] 

Zoltan Farkas edited comment on AVRO-2137 at 2/28/19 9:06 PM:
--

I am not seeing this in my fork (https://github.com/zolyfarkas/avro). Can you 
please review my attempt to reproduce the issue?

{code}

package org.apache.avro.io;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.junit.Test;

public class JsonDecoderTest {

  // Schema from the issue description: a record holding an array of "items" records.
  private static final String SCHEMA
      = "{\n" +
      "  \"type\": \"record\",\n" +
      "  \"name\": \"test\",\n" +
      "  \"namespace\": \"test.name\",\n" +
      "  \"fields\": [\n" +
      "{\n" +
      "  \"name\": \"items\",\n" +
      "  \"type\": {\n" +
      "\"type\": \"array\",\n" +
      "\"items\": {\n" +
      "  \"type\": \"record\",\n" +
      "  \"name\": \"items\",\n" +
      "  \"fields\": [\n" +
      "{\n" +
      "  \"name\": \"name\",\n" +
      "  \"type\": \"string\"\n" +
      "},\n" +
      "{\n" +
      "  \"name\": \"state\",\n" +
      "  \"type\": \"string\"\n" +
      "}\n" +
      "  ]\n" +
      "}\n" +
      "  }\n" +
      "},\n" +
      "{\n" +
      "  \"name\": \"firstname\",\n" +
      "  \"type\": \"string\"\n" +
      "}\n" +
      "  ]\n" +
      "}";

  // Scenario 3 input: "country" is not declared in the items record,
  // and "lastname" is not declared in the root record.
  private static final String TEST_DATA = "{ \"items\": [\n" +
      "\n" +
      "{ \"name\": \"dallas\", \"state\": \"TX\", \"country\":\"USA\" }\n" +
      "\n" +
      "], \"firstname\":\"fname\", \"lastname\":\"lname\" }";

  @Test
  public void testDecoding() throws IOException {
    Schema writerSchema = new Schema.Parser().parse(SCHEMA);
    // Reader and writer schemas are identical, so no schema resolution is involved.
    Schema readerSchema = writerSchema;
    ByteArrayInputStream bis =
        new ByteArrayInputStream(TEST_DATA.getBytes(StandardCharsets.UTF_8));
    Decoder decoder = DecoderFactory.get().jsonDecoder(writerSchema, bis);
    GenericDatumReader<GenericRecord> reader =
        new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);
    // Renamed from testData so the local no longer shadows the input constant.
    GenericRecord decoded = reader.read(null, decoder);
    System.out.println(decoded);
  }

}
{code}

This might be caused by the fact that my fork contains the fix for: 
https://issues.apache.org/jira/browse/AVRO-2057...
or by my attempt to reproduce being broken...



> avro JsonDecoding additional field in array type
> 
>
> Key: AVRO-2137
> URL: https://issues.apache.org/jira/browse/AVRO-2137
> Project: Apache Avro
>  Issue Type: Bug
>