[ 
https://issues.apache.org/jira/browse/AVRO-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16726953#comment-16726953
 ] 

Martin Jubelgas commented on AVRO-2160:
---------------------------------------

Hi, Lydie,

while I cannot quite make sense of your input data "str str4", I assume that 
the problem you describe is a mistake I've seen other people make. When 
instantiating a JSON decoder, you need to supply the schema that the data was 
WRITTEN with, not the schema you want to read the data with. The default value 
(null) is used only when the field does not exist in the writer's schema but 
does exist in the reader's. If your input data does not contain the field 
"lastname", then the writer schema needs to reflect that. In Avro, there is no 
such thing as "non-required fields". There are fields of type 
"union {null, something}", but, if I am not mistaken, those still need to be 
given a value explicitly in the data when using the GenericDatumReader.

If you want to read JSON data with "non-required fields", you will need to 
write your own reader (though that's not too hard). Such a reader gives up 
schema evolution but can be more flexible in how it handles 
"non-required"/defaultable fields.

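A minimal sketch of one way to approach this, assuming the JSON has already 
been parsed into a Map by whatever JSON library you use. The class 
NullableFieldFiller and the method fillMissing are hypothetical names, not part 
of Avro. The idea is only to add an explicit null for every nullable field the 
input omits, so that the record carries every field the schema names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; this is NOT an Avro API.
final class NullableFieldFiller {

    // Returns a copy of `record` in which every name listed in
    // `nullableFields` that is absent from the input is added with an
    // explicit null value. Fields already present are left unchanged,
    // and the input map itself is not modified.
    static Map<String, Object> fillMissing(Map<String, Object> record,
                                           List<String> nullableFields) {
        Map<String, Object> filled = new LinkedHashMap<>(record);
        for (String field : nullableFields) {
            if (!filled.containsKey(field)) {
                filled.put(field, null);
            }
        }
        return filled;
    }

    public static void main(String[] args) {
        // Stand-in for the parsed input; "lastname" is missing.
        Map<String, Object> input = new LinkedHashMap<>();
        input.put("firstname", "John");
        input.put("age", 35);

        Map<String, Object> filled =
                fillMissing(input, Arrays.asList("lastname", "age"));
        System.out.println(filled);
    }
}
```

For the record from this ticket, the call adds "lastname" with a null value and 
leaves "firstname" and "age" untouched. In Avro's JSON encoding, the null 
branch of a union is written as a plain null, while a non-null branch is 
wrapped as in {"string": "Doe"}.
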
That said, I'd say the behaviour you describe is not a bug, so I'd suggest 
closing this ticket.

Regards,

Martin

> Json to Avro with non required value and union schema failing
> -------------------------------------------------------------
>
>                 Key: AVRO-2160
>                 URL: https://issues.apache.org/jira/browse/AVRO-2160
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.8.2
>            Reporter: Lydie
>            Priority: Critical
>              Labels: java
>
> I am trying to convert this string:
> str str4
> using this schema:
> {"type":"record", 
> "namespace":"foo","name":"Person","fields":[{"name":"lastname","type": 
> ["null","string"], "default":null}
> ,{"name":"firstname","type":"string"},{"name":"age","type":["null","int"], 
> "default":null}]}
> I get this error 
> {color:#ff0000}com.syapse.messagePublisher.publisher.AvroEncodeException: 
> Expected field name not found: 
> lastnamein{"firstname":"John","age":{"int":35}}{color}at 
> com.syapse.messagePublisher.publisher.AvroEncoder.convertJsonToAvro(AvroEncoder.java:78)
>  
> Although this should be the correct syntax for a non-required field.
> Note that it works for 
> {"lastname":{"string" : "Doe"}
> ,"firstname":"John","age":{"int":36}}
>  
> What am I missing (using Avro 1.8.2)?
> Here is my code:
>  
> {code:java}
> public static byte[] convertJsonToAvro(byte[] data, String schemaStr)
>         throws AvroEncodeException {
>     InputStream input = null;
>     DataFileWriter<GenericRecord> writer = null;
>     ByteArrayOutputStream output = null;
>     try {
>         Schema schema = new Schema.Parser().parse(schemaStr);
>         DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>(schema);
>         input = new ByteArrayInputStream(data);
>         DataInputStream din = new DataInputStream(input);
>         output = new ByteArrayOutputStream();
>         writer = new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>());
>         writer.create(schema, output);
>         Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
>         GenericRecord datum = null;
>         while (true) {
>             try {
>                 datum = reader.read(null, decoder);
>             } catch (EOFException eofe) {
>                 break;
>             }
>             writer.append(datum);
>         }
>         writer.flush();
>         writer.close();
>         return output.toByteArray();
>     } catch (AvroTypeException e) {
>         throw new AvroEncodeException(e.getMessage() + "in" + new String(data));
>     } catch (IOException e1) {
>         throw new AvroEncodeException("Error decoding Json " + e1.getMessage());
>     } finally {
>         try {
>             input.close();
>         } catch (Exception e) {
>         }
>     }
> }
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
