[ 
https://issues.apache.org/jira/browse/AVRO-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513391#comment-15513391
 ] 

Zoltan Farkas commented on AVRO-1582:
-------------------------------------

Hi Sean, here is an update from my side. I am currently still stuck trying to 
get AVRO-1723 in (I am working on Ryan's suggestions, and he should get some 
code to review soon), after which I was planning to tackle this JIRA.

I will provide some detail on the implementation in case somebody wants to work 
on this.

My implementation is currently:
https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/specific/ExtendedSpecificDatumWriter.java
https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/reflect/ExtendedReflectDatumWriter.java
https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/ExtendedGenericDatumWriter.java
https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/ExtendedJsonDecoder.java
https://github.com/zolyfarkas/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/io/ExtendedJsonEncoder.java

Here is what needs to be considered:

1) The current implementation does two things: a) it optimizes union {null, 
something}, and b) it omits/infers fields that are equal to their default 
values. Point b) is very useful in the world that uses schemas, because it 
reduces the size of the payload. But I can see issues for the schema-less 
crowd, who need the fields because they don't have the schema, which is why 
some people suggested separating a) from b).
2) I still need to move over unit tests that I have outside of the library.
3) There is more potential for improvement here. For example, union {null, 
int, string} and union {double, record} can also be jsonized better. This is 
on my todo list and will be in my implementation sometime in the next 6 
months, which might change the approach the current implementation takes.
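
To make points a) and b) concrete, here is a minimal Python sketch of the two 
transformations applied to plain dictionaries. It is not the code in the 
Extended* classes linked above and it does not use the Avro library; the 
`compact_record` helper, the sample schema, and the simplified schema handling 
are assumptions made only for illustration.

```python
import json


def compact_record(schema, record):
    """Return a compact JSON-ready dict for `record` (hypothetical helper).

    `record` is expected in the standard Avro JSON encoding, where a
    non-null union value is wrapped as {"branch type": value}.
    """
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        value = record[name]
        # a) unwrap ["null", X] unions: {"X": v} becomes v, null stays null
        if (isinstance(ftype, list) and len(ftype) == 2 and "null" in ftype
                and isinstance(value, dict) and len(value) == 1):
            value = next(iter(value.values()))
        # b) omit fields whose value equals the schema default
        if "default" in field and value == field["default"]:
            continue
        out[name] = value
    return out


# Sample schema and record, invented for this example.
schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "nickname", "type": ["null", "string"], "default": None},
        {"name": "email", "type": ["null", "string"], "default": None},
        {"name": "retries", "type": "int", "default": 3},
    ],
}
standard = {
    "name": "Ann",
    "nickname": {"string": "annie"},
    "email": None,
    "retries": 3,
}
compact = compact_record(schema, standard)
print(json.dumps(compact))
```

In this sketch a consumer without the schema would never see `email` or 
`retries` in the compact form, which is the concern behind separating a) 
from b).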

Unfortunately my time available for this is limited, and since our use cases 
are covered in the fork we use, this is currently low priority on my list.

> Json serialization of nullable fileds and fields with default values 
> improvement.
> ---------------------------------------------------------------------------------
>
>                 Key: AVRO-1582
>                 URL: https://issues.apache.org/jira/browse/AVRO-1582
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.0
>            Reporter: Zoltan Farkas
>         Attachments: AVRO-1582-PATCH
>
>
> Currently serializing a nullable field of type union like:
> "type" : ["null","some type"]
> when serialized as JSON results in:  
> "field":{"some type":"value"}
> when it could be:
> "field":"value"
> Also, fields that equal the default value can be omitted from the 
> serialized data. This is possible because the reader will have the writer's 
> schema and can infer the field values. This reduces the size of the json 
> messages.
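
The reader-side inference described in the issue can be sketched as follows, 
assuming the reader has the writer's schema available as a parsed dictionary. 
This is an illustration on plain Python dictionaries, not the Avro API; the 
`fill_defaults` helper and the sample schema are hypothetical.

```python
import json


def fill_defaults(writer_schema, compact):
    """Rebuild a full record from compact JSON (hypothetical helper)."""
    full = {}
    for field in writer_schema["fields"]:
        name = field["name"]
        if name in compact:
            full[name] = compact[name]
        elif "default" in field:
            # The writer omitted this field: infer it from the schema default.
            full[name] = field["default"]
        else:
            raise ValueError(f"missing field without default: {name}")
    return full


# Sample writer schema, invented for this example.
writer_schema = {
    "type": "record",
    "name": "Example",
    "fields": [
        {"name": "field", "type": ["null", "string"], "default": None},
        {"name": "count", "type": "int", "default": 0},
    ],
}
payload = '{"field": "value"}'
record = fill_defaults(writer_schema, json.loads(payload))
print(record)
```

Without the writer's schema the omitted `count` field cannot be recovered, 
so this only works for readers that have it.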



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
