[ 
https://issues.apache.org/jira/browse/AVRO-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15362853#comment-15362853
 ] 

Sean Busbey commented on AVRO-1875:
-----------------------------------

{quote}
The reason I want Base64: as of now, byte arrays in JSON are 
encoded in Latin-1, which can't be displayed as UTF-8. According to the 
JSON specification (https://tools.ietf.org/html/rfc7159#section-8.1), JSON 
should be encoded in UTF-8, UTF-16, or UTF-32; otherwise it is not valid, which 
causes a lot of problems when you try to parse and work with it using JSON libs. 
{quote}

This sounds like a bug, separate from the question of adding a Base64 encoding. 
We shouldn't be encoding things counter to the JSON spec. I don't think we can 
guarantee that an arbitrary array of bytes is valid UTF-8, so I should 
probably read up on what we're doing right now (which also gets back to my 
prior question about whether we have a spec for the JSON encoding).
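
To make the point about arbitrary bytes concrete, here is a minimal Java sketch. It is not Avro code: the class name BytesAsText and its helper methods are hypothetical, and it uses only the JDK. It shows a byte array that is not well-formed UTF-8, alongside the two text representations under discussion (one char per byte via ISO-8859-1, and Base64).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration class; not part of Avro.
class BytesAsText {
    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Maps each byte to the char with the same value (0..255).
    static String latin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Encodes the bytes using the standard Base64 alphabet (ASCII only).
    static String base64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // 0xC3 must be followed by a continuation byte, and 0xFF never occurs in UTF-8.
        byte[] raw = {(byte) 0xC3, (byte) 0x28, (byte) 0xFF};

        System.out.println(isValidUtf8(raw)); // prints "false"

        // Both text forms round-trip back to the original bytes.
        byte[] viaLatin1 = latin1(raw).getBytes(StandardCharsets.ISO_8859_1);
        byte[] viaBase64 = Base64.getDecoder().decode(base64(raw));
        System.out.println(Arrays.equals(raw, viaLatin1)); // prints "true"
        System.out.println(Arrays.equals(raw, viaBase64)); // prints "true"
    }
}
```

Either mapping turns arbitrary bytes into a string; the open question is how that string is then written out, which is why I want to check what the current encoder actually does.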

{quote}
If you write it as before:
{code}
JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, 
recordInString);
{code}
you will get the old version of JsonDecoder, which uses Latin-1.
To get Base64 with this patch, you need to write this:
{code}
boolean base64 = true;
JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, 
recordInString, base64);
{code}
This boolean variable is a member of the JsonEncoder and JsonDecoder classes and 
has a default value of "false", which means Latin-1 is used.
{quote}

Right, I get how the mechanics of the change work when a developer picks 
what gets used on each side of the serialization. What I'm not clear on is how, 
given just the serialized data, one determines which decoding should be used 
(preferably where "one" is not a human but some application or library).
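
A small Java sketch of the ambiguity I mean, using only the JDK (the class name AmbiguousPayload and its methods are hypothetical, not Avro API). Any Base64 string is also a perfectly good Latin-1 string, so the same serialized value decodes to different bytes depending on which convention the reader assumes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration class; not part of Avro.
class AmbiguousPayload {
    // Reads the string as one byte per char (the current convention per the quote above).
    static byte[] decodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Reads the string as standard Base64 text.
    static byte[] decodeBase64(String s) {
        return Base64.getDecoder().decode(s);
    }

    public static void main(String[] args) {
        // The value of a bytes field as it might appear in serialized JSON.
        String serialized = "AQID";

        System.out.println(Arrays.toString(decodeLatin1(serialized))); // prints "[65, 81, 73, 68]"
        System.out.println(Arrays.toString(decodeBase64(serialized))); // prints "[1, 2, 3]"
    }
}
```

Both results are plausible byte arrays, so nothing in the data itself tells a library which one the writer meant.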

> ability encode/decode byte arrays in Base64
> -------------------------------------------
>
>                 Key: AVRO-1875
>                 URL: https://issues.apache.org/jira/browse/AVRO-1875
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Paul Dudenkov
>             Fix For: 1.9.0
>
>
> Hi, 
> I would like to add the ability to encode/decode byte arrays in Base64 in the 
> JsonEncoder and JsonDecoder classes.
> For this purpose I will add new constructors and overloaded factory methods.
> Old code will work as before.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
