[
https://issues.apache.org/jira/browse/AVRO-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056701#comment-15056701
]
Ryan Blue commented on AVRO-1584:
---------------------------------
It looks like the conversion used for default values is independent of
toString. Callers can pass either a JsonNode, which bypasses the problem, or an
object that gets [converted in
JacksonUtils|https://github.com/apache/avro/blob/trunk/lang/java/avro/src/main/java/org/apache/avro/util/internal/JacksonUtils.java#L73].
That converts a byte array to a string using ISO-8859-1, which correctly
implements the spec. When the JSON is written, the characters that aren't
allowed in JSON strings are escaped by the generator. Changing the output of
toString won't break the case that Doug mentions, but I think it is a fair
point that changing what is currently produced could break applications.
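
For illustration, here is a minimal sketch of the byte-to-character mapping
described above. It assumes only plain JDK classes and is not the JacksonUtils
code itself; the class and method names are hypothetical. Each byte value 0-255
becomes the character with the same code point, so the conversion is lossless
in both directions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1Bytes {
    // Map every byte (0-255) to the character with the same code point.
    static String toLatin1String(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverse mapping: every character in the range 0-255 back to one byte.
    static byte[] fromLatin1String(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Same bytes as in the issue description.
        byte[] data = {0, 31, 65, 66, 67, (byte) 255, (byte) 182};
        String s = toLatin1String(data);
        System.out.println(s.length());                                 // one char per byte
        System.out.println(Arrays.equals(fromLatin1String(s), data));   // round trip is lossless
    }
}
```

The resulting string still contains raw control characters (0x00 and 0x1f
here), so it only becomes valid JSON once the writer escapes them.
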
However, the JSON currently produced by toString is broken because it doesn't
convert control characters to escape sequences (for example, 0x0a to \n). We
could safely fix that problem without moving to base64, and I think we should
do at least that.
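
As a rough sketch of that minimal fix, the escaping could look like the
following. This is a hand-rolled stand-in with hypothetical names, not the
Jackson generator and not a proposed patch. It escapes the quote, the
backslash, and every control character below 0x20, and leaves all other
characters untouched.

```java
public class JsonStringEscaper {
    // Wrap s in double quotes and escape what JSON forbids inside a string:
    // the quote, the backslash, and control characters below 0x20.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // 0x0a becomes the two-character sequence backslash + n.
        System.out.println(quote("a\nb"));
    }
}
```
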
But this still leaves a problem: what do we do about toString not conforming to
the JSON required by the Avro spec?
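
For comparison, this is what the Base64 alternative requested in the issue
would produce for the reporter's example bytes. It is a sketch only, with
hypothetical names, and it assumes java.util.Base64, which requires Java 8 or
later and may not be available on every JDK that Avro 1.7.7 targets.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64Bytes {
    // Encode raw bytes with the standard (RFC 4648) Base64 alphabet, with padding.
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Decode a standard Base64 string back to the original bytes.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Same bytes as in the issue description.
        byte[] data = {0, 31, 65, 66, 67, (byte) 255, (byte) 182};
        String encoded = encode(data);
        System.out.println(encoded);                              // prints AB9BQkP/tg==
        System.out.println(Arrays.equals(decode(encoded), data)); // prints true
    }
}
```
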
> Json output doesn't generate base64 for byte arrays
> ---------------------------------------------------
>
> Key: AVRO-1584
> URL: https://issues.apache.org/jira/browse/AVRO-1584
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.7
> Environment: Pure java.
> Reporter: Christophe Lorenz
> Attachments: AVRO-1584-Jackson-Base64-Default-Variant.patch,
> AVRO-1584.patch
>
>
> The JSON output of Java generated code doesn't correctly encode byte arrays.
> Using this simple schema:
> {"namespace": "example.avro",
> "type": "record",
> "name": "ByteArrayEncoding",
> "fields": [ {"name": "data", "type": "bytes"} ]
> }
> The toString()
> System.out.println(new ByteArrayEncoding(ByteBuffer.wrap(new
> byte[]{0,31,65,66,67,(byte)255,(byte)182})));
> Returns the raw bytes as a string in the JSON:
> {"data": {"bytes": " ABC??"}}
> Because a byte array is not guaranteed to be a valid string, it should be
> converted to and from Base64, as other JSON implementations do:
> {"data": {"bytes": "AB9BQkP/tg=="}}