[
https://issues.apache.org/jira/browse/AVRO-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050739#comment-14050739
]
Doug Cutting commented on AVRO-1533:
------------------------------------
It won't generate runtime errors for invalid UTF-8, but instead replaces
erroneous sequences with the replacement character "�" (U+FFFD):
http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#String(byte[],%20java.nio.charset.Charset)
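As a rough illustration of the behavior described above (a standalone sketch
using only the JDK, not code from the AVRO-1533 patch; the class and method
names are hypothetical), the String constructor substitutes U+FFFD for a
malformed byte, while a CharsetDecoder configured to report errors rejects the
same input:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Replacement {
    // Lenient decoding: malformed sequences become U+FFFD, no exception.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict check: returns false if the bytes are not well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] invalid = {'a', (byte) 0xFF, 'b'};
        String s = decodeLenient(invalid);
        System.out.println(s.length());              // 3
        System.out.println(s.charAt(1) == '\uFFFD'); // true
        System.out.println(isValidUtf8(invalid));    // false
    }
}
```

An application that needs to detect bad input rather than silently accept the
replacement character could run a strict check like isValidUtf8 before
treating bytes data as a string.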
I think this can be considered a compatible change, since it won't break existing
applications. Today, attempts to switch a field from bytes to string would
fail. I suppose an application could currently rely on such failures, but I
consider that unlikely enough that I'm willing to ignore it. Do others
disagree?
We could:
# revert this change entirely, declaring it incompatible
# revert just the change to the specification, so that Avro Java is more
lenient in what conversions it permits than the specification (following
Postel's law)
# file issues to update the AVRO-1315 schema validation to permit such
conversions
- also file issues for C, C++ and C# to update their schema resolution to
support these conversions
Thoughts?
> permit promotions between string and bytes
> ------------------------------------------
>
> Key: AVRO-1533
> URL: https://issues.apache.org/jira/browse/AVRO-1533
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Doug Cutting
> Assignee: Doug Cutting
> Fix For: 1.7.7
>
> Attachments: AVRO-1533.patch, AVRO-1533.patch
>
>
> Avro strings are a subset of bytes, so promoting from string to bytes is
> lossless and should be possible. Promotion from bytes to strings may cause
> problems, as not all byte strings are valid UTF8, but it also might be useful.