[
https://issues.apache.org/jira/browse/FLINK-33058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765669#comment-17765669
]
Ryan Skraba commented on FLINK-33058:
-------------------------------------
Hello! I can review this, but in my experience, JSON-encoded Avro is good only
for debugging or human-readable "previews" of data. I haven't run across the
use of the JSON encoding in production (it's typically larger and slower, and
the temptation to use JSON tools on it is counterproductive!). In my opinion,
this has proven to be especially true for persistent messages.
If I can ask: what are the circumstances where a user would choose Avro, but
want something other than the binary encoding?
> Support for JSON-encoded Avro
> -----------------------------
>
> Key: FLINK-33058
> URL: https://issues.apache.org/jira/browse/FLINK-33058
> Project: Flink
> Issue Type: Improvement
> Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
> Reporter: Dale Lane
> Priority: Minor
> Labels: avro, flink, flink-formats, pull-request-available
>
> Avro supports two serialization encoding methods: binary and JSON
> cf. [https://avro.apache.org/docs/1.11.1/specification/#encodings]
> flink-avro currently has a hard-coded assumption that Avro data is
> binary-encoded (and cannot process Avro data that has been JSON-encoded).
> I propose adding a new optional format option to flink-avro: *avro.encoding*
> It will support two options: 'binary' and 'json'.
> If unset, it will default to 'binary' to maintain compatibility/consistency
> with current behaviour.
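
To make the difference between the two encodings concrete, here is a minimal
sketch that hand-encodes one record both ways, following the rules in the Avro
specification linked above (zig-zag varints for int/long, length-prefixed UTF-8
for strings, record fields concatenated in schema order). The "User" record
with a string "name" and an int "age" is a hypothetical example, not something
from this issue, and the code deliberately avoids the Avro library, so it
illustrates the wire formats rather than what flink-avro or Avro's own
encoders emit byte for byte.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode an integer the way Avro encodes int/long values."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negatives become small positives
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def avro_string(s: str) -> bytes:
    """Avro binary string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_binary(user: dict) -> bytes:
    """Binary encoding of the hypothetical User record.

    Fields are written in schema order with no names or delimiters,
    so the reader needs the schema to make sense of the bytes.
    """
    return avro_string(user["name"]) + zigzag_varint(user["age"])


def encode_json(user: dict) -> bytes:
    """JSON encoding of the same record: a JSON object keyed by field name."""
    return json.dumps(user, separators=(",", ":")).encode("utf-8")


user = {"name": "Ada", "age": 36}
binary = encode_binary(user)
as_json = encode_json(user)

print(binary)        # b'\x06AdaH'
print(as_json)       # b'{"name":"Ada","age":36}'
print(len(binary), len(as_json))  # 5 23
```

Even for this tiny record the JSON form repeats the field names in every
message, which is where most of the size difference comes from.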