Alaksiej Ščarbaty created NIFI-15175:
----------------------------------------

             Summary: Escape JSON field names in JsonRecordSetWriter when writing Avro schema
                 Key: NIFI-15175
                 URL: https://issues.apache.org/jira/browse/NIFI-15175
             Project: Apache NiFi
          Issue Type: Improvement
    Affects Versions: 2.6.0
            Reporter: Alaksiej Ščarbaty


When `JsonRecordSetWriter` is configured with `Set 'avro.schema' Attribute` and 
encounters a record field whose name is not a valid [Avro 
name|https://avro.apache.org/docs/1.12.0/specification/#names] (e.g. one that 
contains a hyphen), it fails with an error, even though the field name is 
valid in JSON.
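
To make the failing case concrete, here is a minimal sketch of the name rule from the linked Avro specification (a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`). The class and method names are hypothetical and are not part of NiFi or Avro.

```java
import java.util.regex.Pattern;

public class AvroNameCheck {
    // Name rule as stated in the Avro specification: [A-Za-z_][A-Za-z0-9_]*
    private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Hypothetical helper: true if the name satisfies the Avro name rule.
    static boolean isValidAvroName(String name) {
        return name != null && AVRO_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidAvroName("first_name")); // true
        System.out.println(isValidAvroName("first-name")); // false: hyphen is not allowed
    }
}
```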

This complicates schema usage for JSON-encoded records.

Instead, the field name should be escaped before being written into an Avro 
schema. The original field name can be stored in a field 
[alias|https://avro.apache.org/docs/1.12.0/specification/#aliases].
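
One possible shape for the escaping step is sketched below. It assumes a simple rule (replace every character outside `[A-Za-z0-9_]` with `_`, and prefix `_` when the result is empty or starts with a digit); the issue does not prescribe a rule, so this one is only an illustration. `FieldNameEscapeSketch`, `EscapedField`, `escape` and `toAvroField` are hypothetical names.

```java
import java.util.Collections;
import java.util.List;

public class FieldNameEscapeSketch {
    // Hypothetical holder for what would be written into the Avro schema.
    static final class EscapedField {
        final String name;          // Avro-safe field name
        final List<String> aliases; // original name, if it had to be escaped

        EscapedField(String name, List<String> aliases) {
            this.name = name;
            this.aliases = aliases;
        }
    }

    // Assumed escaping rule: invalid characters become '_', and a leading
    // digit (or an empty result) gets a '_' prefix.
    static String escape(String name) {
        String escaped = name.replaceAll("[^A-Za-z0-9_]", "_");
        if (escaped.isEmpty() || Character.isDigit(escaped.charAt(0))) {
            escaped = "_" + escaped;
        }
        return escaped;
    }

    // Keep the original name as an alias only when escaping changed it.
    static EscapedField toAvroField(String original) {
        String escaped = escape(original);
        if (escaped.equals(original)) {
            return new EscapedField(original, Collections.<String>emptyList());
        }
        return new EscapedField(escaped, Collections.singletonList(original));
    }

    public static void main(String[] args) {
        EscapedField f = toAvroField("first-name");
        System.out.println(f.name + " " + f.aliases); // first_name [first-name]
    }
}
```

Note that a rule like this is not injective: `a-b` and `a_b` both map to `a_b`, so a real implementation would need to handle such collisions.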

During deserialization (e.g. by `JsonTreeReader`), fields with escaped names 
should be mapped back to their original names via the aliases when the 
`RecordSchema` is created.
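
On the read side, one interpretation is that a field carrying an alias is exposed under that alias, so lookups against the unchanged JSON content still match. The sketch below assumes the first alias holds the original name; `AliasRestoreSketch` and `recordFieldName` are hypothetical names, not NiFi API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AliasRestoreSketch {
    // Assumption: when an alias exists, the first one is the original
    // (unescaped) field name and should be used in the record schema.
    static String recordFieldName(String avroName, List<String> aliases) {
        return aliases.isEmpty() ? avroName : aliases.get(0);
    }

    public static void main(String[] args) {
        // Escaped field: the record schema uses the original JSON key.
        System.out.println(recordFieldName("first_name", Arrays.asList("first-name"))); // first-name
        // Field that never needed escaping: name is used as-is.
        System.out.println(recordFieldName("id", Collections.<String>emptyList())); // id
    }
}
```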

*Important: the written FlowFile content must not change.*

_Perhaps, instead of changing individual writers and readers, the escaping logic 
should be introduced in 
[AvroTypeUtil|https://github.com/apache/nifi/blob/main/nifi-extension-bundles/nifi-extension-utils/nifi-record-utils/nifi-avro-record-utils/src/main/java/org/apache/nifi/avro/AvroTypeUtil.java]._



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
