[ https://issues.apache.org/jira/browse/GOBBLIN-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhixiong Chen updated GOBBLIN-238:
----------------------------------
Description:
h3. Why
The current implementation of EnvelopeSchemaConverter has two flaws:
- It assumes the payload schema field is at the top level of the envelope.
- The output record is the payload with its schema applied, but the output
schema is a String.
To address these issues and improve envelope schema conversion, this task
implements two types of EnvelopeSchemaConverter: EnvelopePayloadConverter and
EnvelopeSchemaDecorator.
h3. EnvelopePayloadConverter
Given an envelope record, the output schema is the latest payload schema
fetched from a Kafka schema registry. The output record is the payload
deserialized with that latest schema.
h3. EnvelopeSchemaDecorator
Given an envelope record, the output schema is the input schema with the
payload field replaced by the latest schema fetched from a Kafka schema
registry; all other fields are kept as they are in the input schema. In the
output record, the payload is the object deserialized with the latest schema,
and all other fields are copied unchanged from the input record.
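The difference between the two converters can be illustrated with a minimal sketch. It is not the Gobblin implementation: it models records as plain maps instead of Avro records, the field name "payload" is a hypothetical choice, and the decoder function stands in for deserialization with the latest registry schema.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class EnvelopeSketch {
    // Hypothetical field name; the issue does not say what the payload field is called.
    static final String PAYLOAD_FIELD = "payload";

    // EnvelopePayloadConverter-style: the output is only the deserialized payload.
    static Object payloadOnly(Map<String, Object> envelope, Function<byte[], Object> decoder) {
        return decoder.apply((byte[]) envelope.get(PAYLOAD_FIELD));
    }

    // EnvelopeSchemaDecorator-style: keep every field and replace the payload
    // bytes with the deserialized object. The input map is not modified.
    static Map<String, Object> decorate(Map<String, Object> envelope, Function<byte[], Object> decoder) {
        Map<String, Object> out = new LinkedHashMap<>(envelope);
        out.put(PAYLOAD_FIELD, decoder.apply((byte[]) envelope.get(PAYLOAD_FIELD)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> envelope = new LinkedHashMap<>();
        envelope.put("timestamp", 1L);
        envelope.put(PAYLOAD_FIELD, "hello".getBytes(StandardCharsets.UTF_8));

        // Stand-in decoder: a real converter would deserialize with the latest schema.
        Function<byte[], Object> decoder = b -> new String(b, StandardCharsets.UTF_8);

        System.out.println(payloadOnly(envelope, decoder));
        System.out.println(decorate(envelope, decoder));
    }
}
```

In this sketch the first method discards the envelope entirely, while the second preserves the envelope fields around the decoded payload.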
h3. Configurations
One configuration property must be set for either converter to work. It has
no default value.
{code:java}
// The topic to fetch the latest schema of the payload from a kafka registry
converter.envelopeSchemaConverter.payloadSchemaTopic=
{code}
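Because the property has no default, a job would typically fail fast when it is missing. The following is a sketch of such a check using only java.util.Properties; the class name, method name, error message, and the topic value "example-topic" are hypothetical and not taken from Gobblin.

```java
import java.util.Properties;

public class PayloadTopicConfig {
    // Property key as given in the issue description.
    static final String PAYLOAD_SCHEMA_TOPIC =
        "converter.envelopeSchemaConverter.payloadSchemaTopic";

    // Returns the configured topic, or throws if it is absent or blank.
    static String requirePayloadSchemaTopic(Properties props) {
        String topic = props.getProperty(PAYLOAD_SCHEMA_TOPIC);
        if (topic == null || topic.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "Missing required property: " + PAYLOAD_SCHEMA_TOPIC);
        }
        return topic.trim();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(PAYLOAD_SCHEMA_TOPIC, "example-topic"); // hypothetical topic
        System.out.println(requirePayloadSchemaTopic(props));
    }
}
```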
The converters also support a nested schema id field, given as a dot-separated
path:
{code:java}
converter.envelopeSchemaConverter.schemaIdField="metadata.payloadSchemaId"
{code}
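One way a dotted path such as "metadata.payloadSchemaId" could be resolved against a nested record is sketched below. This is an illustration over plain maps, not the Gobblin code; the helper name and the sample id value are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class NestedFieldLookup {
    // Walks a dot-separated path through nested maps.
    // Returns null if any segment is missing or is not a map.
    static Object resolve(Map<String, Object> record, String dottedPath) {
        Object current = record;
        for (String part : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("payloadSchemaId", "abc123"); // hypothetical schema id

        Map<String, Object> envelope = new HashMap<>();
        envelope.put("metadata", metadata);

        System.out.println(resolve(envelope, "metadata.payloadSchemaId"));
    }
}
```

A path with a single segment degenerates to a top-level lookup, so the same helper covers both the flat and the nested layout.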
was:
The current implementation of EnvelopeSchemaConverter has several flaws:
- Assumes top level payload schema field
- Output record is the schema'ed payload but output schema is a String
The task implements two types of EnvelopeSchemaConverter:
EnvelopePayloadConverter and
{code:java}
converter.envelopeSchemaConverter.schemaIdField="metadata.payloadSchemaId"
{code}
> Implement EnvelopePayloadConverter and EnvelopeSchemaDecorator
> --------------------------------------------------------------
>
> Key: GOBBLIN-238
> URL: https://issues.apache.org/jira/browse/GOBBLIN-238
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Zhixiong Chen
> Assignee: Zhixiong Chen
> Labels: Core:Converter