eolivelli opened a new issue #9844: URL: https://github.com/apache/pulsar/issues/9844
We have the `KeyValue` schema, which supports a generic key-value model in which both the key and the value have a schema. When you are dealing with structured data types, you currently tend to use `Sink<GenericRecord>` with the `AUTO_CONSUME` schema; this way you can automatically handle any supported form of data structure. But if you use `AUTO_CONSUME`, you cannot consume `KeyValue` records.

**Describe the solution you'd like**
I would like to see a way to use `AUTO_CONSUME` so that, for a `KeyValue` schema, it passes a special `GenericRecord` instance with two fields:
- key
- value

`GenericRecord` already supports nested data structures, so it is possible to set the schema for the key field and for the value field.
Advanced processors that can handle nested structures will benefit from this new feature: they will automatically be able to deal with `KeyValue` without changes and in a consistent way, that is, by dealing only with `GenericRecord`, which is the generic key-value dictionary we have in Pulsar.

**Describe alternatives you've considered**
Modifying all of the connectors to deal with both `KeyValue` and `GenericRecord`, but this would be a big effort. Also, currently (2.7.x) you cannot have a Sink that deals with two separate data types (the user must explicitly set a "classname").

**Additional context**
I have implementations of Sinks that deal with generic data structures and allow the user to transform/map the data before writing to the external system.
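To illustrate the proposed shape, here is a minimal, self-contained sketch in plain Java. It does not use the Pulsar API: `SimpleRecord`, `wrap`, and `resolve` are hypothetical names invented for this example, and `SimpleRecord` only mimics a generic record that looks up fields by name. The sketch shows how a consumer that understands only nested generic records could read both parts of a key-value pair once they are exposed as the fields `key` and `value`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyValueAsRecord {

    // Hypothetical stand-in for a generic record: a dictionary of named fields.
    // This is NOT Pulsar's GenericRecord; it only mimics a lookup by field name.
    static final class SimpleRecord {
        private final Map<String, Object> fields = new LinkedHashMap<>();

        SimpleRecord set(String name, Object value) {
            fields.put(name, value);
            return this;
        }

        Object getField(String name) {
            return fields.get(name);
        }
    }

    // Wraps a key record and a value record in an outer record with exactly
    // two fields, "key" and "value", as the proposal describes.
    static SimpleRecord wrap(SimpleRecord key, SimpleRecord value) {
        return new SimpleRecord().set("key", key).set("value", value);
    }

    // A sink-like helper that only knows about generic records: it walks a
    // dotted path such as "value.name" through nested records.
    static Object resolve(SimpleRecord record, String path) {
        Object current = record;
        for (String part : path.split("\\.")) {
            if (!(current instanceof SimpleRecord)) {
                return null; // the path goes deeper than the data
            }
            current = ((SimpleRecord) current).getField(part);
        }
        return current;
    }

    public static void main(String[] args) {
        SimpleRecord key = new SimpleRecord().set("id", 42);
        SimpleRecord value = new SimpleRecord().set("name", "alice");
        SimpleRecord wrapped = wrap(key, value);

        System.out.println(resolve(wrapped, "key.id"));     // prints 42
        System.out.println(resolve(wrapped, "value.name")); // prints alice
    }
}
```

The point of the design is that `resolve` needs no special case for key-value data: the pair is just one more level of nesting, which is the consistency the request is after.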
