[ https://issues.apache.org/jira/browse/KAFKA-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269148#comment-15269148 ]
Ewen Cheslack-Postava commented on KAFKA-3209:
----------------------------------------------

To help clarify [~Skandragon]'s comment a bit, the idea is that the records are going to be small compared to the headers. This means that the approach we might normally suggest -- doing the flatMap transformation with an application or stream processor, storing that data back to Kafka, then using Connect to store the data to another system -- will have very high overhead.

Whereas most of the message transforms we've discussed so far are either simple map() or filter() transformations, this is a case where we might want to generate multiple output messages from a single input message. The API for supporting this is obviously straightforward -- just support returning a list of messages from the transformation instead of a single message. However, I think the main challenge is that message offsets either aren't unique anymore or we'd need to extend the concept of offset to account for "sub-messages".

> Support single message transforms in Kafka Connect
> --------------------------------------------------
>
>                 Key: KAFKA-3209
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3209
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Neha Narkhede
>
> Users should be able to perform light transformations on messages between a
> connector and Kafka. This is needed because some transformations must be
> performed before the data hits Kafka (e.g. filtering certain types of events
> or PII filtering). It's also useful for very light, single-message
> modifications that are easier to perform inline with the data import/export.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
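
As a rough illustration of the one-to-many case raised in the comment, the sketch below shows a transformation that returns a list of messages instead of a single one. It also shows one possible way to keep outputs distinguishable when they all come from the same source offset. Everything here is hypothetical: `Message`, `FlatMapTransform`, `subIndex`, and `SPLIT_ON_COMMA` are illustrative names, not part of the Kafka Connect API, and the sub-index is only one conceivable way to extend the offset concept.

```java
import java.util.ArrayList;
import java.util.List;

class FlatMapTransformSketch {

    /** Hypothetical message type; NOT Kafka Connect's SourceRecord or SinkRecord. */
    static final class Message {
        final long offset;    // offset of the original input message
        final int subIndex;   // assumed extension: position among outputs of one input
        final String value;

        Message(long offset, int subIndex, String value) {
            this.offset = offset;
            this.subIndex = subIndex;
            this.value = value;
        }
    }

    /** Hypothetical one-to-many transform: zero or more outputs per input. */
    interface FlatMapTransform {
        List<Message> apply(Message input);
    }

    /** Example transform: one output message per comma-separated element. */
    static final FlatMapTransform SPLIT_ON_COMMA = input -> {
        List<Message> out = new ArrayList<>();
        String[] parts = input.value.split(",");
        for (int i = 0; i < parts.length; i++) {
            // Every output shares the source offset, so the offset alone is no
            // longer unique; the (offset, subIndex) pair is.
            out.add(new Message(input.offset, i, parts[i].trim()));
        }
        return out;
    };

    public static void main(String[] args) {
        Message in = new Message(42L, 0, "a, b, c");
        for (Message m : SPLIT_ON_COMMA.apply(in)) {
            System.out.println(m.offset + "." + m.subIndex + " -> " + m.value);
        }
    }
}
```

With a plain offset, the three outputs above would be indistinguishable for commit and replay purposes. This is the uniqueness problem the comment points out, and the sub-index stands in for whatever "sub-message" notion an actual design would choose.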