[ 
https://issues.apache.org/jira/browse/KAFKA-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269148#comment-15269148
 ] 

Ewen Cheslack-Postava commented on KAFKA-3209:
----------------------------------------------

To help clarify [~Skandragon]'s comment a bit, the idea is that the records are 
going to be small compared to the headers. This means that the approach we 
might normally suggest -- doing the flatMap transformation with an application 
or stream processor, storing that data back to Kafka, then using Connect to 
store the data to another system -- will have very high overhead.

Whereas most of the message transforms we've discussed so far are either simple 
map() or filter() transformations, this is a case where we might want to 
generate multiple output messages from a single input message. The API for 
supporting this is obviously straightforward -- just support returning a list 
of messages from the transformation instead of a single message. However, I 
think the main challenge is that either message offsets are no longer unique, 
or we'd need to extend the concept of an offset to account for "sub-messages".
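
As a rough illustration of the list-returning shape and the offset problem, 
here is a minimal sketch. All of the names in it (Message, FlatMapTransform, 
SplitOnComma, subIndex) are hypothetical and are not part of Kafka Connect's 
API. It assumes the "sub-message" option, where each output keeps the source 
offset and adds an index to stay unique.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message type for illustration only; not a Kafka Connect class.
final class Message {
    final long offset;   // offset of the source message in its partition
    final int subIndex;  // assumed extension: position among outputs sharing that offset
    final String value;

    Message(long offset, int subIndex, String value) {
        this.offset = offset;
        this.subIndex = subIndex;
        this.value = value;
    }
}

// Hypothetical transform interface: one input message, zero or more outputs.
interface FlatMapTransform {
    List<Message> apply(Message input);
}

// Example transform: splits a comma-separated value into one message per field.
final class SplitOnComma implements FlatMapTransform {
    @Override
    public List<Message> apply(Message input) {
        List<Message> out = new ArrayList<>();
        String[] parts = input.value.split(",");
        for (int i = 0; i < parts.length; i++) {
            // Every output shares input.offset; only (offset, subIndex) is unique.
            out.add(new Message(input.offset, i, parts[i]));
        }
        return out;
    }
}
```

With this shape, committing progress after a partial write would have to track 
the (offset, subIndex) pair rather than the offset alone, which is the 
extension to the offset concept mentioned above.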

> Support single message transforms in Kafka Connect
> --------------------------------------------------
>
>                 Key: KAFKA-3209
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3209
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Neha Narkhede
>
> Users should be able to perform light transformations on messages between a 
> connector and Kafka. This is needed because some transformations must be 
> performed before the data hits Kafka (e.g. filtering certain types of events 
> or PII filtering). It's also useful for very light, single-message 
> modifications that are easier to perform inline with the data import/export.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
