Re: Moving data from one cluster to another with Kafka Streams

2018-02-04 Thread Geoffrey Holmes

> Kafka Streams only works with a single cluster.

Ok, that’s what I was thinking after I looked at it more.

> Thus, you would need to either transform the data first and replicate
> the output topic to the target cluster, or replicate first and transform
> within the target cluster.

I don’t control the source cluster, so the second is the only possibility.

> Note, for the "intermediate" topic you need, you can set a low retention
> time to reduce storage footprint, as it only acts as a temporary topic
> and the actual data is safely stored in the original source and final
> target topic.

Makes sense.

> As an alternative, you might want to check out
> "single-message-transforms" (SMT) using Kafka Connect. Those allow you
> to do simple transformation on the fly while copying data around. If you
> don't need advanced transformations like aggregations or joins, SMT
> might be sufficient and you don't need to use Kafka Streams.

I started to look into that. The transformation I need to do is really
lightweight. What I don’t quite understand is: do I have to create source and
sink connectors to get records from the one Kafka topic and write the
transformed records to the destination Kafka topic? Or can I write a consumer
and a producer and incorporate SMTs that way?

> -Matthias

>> On 2/2/18 2:54 PM, Geoffrey Holmes wrote:
>> I need to get messages from a topic in one Kafka cluster, transform the
>> message payload, and put the messages into topics in another Kafka cluster.
>> Is it possible to do this with Kafka Streams? I don’t see how I can configure
>> the stream to use one cluster for the input and another cluster for output.



Re: Moving data from one cluster to another with Kafka Streams

2018-02-02 Thread Matthias J. Sax

Kafka Streams only works with a single cluster.

Thus, you would need to either transform the data first and replicate
the output topic to the target cluster, or replicate first and transform
within the target cluster.

Note, for the "intermediate" topic you need, you can set a low retention
time to reduce storage footprint, as it only acts as a temporary topic
and the actual data is safely stored in the original source and final
target topic.
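
As a rough illustration of the low-retention idea, the intermediate topic
could carry a topic-level override like the one below. This is only a sketch:
the one-hour value is an assumed example, not a recommendation, and it must
stay comfortably longer than the time the transforming application may lag
or be offline, otherwise records can be deleted before they are processed.

```
# Topic-level overrides for the intermediate topic (values are assumptions)
# Keep records for one hour (3,600,000 ms) instead of the broker default.
retention.ms=3600000
# Delete old segments rather than compacting them (this is the default policy).
cleanup.policy=delete
```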

As an alternative, you might want to check out
"single-message-transforms" (SMT) using Kafka Connect. Those allow you
to do simple transformation on the fly while copying data around. If you
don't need advanced transformations like aggregations or joins, SMT
might be sufficient and you don't need to use Kafka Streams.
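
To give an idea of what an SMT looks like, here is a minimal sketch of a
connector configuration with one transform attached. The connector name, the
`route` alias, and the `copied-` prefix are made up for illustration, and the
`connector.class` line is deliberately left as a placeholder because the
choice of Kafka-to-Kafka connector depends on your setup. `RegexRouter` is one
of the transforms shipped with Kafka Connect; it only renames the destination
topic, so a payload change would need a different built-in transform or a
custom class implementing
`org.apache.kafka.connect.transforms.Transformation`.

```
# Hypothetical connector that copies records between clusters
name=cross-cluster-copy
# Placeholder: substitute the connector class you actually deploy
connector.class=<your Kafka-to-Kafka connector class>

# Comma-separated list of transform aliases, applied in order
transforms=route
# Built-in SMT that rewrites the destination topic name
transforms.route.type=org.apache.kafka.connect.transforms.RegexRouter
transforms.route.regex=(.*)
transforms.route.replacement=copied-$1
```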


-Matthias

On 2/2/18 2:54 PM, Geoffrey Holmes wrote:
> I need to get messages from a topic in one Kafka cluster, transform the 
> message payload, and put the messages into topics in another Kafka cluster. 
> Is it possible to do this with Kafka Streams? I don’t see how I can configure 
> the stream to use one cluster for the input and another cluster for output.
> 
> 


