[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-10-04 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424298#comment-17424298
 ] 

Matthias J. Sax commented on KAFKA-13326:
-

Neither mirror-maker2 nor replicator supports EOS semantics, while Kafka Streams 
does. EOS is based on Kafka's transactions, which don't work cross-cluster.

Both mirror-maker2 and replicator are "simple" copy-data tools, while Kafka 
Streams, which does processing, is much more complicated. For example, Kafka 
Streams might create additional topics to repartition data for processing, or 
create changelog topics for stateful processing. If there are two clusters, 
it's unclear in which cluster those topics should be created. Also, a single 
consumer and producer would no longer be enough; Kafka Streams would need 
multiple clients to read from and write to both clusters. There are also 
complications with consumer group management if the application connects to 
two clusters, as it's unclear which of the two is really in charge 
(split-brain problem).
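
To make the "multiple clients" point concrete, here is a minimal sketch of the 
two separate client configurations that a hand-written cross-cluster 
application would need, in contrast to Kafka Streams, which takes a single 
bootstrap.servers value. The broker addresses, the group id, and the class and 
method names are hypothetical; only the config keys are real Kafka client 
settings. Plain java.util.Properties is used so the sketch runs without the 
Kafka client library.

```java
import java.util.Properties;

public class TwoClusterClientConfigs {
    // Hypothetical broker addresses, for illustration only.
    static final String SOURCE_CLUSTER = "source-broker:9092";
    static final String TARGET_CLUSTER = "target-broker:9092";

    // Settings for a consumer that reads from the source cluster.
    static Properties sourceConsumerConfig() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", SOURCE_CLUSTER);
        // The consumer group is coordinated by the source cluster only.
        p.setProperty("group.id", "cross-cluster-filter");
        // Offsets are committed manually, after the target write succeeded.
        p.setProperty("enable.auto.commit", "false");
        return p;
    }

    // Settings for a producer that writes to the target cluster.
    static Properties targetProducerConfig() {
        Properties p = new Properties();
        p.setProperty("bootstrap.servers", TARGET_CLUSTER);
        p.setProperty("acks", "all");
        // Idempotence de-duplicates producer retries within the target
        // cluster; it does not provide cross-cluster exactly-once.
        p.setProperty("enable.idempotence", "true");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(sourceConsumerConfig());
        System.out.println(targetProducerConfig());
    }
}
```

Each Properties object would be handed to its own client instance, so consumer 
group state lives on one cluster and the written data on the other, with 
nothing coordinating the two.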

While it's not impossible to build, it's rather complex.

> Add multi-cluster support to Kafka Streams
> --
>
> Key: KAFKA-13326
> URL: https://issues.apache.org/jira/browse/KAFKA-13326
> Project: Kafka
> Issue Type: New Feature
> Components: streams
> Reporter: Guangyuan Wang
> Priority: Major
> Labels: needs-kip
>
> Dear Kafka Team,
> According to the documentation at
> https://kafka.apache.org/28/documentation/streams/developer-guide/config-streams.html#bootstrap-servers
> Kafka Streams applications can only communicate with a single Kafka cluster 
> specified by this config value. Future versions of Kafka Streams will support 
> connecting to different Kafka clusters for reading input streams and writing 
> output streams.
> In which version will this feature be added to Kafka Streams? This would be 
> a very valuable feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-09-29 Thread Guangyuan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422484#comment-17422484
 ] 

Guangyuan Wang commented on KAFKA-13326:


[~bchen225242] Thank you very much for the information.
I am a little curious whether mirror-maker2 and other replicators 
face the same challenges.
Why can replicators work well across clusters when Kafka Streams can't?



[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-09-29 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421952#comment-17421952
 ] 

Matthias J. Sax commented on KAFKA-13326:
-

PR to update the web-page: https://github.com/apache/kafka-site/pull/376



[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-09-27 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421107#comment-17421107
 ] 

Boyang Chen commented on KAFKA-13326:
-

[~wangguangyuan] Thanks for your interest. There are some known challenges in 
supporting cross-cluster processing:
 # Exactly-once semantics will break because Kafka cannot do cross-cluster 
transactions.
 # Topic tracking will need to be extended with multi-cluster support in 
mind, which is a significant amount of work.
 # Failure scenarios will become more complicated: today a failure is confined 
to one broker cluster, but with multiple clusters any topic-to-cluster mapping 
could fail and leave the application in a state that is hard to recover from.
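
The first challenge can be illustrated with a small in-memory model. Within 
one cluster, a transaction can make the output write and the input offset 
commit atomic. Across two clusters they are two independent steps, so a crash 
between them causes the record to be written again after restart. The sketch 
below is only a simulation under that assumption: the class, its fields, and 
the crash parameter are hypothetical, and it does not use the Kafka client 
API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class CrossClusterCopySimulation {
    // In-memory stand-ins for the two clusters; a model, not Kafka itself.
    final List<String> sourceTopic;
    final List<String> targetTopic = new ArrayList<>();
    int committedOffset = 0; // consumer position stored on the source cluster

    CrossClusterCopySimulation(List<String> sourceTopic) {
        this.sourceTopic = sourceTopic;
    }

    // Processes records starting at the last committed offset.
    // If crashAfterWriteAt >= 0, the run stops right after handling that
    // offset on the target side and before committing it on the source side.
    void run(Predicate<String> filter, int crashAfterWriteAt) {
        for (int offset = committedOffset; offset < sourceTopic.size(); offset++) {
            String record = sourceTopic.get(offset);
            if (filter.test(record)) {
                targetTopic.add(record);      // step 1: write to the target cluster
            }
            if (offset == crashAfterWriteAt) {
                return;                       // crash: write done, commit missing
            }
            committedOffset = offset + 1;     // step 2: commit on the source cluster
        }
    }

    public static void main(String[] args) {
        CrossClusterCopySimulation sim =
            new CrossClusterCopySimulation(List.of("keep-a", "drop-b", "keep-c"));
        Predicate<String> keepOnly = r -> r.startsWith("keep");
        sim.run(keepOnly, 0);   // crash after writing offset 0
        sim.run(keepOnly, -1);  // restart from the last committed offset
        System.out.println(sim.targetTopic); // prints [keep-a, keep-a, keep-c]
    }
}
```

The duplicated "keep-a" is the at-least-once behavior that replaces 
exactly-once when the write and the commit cannot share one transaction.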

Considering the numerous replicators available on the market, supporting this 
natively in Kafka Streams is not a top priority. [~guozhang] and [~mjsax] could 
give more insight here.



[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-09-27 Thread Guangyuan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421100#comment-17421100
 ] 

Guangyuan Wang commented on KAFKA-13326:


[~mjsax] 
Thank you for your answer.  

Cross-cluster data processing is very important for my product. I'd like to 
use Kafka Streams to filter messages on the way from one cluster to another, 
which would reduce the number of messages that need to be copied. Without 
cross-cluster data processing, I can only use mirror-maker2, which causes a 
large number of messages to be copied from one cluster to another.

May I ask why there is currently no plan to support cross-cluster data 
processing?
I would be glad to contribute this feature if you think it is a good one.

