Nicholas Johnson created BEAM-13987:
---------------------------------------

             Summary: Adding Regex based subscriptions to KafkaIo Dynamic Read. 
                 Key: BEAM-13987
                 URL: https://issues.apache.org/jira/browse/BEAM-13987
             Project: Beam
          Issue Type: Improvement
          Components: io-java-kafka
            Reporter: Nicholas Johnson


In https://issues.apache.org/jira/browse/BEAM-11325 the ability for Kafka to 
read from topics dynamically was added.

Along with this change, the ability to use regex to subscribe to topics in a 
dynamic way was discussed in the design for this change. 

[https://docs.google.com/document/d/1FU3GxVRetHPLVizP3Mdv6mP5tpjZ3fd99qNjUI5DT5k/edit?disco=AAAALGbMoak]

Pointing out the idea that subscribing to all topics in a cluster isn't 
particularly useful for most kafka users, where pattern based subscription is a 
very common pattern of use.

[https://kafka.apache.org/25/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe-java.util.regex.Pattern-]

[https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/WatchKafkaTopicPartitionDoFn.java#L113]

The current implementation does not utilize a pattern in anyway to subscribe to 
topics. The comments on the document mention piggy backing off the existing 
functionality in the KafkaConsumers subscribe method.

However, piggy backing on the existing consumer method is made difficult by the 
per partition subscription method used by beam.

But I believe a simple solution exists, 

[https://github.com/apache/beam/blob/5345834a86c422347556e0bce7e5dd20e4854e44/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1183]

As apart of the with dynamic read method, allow the option to pass a Pattern.

As the watchkafkatopicpartitiondofn does now, call listTopics, and then match 
against the list of topics using the supplied pattern. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to