[ 
https://issues.apache.org/jira/browse/BEAM-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655604#comment-16655604
 ] 

Raghu Angadi commented on BEAM-5786:
------------------------------------

Sounds good.

More I think about completely dynamic case, I think it needs to be an SDF to 
handle the dynamic partitions correctly.

One problem I doin't know how to solve correctly in general case using current 
unbounded source API which would use fixed partitions is about watermark for 
empty splits: 

Say topics_x a single partition as it was assigned to split_a at runtime. Now 
lets say topic_x is deleted later and there no more partitions assigned to 
split_a. What should the watermark reported by split_a? If we knew that no more 
partitions assigned to split_a ever in future, we could report MAX_TIME for 
watermark.

If there is no work around for this, we can support adding partitions and 
topics dynamically at runtime but should require that they be not deleted. With 
that it is a fairly straight forward implementation. 

> KafkaIO should support topic patterns
> -------------------------------------
>
>                 Key: BEAM-5786
>                 URL: https://issues.apache.org/jira/browse/BEAM-5786
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-kafka
>            Reporter: Alexey Romanenko
>            Assignee: Alexey Romanenko
>            Priority: Major
>
> For the moment, it is possible to set topics for KafkaIO.Read in 2 ways - one 
> single topic by using {{withTopic(String)}} or list of topics by using 
> {{withTopics(List<String>)}}. 
> It would make sense to provide another way for this using patterns, based on 
> regular expressions, like {{withTopics(“org_[0-9]+_source”)}}. In this case, 
> Kafka consumer could discover the topics based on user provided pattern.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to