[ 
https://issues.apache.org/jira/browse/BEAM-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655604#comment-16655604
 ] 

Raghu Angadi edited comment on BEAM-5786 at 10/18/18 5:16 PM:
--------------------------------------------------------------

Sounds good.

The more I think about the completely dynamic case, the more I think it needs 
to be an SDF (splittable DoFn) to handle dynamic partitions correctly.

One problem I don't know how to solve correctly in the general case, using the 
current unbounded source API that requires a fixed number of splits, is the 
watermark for empty splits: 

Say {{topic_X}} has one partition and it is assigned to {{split_A}} at runtime. 
Now {{topic_X}} gets deleted, and let's say there are no more partitions 
assigned to {{split_A}}. What should be the watermark reported by {{split_A}}? 
If we knew that no more partitions would ever be assigned to {{split_A}} in the 
future, we could report MAX_TIME.

If there is no workaround for this, we can support adding partitions and 
topics dynamically at runtime but should require that they not be deleted. With 
that restriction, it is a fairly straightforward implementation. 
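
To make the empty-split question concrete, here is a minimal sketch in plain Java. It is not the KafkaIO implementation and does not use the Beam API; the class {{SplitWatermarkSketch}}, its methods, and the {{sealed}} flag are hypothetical names for illustration only. It assumes a split reports the minimum watermark of its assigned partitions, and shows the two choices for an empty split: hold the last reported value (safe, but it stalls event-time progress downstream), or report MAX_TIME (only safe if we somehow know no partition will ever be assigned again).

```java
import java.time.Instant;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of one fixed split; NOT the KafkaIO implementation. */
final class SplitWatermarkSketch {
    // Stand-in for the MAX_TIME watermark mentioned in the comment above.
    static final Instant MAX_TIME = Instant.MAX;

    private final Map<String, Instant> partitionWatermarks = new HashMap<>();
    private Instant lastReported = Instant.EPOCH;
    // Assumption: set only if we know no partition will ever be assigned again.
    private boolean sealed = false;

    /** Assigns a partition to this split, or updates its watermark. */
    void update(String partition, Instant watermark) {
        if (sealed) {
            throw new IllegalStateException("split is sealed");
        }
        partitionWatermarks.put(partition, watermark);
    }

    /** Removes a partition, e.g. because its topic was deleted. */
    void remove(String partition) {
        partitionWatermarks.remove(partition);
    }

    void seal() {
        sealed = true;
    }

    Instant watermark() {
        if (partitionWatermarks.isEmpty()) {
            // Empty split: MAX_TIME is only correct when sealed. Otherwise we
            // can only hold the last value, which stalls downstream progress.
            return sealed ? MAX_TIME : lastReported;
        }
        Instant min = Collections.min(partitionWatermarks.values());
        // A reported watermark must never move backwards.
        if (min.isAfter(lastReported)) {
            lastReported = min;
        }
        return lastReported;
    }

    public static void main(String[] args) {
        SplitWatermarkSketch splitA = new SplitWatermarkSketch();
        splitA.update("topic_X-0", Instant.parse("2018-10-18T17:00:00Z"));
        splitA.remove("topic_X-0");             // topic_X deleted
        System.out.println(splitA.watermark()); // held at the last reported value
    }
}
```

With a fixed number of splits there is no obvious point at which {{seal()}} could be called, which is the difficulty described above.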



> KafkaIO should support topic patterns
> -------------------------------------
>
>                 Key: BEAM-5786
>                 URL: https://issues.apache.org/jira/browse/BEAM-5786
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-kafka
>            Reporter: Alexey Romanenko
>            Assignee: Alexey Romanenko
>            Priority: Major
>
> For the moment, it is possible to set topics for KafkaIO.Read in two ways: a 
> single topic by using {{withTopic(String)}}, or a list of topics by using 
> {{withTopics(List<String>)}}. 
> It would make sense to provide another way to do this using patterns based on 
> regular expressions, like {{withTopics("org_[0-9]+_source")}}. In this case, the 
> Kafka consumer could discover the topics based on a user-provided pattern.
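
As a rough illustration of the pattern matching proposed in the description, the sketch below filters a list of topic names with {{java.util.regex.Pattern}}. It uses only the Java standard library; the class {{TopicPatternSketch}}, the method {{matchingTopics}}, and all topic names other than the example pattern are made up for illustration, and nothing here is an existing KafkaIO API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper; not part of KafkaIO or the Kafka client. */
final class TopicPatternSketch {

    /** Returns the topics whose full name matches the pattern, sorted by name. */
    static List<String> matchingTopics(Pattern pattern, List<String> allTopics) {
        return allTopics.stream()
                .filter(topic -> pattern.matcher(topic).matches())
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("org_[0-9]+_source");
        // Made-up topic names standing in for what a cluster might list.
        List<String> topics = Arrays.asList(
                "org_42_source", "org_x_source", "org_1_source", "org_7_sink");
        // prints [org_1_source, org_42_source]
        System.out.println(matchingTopics(pattern, topics));
    }
}
```

Because new topics matching the pattern can appear at any time, the result of such a match has to be recomputed periodically, which is what makes the set of partitions dynamic.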



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
