[
https://issues.apache.org/jira/browse/BEAM-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655604#comment-16655604
]
Raghu Angadi edited comment on BEAM-5786 at 10/18/18 5:15 PM:
--------------------------------------------------------------
Sounds good.
The more I think about the completely dynamic case, the more I think it needs
to be an SDF to handle the dynamic partitions correctly.
One problem I don't know how to solve correctly in the general case, using the
current unbounded source API that requires a fixed number of splits, is the
watermark for empty splits:
Say {{topic_X}} has one partition and it is assigned to {{split_A}} at
runtime. Now {{topic_X}} gets deleted and, let's say, there are no more
partitions assigned to {{split_A}}. What should be the watermark reported by
{{split_A}}? If we knew that no more partitions would ever be assigned to
{{split_A}} in the future, we could report MAX_TIME.
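To make the empty-split question concrete, here is a minimal sketch. It assumes a split's watermark is the minimum over the watermarks of its assigned partitions. {{SplitWatermark}}, {{watermark}} and the {{assignmentIsFinal}} flag are hypothetical names for illustration, not part of KafkaIO or the unbounded source API.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of KafkaIO: derives the watermark a split
// would report from the watermarks of the partitions assigned to it.
final class SplitWatermark {
  // Stand-in for the MAX_TIME mentioned above (an assumption for this sketch).
  static final Instant MAX_TIME = Instant.MAX;

  private SplitWatermark() {}

  // Non-empty split: the minimum partition watermark.
  // Empty split known to stay empty: MAX_TIME.
  // Empty split that may get partitions later: Optional.empty(), which is
  // exactly the case with no obviously correct answer.
  static Optional<Instant> watermark(
      List<Instant> partitionWatermarks, boolean assignmentIsFinal) {
    if (!partitionWatermarks.isEmpty()) {
      return partitionWatermarks.stream().min(Comparator.naturalOrder());
    }
    return assignmentIsFinal ? Optional.of(MAX_TIME) : Optional.empty();
  }
}
```

The {{Optional.empty()}} branch is the open problem: a fixed-split source has no way to know that {{assignmentIsFinal}} is true.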
If there is no workaround for this, we can support adding partitions and
topics dynamically at runtime but should require that they not be deleted. With
that restriction, it is a fairly straightforward implementation.
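One possible shape for the add-only case, purely as an assumption: map each (topic, partition) pair to a split by hashing, so that newly discovered partitions never move existing ones. {{StableSplitAssignment}} and {{splitFor}} are hypothetical names, and this is not how KafkaIO currently assigns partitions.

```java
// Hypothetical sketch of a stable, add-only assignment of partitions to a
// fixed number of splits. Not KafkaIO's actual assignment logic.
final class StableSplitAssignment {
  private StableSplitAssignment() {}

  // The result depends only on the pair itself and numSplits, so adding
  // other topics or partitions later does not change it.
  static int splitFor(String topic, int partition, int numSplits) {
    int hash = 31 * topic.hashCode() + partition;
    // floorMod keeps the index in [0, numSplits) even for negative hashes.
    return Math.floorMod(hash, numSplits);
  }
}
```

Hashing does not guarantee that every split gets a partition, so the empty-split watermark question can still come up for splits that have never been assigned anything.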
was (Author: rangadi):
Sounds good.
The more I think about the completely dynamic case, the more I think it needs
to be an SDF to handle the dynamic partitions correctly.
One problem I don't know how to solve correctly in the general case, using the
current unbounded source API, which would use fixed partitions, is the
watermark for empty splits:
Say {{topic_X}} has one partition and it is assigned to {{split_A}} at
runtime. Now {{topic_X}} gets deleted and, let's say, there are no more
partitions assigned to {{split_A}}. What should be the watermark reported by
{{split_A}}? If we knew that no more partitions would ever be assigned to
{{split_A}} in the future, we could report MAX_TIME.
If there is no workaround for this, we can support adding partitions and
topics dynamically at runtime but should require that they not be deleted. With
that restriction, it is a fairly straightforward implementation.
> KafkaIO should support topic patterns
> -------------------------------------
>
> Key: BEAM-5786
> URL: https://issues.apache.org/jira/browse/BEAM-5786
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kafka
> Reporter: Alexey Romanenko
> Assignee: Alexey Romanenko
> Priority: Major
>
> Currently, topics for KafkaIO.Read can be set in two ways: a single topic
> using {{withTopic(String)}}, or a list of topics using
> {{withTopics(List<String>)}}.
> It would make sense to provide another way using patterns based on
> regular expressions, like {{withTopics("org_[0-9]+_source")}}. In this case,
> the Kafka consumer could discover the topics based on the user-provided
> pattern.
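As a rough illustration of the pattern-based discovery described in the issue, the sketch below filters a list of known topic names with {{java.util.regex.Pattern}}. {{TopicPatternFilter}} and {{matchingTopics}} are hypothetical names. How the topic list is obtained from Kafka, and what the final KafkaIO method would look like, are left open here.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not a KafkaIO API: selects the topics whose full name
// matches a user-provided regular expression.
final class TopicPatternFilter {
  private TopicPatternFilter() {}

  static List<String> matchingTopics(List<String> allTopics, Pattern pattern) {
    return allTopics.stream()
        // matches() requires the whole topic name to match the pattern.
        .filter(topic -> pattern.matcher(topic).matches())
        // Sort so the result does not depend on discovery order.
        .sorted()
        .collect(Collectors.toList());
  }
}
```

With the pattern from the description, {{org_[0-9]+_source}}, the names {{org_1_source}} and {{org_22_source}} would be selected, while {{org_x_source}} would not.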