[
https://issues.apache.org/jira/browse/BEAM-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655604#comment-16655604
]
Raghu Angadi edited comment on BEAM-5786 at 10/18/18 5:14 PM:
--------------------------------------------------------------
Sounds good.
More I think about completely dynamic case, I think it needs to be an SDF to
handle the dynamic partitions correctly.
One problem I doin't know how to solve correctly in general case using current
unbounded source API which would use fixed partitions is about watermark for
empty splits:
Say {{topics_X}} has a one partition and it is assigned to {{split_A}} at
runtime. Now {{topic_X}} gets deleted and lets say there are no more partitions
assigned to split_A. What should be the watermark reported by split_A? If we
knew that no more partitions assigned to split_A ever in future, we could
report MAX_TIME.
If there is no work around for this, we can support adding partitions and
topics dynamically at runtime but should require that they not be deleted. With
that, it is a fairly straight forward implementation.
was (Author: rangadi):
Sounds good.
More I think about completely dynamic case, I think it needs to be an SDF to
handle the dynamic partitions correctly.
One problem I doin't know how to solve correctly in general case using current
unbounded source API which would use fixed partitions is about watermark for
empty splits:
Say topics_x a single partition as it was assigned to split_a at runtime. Now
lets say topic_x is deleted later and there no more partitions assigned to
split_a. What should the watermark reported by split_a? If we knew that no more
partitions assigned to split_a ever in future, we could report MAX_TIME for
watermark.
If there is no work around for this, we can support adding partitions and
topics dynamically at runtime but should require that they be not deleted. With
that it is a fairly straight forward implementation.
> KafkaIO should support topic patterns
> -------------------------------------
>
> Key: BEAM-5786
> URL: https://issues.apache.org/jira/browse/BEAM-5786
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kafka
> Reporter: Alexey Romanenko
> Assignee: Alexey Romanenko
> Priority: Major
>
> For the moment, it is possible to set topics for KafkaIO.Read in 2 ways - one
> single topic by using {{withTopic(String)}} or list of topics by using
> {{withTopics(List<String>)}}.
> It would make sense to provide another way for this using patterns, based on
> regular expressions, like {{withTopics(“org_[0-9]+_source”)}}. In this case,
> Kafka consumer could discover the topics based on user provided pattern.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)