[
https://issues.apache.org/jira/browse/FLINK-15670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17050031#comment-17050031
]
Yuan Mei commented on FLINK-15670:
----------------------------------
[~sewen]
Need to chat a bit about two things:
# Redefine the scope of the problem, at least for 1.11;
# Watermark handling when multiple subtasks write to the same partition
** This is a common problem for intermediate persistence, not just for Kafka
** The current mechanism relies on the downstream `ExecutionVertex` to advance
the watermark. However, a sink has no downstream operator.
** I was thinking that if there were a coordinator for all subtasks of an
`ExecutionJobVertex`, the watermark progress logic could be handled in that
coordinator
** I found an interface, `OperatorCoordinator`, that might be usable in this
case, but its only two usages are under `test`
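
To make the coordinator idea concrete, here is a minimal sketch, assuming each
sink subtask reports its latest watermark to a single per-partition tracker and
that the partition watermark is the minimum over all subtasks. The class
`PartitionWatermarkTracker` and its methods are hypothetical names for
illustration only; they are not part of Flink and do not use the
`OperatorCoordinator` interface.

```java
// Hypothetical sketch, not Flink API: combines the watermarks reported by all
// subtasks that write to the same partition.
final class PartitionWatermarkTracker {
    // Latest watermark reported by each subtask, indexed by subtask index.
    private final long[] subtaskWatermarks;
    // Combined watermark for the partition: the minimum over all subtasks.
    private long partitionWatermark = Long.MIN_VALUE;

    PartitionWatermarkTracker(int numSubtasks) {
        if (numSubtasks <= 0) {
            throw new IllegalArgumentException("numSubtasks must be positive");
        }
        subtaskWatermarks = new long[numSubtasks];
        // Until a subtask reports, it holds the combined watermark back.
        java.util.Arrays.fill(subtaskWatermarks, Long.MIN_VALUE);
    }

    /**
     * Records a watermark report from one subtask.
     * Returns true if the combined partition watermark advanced.
     */
    boolean report(int subtaskIndex, long watermark) {
        // A subtask's watermark never moves backwards; stale reports are ignored.
        if (watermark > subtaskWatermarks[subtaskIndex]) {
            subtaskWatermarks[subtaskIndex] = watermark;
        }
        long min = Long.MAX_VALUE;
        for (long w : subtaskWatermarks) {
            min = Math.min(min, w);
        }
        if (min > partitionWatermark) {
            partitionWatermark = min;
            return true;
        }
        return false;
    }

    long current() {
        return partitionWatermark;
    }
}
```

With two subtasks, a report of 100 from subtask 0 alone does not advance the
partition watermark; once subtask 1 reports 50, the combined watermark becomes
50. Whether such logic belongs in a coordinator or elsewhere is exactly the open
question above.
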
> Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's
> KeyGroups
> -------------------------------------------------------------------------------------
>
> Key: FLINK-15670
> URL: https://issues.apache.org/jira/browse/FLINK-15670
> Project: Flink
> Issue Type: New Feature
> Components: API / DataStream, Connectors / Kafka
> Reporter: Stephan Ewen
> Priority: Major
> Labels: usability
> Fix For: 1.11.0
>
>
> This Source/Sink pair would serve two purposes:
> 1. You can read topics that are already partitioned by key and process them
> without partitioning them again (avoid shuffles)
> 2. You can use this to shuffle through Kafka, thereby decomposing the job
> into smaller jobs and independent pipelined regions that fail over
> independently.