[
https://issues.apache.org/jira/browse/BEAM-6751?focusedWorklogId=205900&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-205900
]
ASF GitHub Bot logged work on BEAM-6751:
----------------------------------------
Author: ASF GitHub Bot
Created on: 28/Feb/19 18:01
Start Date: 28/Feb/19 18:01
Worklog Time Spent: 10m
Work Description: rangadi commented on issue #7955: [BEAM-6751] Extend
Kafka EOS mode whitelist / Warn instead of throw
URL: https://github.com/apache/beam/pull/7955#issuecomment-468373888
@mxm, warning implies that the transform would run, but not provide EOS
guarantees on Flink, right? Is that ok? IOW, EOS is ignored?
The incompatibility with Flink was discussed on the dev list in Aug 2017 when we
added EOS. I am familiar with the 2PC commit utility that was added to Flink
around that same time to support EOS for the Kafka producer. I had commented on
those PRs at that time.
We (including Aljoscha) could not see how 2PC model could be implemented for
Flink in Beam context. See
https://www.mail-archive.com/[email protected]/msg02664.html. This is due to a
fundamental difference between Flink's horizontal checkpointing and the
per-stage checkpointing in Dataflow and other runners. Has anything changed in
that respect?
@kennknowles, this was added even before @RequiresStableInput was finalized
in Beam. If there is a better way to achieve the same guarantees, we should
improve the implementation. I agree, checking the runner class is a hack, but
we thought that was an acceptable compromise compared to not implementing EOS
at all. This was in Aug 2017; if there is a better way to do it, I don't mind
reimplementing it (or working with another contributor to reimplement it).
Issue Time Tracking
-------------------
Worklog Id: (was: 205900)
Time Spent: 1h 20m (was: 1h 10m)
> KafkaIO blocks FlinkRunner in EOS mode
> --------------------------------------
>
> Key: BEAM-6751
> URL: https://issues.apache.org/jira/browse/BEAM-6751
> Project: Beam
> Issue Type: Bug
> Components: io-java-kafka, runner-flink
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Critical
> Fix For: 2.12.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> KafkaIO has a validation check which whitelists certain runners capable of
> providing exactly-once semantics:
> {noformat}
> if ("org.apache.beam.runners.direct.DirectRunner".equals(runner)
>     || runner.startsWith("org.apache.beam.runners.dataflow.")
>     || runner.startsWith("org.apache.beam.runners.spark.")) {
> ...
> {noformat}
> The FlinkRunner supports exactly-once checkpointing but is blocked from using
> Kafka's exactly-once mode.
> I wonder if such a list is easily maintainable? I think we should replace the
> list with a warning instead.
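
The proposal in the description, warning rather than throwing when the runner is not on the list, could look roughly like the sketch below. The class `EosRunnerCheck` and its method names are hypothetical and chosen for illustration; this is not KafkaIO's actual code, and only the three runner strings are taken from the snippet quoted above.

```java
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of KafkaIO.
final class EosRunnerCheck {
  private static final Logger LOG = Logger.getLogger(EosRunnerCheck.class.getName());

  // Same condition as the whitelist quoted in the issue description.
  static boolean isWhitelisted(String runner) {
    return "org.apache.beam.runners.direct.DirectRunner".equals(runner)
        || runner.startsWith("org.apache.beam.runners.dataflow.")
        || runner.startsWith("org.apache.beam.runners.spark.");
  }

  // Logs a warning instead of throwing, so the pipeline still starts.
  // Returns true if the runner was on the whitelist, false otherwise.
  static boolean warnIfNotWhitelisted(String runner) {
    boolean whitelisted = isWhitelisted(runner);
    if (!whitelisted) {
      LOG.warning(
          runner
              + " is not on the list of runners known to support exactly-once sinks;"
              + " EOS guarantees may not hold.");
    }
    return whitelisted;
  }

  public static void main(String[] args) {
    // Emits a warning for a runner that is not on the list, but does not fail.
    warnIfNotWhitelisted("org.apache.beam.runners.flink.FlinkRunner");
  }
}
```

Note that, as raised in the comment above, a warning only lets the transform run; by itself it does not make the sink exactly-once on a runner that lacks the required guarantees.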