Hi Yong,

Big +1 for moving the committer logic into the operator coordinator.

Regarding the design, I have only one question: what happens if the JM fails
after a checkpoint is completed but before
`OperatorCoordinator#notifyCheckpointComplete()` is called?

The JavaDoc of CheckpointListener states that `notifyCheckpointComplete`
and `notifyCheckpointAborted` are best-effort. If a failure occurs directly
after the checkpoint is completed, the coordinator may not get the
notification. While this PIP already describes the scenario where an
uncompleted checkpoint is retried after recovery, IIUC a completed but
not-yet-notified checkpoint will not be re-notified. Does this mean that
`OperatorCoordinator#notifyCheckpointComplete()` may never be called for
that checkpoint, so the table metadata of the corresponding snapshot is
left incomplete?
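
For illustration, here is a rough, purely hypothetical sketch of one way this
could be handled (plain Java, only mirroring the relevant coordinator
callbacks, all names made up): keep the pending commits in the coordinator
checkpoint and replay them on reset, assuming the table commit can filter
already-committed identifiers.

import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; this is NOT the full OperatorCoordinator interface,
// it only mirrors the callbacks relevant to the question above.
public class CommitCoordinatorSketch {

    // Committables collected from writer subtasks, keyed by checkpoint id.
    private final TreeMap<Long, String> pendingCommits = new TreeMap<>();

    // Mirrors handleEventFromOperator: writers send the committables of the
    // in-flight checkpoint to the coordinator.
    void handleCommittable(long checkpointId, String committable) {
        pendingCommits.merge(checkpointId, committable, (a, b) -> a + "," + b);
    }

    // Mirrors checkpointCoordinator: pending commits become part of the
    // coordinator state, so they survive a JM failover.
    byte[] checkpointCoordinator(long checkpointId) {
        return pendingCommits.toString().getBytes(); // placeholder serialization
    }

    // Mirrors notifyCheckpointComplete: best effort only; if the JM fails
    // right after the checkpoint completes, this may never be called.
    void notifyCheckpointComplete(long checkpointId) {
        commitUpTo(checkpointId);
    }

    // Mirrors resetToCheckpoint: after recovery, re-attempt every commit at
    // or below the restored checkpoint id; duplicates would have to be
    // filtered by the table commit via the commit identifier.
    void resetToCheckpoint(long checkpointId, byte[] checkpointData) {
        // deserialize checkpointData back into pendingCommits (omitted), then:
        commitUpTo(checkpointId);
    }

    private void commitUpTo(long checkpointId) {
        Map<Long, String> done = pendingCommits.headMap(checkpointId, true);
        done.forEach((id, committable) -> System.out.println("commit " + committable));
        done.clear();
    }
}

If re-committing on reset is not planned, my concern is that the metadata of
that snapshot stays incomplete forever.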

Best,

Xintong



On Wed, Feb 5, 2025 at 4:00 PM xiangyu feng <xiangyu...@gmail.com> wrote:

> Hi Yong,
>
> Thx for initiating this discussion. I like the idea of removing the global
> committer operator from the DAG to improve overall streaming write
> stability. I have the following questions:
>
>    1. The benefit of this improvement is that when Flink writes to an
>    append table without buckets, it can support region failover. However,
>    "region failover" seems like an internal concept to community users.
>    AFAIK, there is only one concept, "pipelined region", in Flink
>    scheduling, and there is no concept named "region" in Flink
>    checkpointing. I'm not sure these two concepts are equivalent. Maybe we
>    should add more descriptions here.
>    2. Besides supporting region failover, are there any other benefits to
>    this feature?
>
>
> Best,
> Xiangyu Feng
>
> Yong Fang <zjur...@gmail.com> wrote on Fri, Jan 24, 2025 at 15:51:
>
> > Thanks Zhanghao, we can do this in Flink too.
> >
> > Best,
> > Fang Yong
> >
> > Zhanghao Chen <zhanghao.c...@outlook.com> wrote:
> >
> > > Hi Yong,
> > >
> > > Thanks for raising it! It is a common problem shared by all sinks using
> > > the global committer pattern. Would it be better to initiate a
> > > discussion in the Flink community as well?
> > >
> > > Best,
> > > Zhanghao Chen
> > > ________________________________
> > > From: Jingsong Li <jingsongl...@gmail.com>
> > > Sent: Thursday, January 23, 2025 20:38
> > > To: dev@paimon.apache.org <dev@paimon.apache.org>
> > > Subject: Re: [DISCUSS] PIP-30: Improvement For Paimon Committer In Flink
> > >
> > > Thanks Yong!
> > >
> > > Looks fantastic! We can discuss this in detail after the Spring Festival.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Thu, Jan 23, 2025 at 5:44 PM Yong Fang <zjur...@gmail.com> wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > I would like to start a discussion about PIP-30: Improvement For
> > > > Paimon Committer In Flink [1].
> > > >
> > > > Currently, Flink writes data to Paimon based on a Two-Phase Commit,
> > > > which generates a global committer node and connects all tasks into
> > > > one region. If any task fails, it leads to a global failover of the
> > > > Flink job.
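> > > >
> > > > To make the coupling concrete, here is a rough, hypothetical stand-in
> > > > topology (plain DataStream API, not the actual Paimon sink classes);
> > > > the all-to-one edge in front of the single committer subtask is what
> > > > ties every writer into one failover region:
> > > >
> > > > import org.apache.flink.api.common.typeinfo.Types;
> > > > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> > > >
> > > > public class GlobalCommitterTopologySketch {
> > > >     public static void main(String[] args) throws Exception {
> > > >         StreamExecutionEnvironment env =
> > > >                 StreamExecutionEnvironment.getExecutionEnvironment();
> > > >         env.fromSequence(0, 1000)
> > > >                 .map(id -> "committable-" + id).returns(Types.STRING)
> > > >                 .setParallelism(4)   // the parallel writer tasks
> > > >                 .global()            // funnel every output to one subtask
> > > >                 .map(c -> "commit " + c).returns(Types.STRING)
> > > >                 .setParallelism(1)   // the single global committer
> > > >                 .print();
> > > >         env.execute("global-committer-topology-sketch");
> > > >     }
> > > > }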
> > > >
> > > > To solve this issue, we would like to introduce a Paimon Writer
> > > > Coordinator to perform the table commit operation, enabling Flink
> > > > Paimon jobs to support region failover and improving stability.
> > > >
> > > > Looking forward to hearing from you, thanks!
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/PAIMON/PIP-30%3A+Improvement+For+Paimon+Committer+In+Flink
> > > >
> > > >
> > > > Best,
> > > > Fang Yong
> > >
> >
>
