Thanks for the clarification, Yong. Sounds good to me.

Best,
Xintong

On Wed, Feb 12, 2025 at 10:15 AM wj wang <hongli....@gmail.com> wrote:

> To @Yanquan Lv
> The concept of 'Region' comes from Blink and older Flink versions. Now
> 'Region' means the `Restart Pipelined Region Failover Strategy` in Flink.
>
> On Wed, Feb 12, 2025 at 9:35 AM Yanquan Lv <decq12y...@gmail.com> wrote:
> >
> > Hi Fang Yong. Thanks for driving it.
> > Adding an alternative way to complete the global commit is good for me.
> >
> > I still have a confusion similar to xiangyu feng's:
> > > This committer node will connect all the tasks in the Flink job
> > > together, and all the tasks are within one region. As a result, when
> > > any task fails, it will trigger a global failover of the Flink job.
> > Why would the failure of a subtask have an impact on other subtasks?
> > The explanation of how a region is formed in the Flink documentation is
> > not particularly detailed. Can you explain more about it?
> >
> > On Feb 11, 2025, at 16:55, Yong Fang <zjur...@gmail.com> wrote:
> > >
> > > Thanks for all the feedback.
> > >
> > > To @xiangyu feng:
> > > 1. Regarding region failover, I'm referring to the `Restart Pipelined
> > > Region Failover Strategy` [1] in Flink.
> > > 2. This feature mainly enables support for region failover, enhancing
> > > stability. Of course, in practical use, users can configure the JM to
> > > provide more suitable resources for the committer node.
> > >
> > > To @Xintong Song:
> > > It doesn't matter if notifyComplete fails to be called for one
> > > checkpoint. Since the data is stored in the coordinator's HDFS path,
> > > when notifyComplete is called for the next CP, the data files
> > > generated by the two CPs will be merged to create a Paimon snapshot.
> > >
> > > To @Yunfeng Zhou:
> > > The pressure on the JM (JobManager) comes from two aspects:
> > > Message communication pressure: it is determined by the checkpoint
> > > (CP) interval and the number of sinks. Each sink generates one
> > > message in each CP. If there are 5000 sinks in total, the JM will
> > > receive 5000 messages in one CP, which may take a few seconds.
> > > Pressure from executing table commits: a Paimon snapshot must be
> > > created, and compaction may even be triggered. Therefore,
> > > asynchronous threads should be considered for these operations to
> > > avoid blocking the main thread of the JM.
> > >
> > > To @Jingsong and @wj wang:
> > > When using Flink SinkV2 to write to Paimon, a global committer node
> > > will also be generated in the physical execution plan to create and
> > > manage Paimon snapshots. Therefore, the same problem arises.
> > >
> > > [1]
> > > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/task_failure_recovery/#restart-pipelined-region-failover-strategy
> > >
> > > Best,
> > > Fang Yong
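To make the mechanics above concrete, the following is a minimal plain-Java
sketch of the merge-on-notify and asynchronous-commit behavior Fang Yong
describes. It intentionally avoids the real Flink OperatorCoordinator and
Paimon interfaces; `WriterCoordinatorSketch`, `handleCommitMessage`, and
`commitToPaimon` are hypothetical names, and plain file-path strings stand in
for Paimon committables:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    /** Hypothetical stand-in for the proposed Paimon Writer Coordinator. */
    public class WriterCoordinatorSketch {

        // Data files reported by writer subtasks, keyed by checkpoint id.
        // Assumed to be populated from the coordinator's single event thread.
        private final ConcurrentSkipListMap<Long, List<String>> pending =
                new ConcurrentSkipListMap<>();

        // Commits run on their own thread so snapshot creation (and possible
        // compaction) never blocks the JM main thread.
        private final ExecutorService commitExecutor =
                Executors.newSingleThreadExecutor();

        // One message per sink subtask per checkpoint, as discussed above.
        public void handleCommitMessage(long checkpointId, String dataFile) {
            pending.computeIfAbsent(checkpointId, id -> new ArrayList<>())
                    .add(dataFile);
        }

        // If notifyComplete was missed for an earlier checkpoint, its files
        // are still pending here and get merged into this commit, producing
        // one Paimon snapshot that covers both checkpoints.
        public void notifyCheckpointComplete(long checkpointId) {
            List<String> merged = new ArrayList<>();
            pending.headMap(checkpointId, true).values().forEach(merged::addAll);
            pending.headMap(checkpointId, true).clear();
            if (!merged.isEmpty()) {
                commitExecutor.execute(() -> commitToPaimon(checkpointId, merged));
            }
        }

        private void commitToPaimon(long checkpointId, List<String> files) {
            // Hypothetical commit: create one Paimon snapshot from the files.
            System.out.printf("snapshot at cp %d with %d files%n",
                    checkpointId, files.size());
        }
    }

Whether a single commit thread keeps up when thousands of sinks report per
checkpoint is exactly the sizing question Yunfeng raises below.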
> > > On Tue, Feb 11, 2025 at 11:26 AM wj wang <hongli....@gmail.com> wrote:
> > >
> > > > Hi Yong, thanks for driving this PIP.
> > > > I have a small question:
> > > > Why not use Flink SinkV2 instead of moving the committer logic into
> > > > JM's OperatorCoordinator?
> > > >
> > > > On Tue, Feb 11, 2025 at 10:34 AM Jingsong Li
> > > > <jingsongl...@gmail.com> wrote:
> > > > >
> > > > > Thanks Yong for driving this PIP!
> > > > >
> > > > > > Currently, there will be a global committer node in a Flink
> > > > > > Paimon job which is used to ensure the consistency of written
> > > > > > data in Paimon. This committer node will connect all the tasks
> > > > > > in the Flink job together, and all the tasks are within one
> > > > > > region. As a result, when any task fails, it will trigger a
> > > > > > global failover of the Flink job. We use HDFS as the remote
> > > > > > storage, and we often encounter situations where a global
> > > > > > failover of the job is triggered by write timeouts or errors
> > > > > > when writing to HDFS, which causes quite a few stability issues.
> > > > >
> > > > > I know that Flink SinkV2 also commits through a regular node;
> > > > > does this mean that SinkV2 has this drawback as well?
> > > > >
> > > > > Best,
> > > > > Jingsong
> > > > >
> > > > > On Mon, Feb 10, 2025 at 4:06 PM Yunfeng Zhou
> > > > > <flink.zhouyunf...@gmail.com> wrote:
> > > > > >
> > > > > > Hi Yong,
> > > > > >
> > > > > > The general idea looks good to me. Are there any statistics on
> > > > > > the number of operator events that need to be transmitted
> > > > > > between the coordinator and the writer operators? This
> > > > > > information could help provide estimates of the additional
> > > > > > workload on the JM, preventing the JM from becoming a single
> > > > > > bottleneck for the throughput of Paimon sinks.
> > > > > >
> > > > > > On Jan 23, 2025, at 17:44, Yong Fang <zjur...@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > I would like to start a discussion about PIP-30: Improvement
> > > > > > > For Paimon Committer In Flink [1].
> > > > > > >
> > > > > > > Currently, Flink writes data to Paimon based on two-phase
> > > > > > > commit, which generates a global committer node and connects
> > > > > > > all tasks into one region. If any task fails, it leads to a
> > > > > > > global failover of the Flink job.
> > > > > > >
> > > > > > > To solve this issue, we would like to introduce a Paimon
> > > > > > > Writer Coordinator to perform the table commit operation,
> > > > > > > enabling Flink Paimon jobs to support region failover and
> > > > > > > improving stability.
> > > > > > >
> > > > > > > Looking forward to hearing from you, thanks!
> > > > > > >
> > > > > > > [1]
> > > > > > > https://cwiki.apache.org/confluence/display/PAIMON/PIP-30%3A+Improvement+For+Paimon+Committer+In+Flink
> > > > > > >
> > > > > > > Best,
> > > > > > > Fang Yong
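As background on the failover strategy referenced throughout this thread: the
`Restart Pipelined Region Failover Strategy` [1] is selected via Flink's
`jobmanager.execution.failover-strategy` option (`region` is the default in
recent Flink versions). Below is a minimal sketch of setting it
programmatically for a local run; on a cluster the option normally lives in
the Flink configuration file, and the pipeline shown is only a placeholder:

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RegionFailoverExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Restart only the pipelined region containing the failed task
            // instead of the whole job; a job-wide committer node defeats
            // this by pulling every task into one region.
            conf.setString("jobmanager.execution.failover-strategy", "region");

            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment(conf);
            // ... build the Paimon writer/sink pipeline on `env` here ...
            env.fromSequence(0, 10).print(); // placeholder pipeline
            env.execute("region-failover-sketch");
        }
    }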