[jira] [Commented] (FLINK-34704) Process checkpoint barrier in AsyncWaitOperator when the element queue is full

Zakelly Lan (Jira) Thu, 11 Apr 2024 01:46:04 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-34704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836063#comment-17836063
 ]


Zakelly Lan commented on FLINK-34704:
-------------------------------------

Actually, even if we don't checkpoint while it is calling {{yield()}}, the 
checkpoint still happens while some user code is running. You may see my 
investigation on the ML ( 
[https://lists.apache.org/thread/4f7ywn29kdv4302j2rq3fkxc6pc8myr2] ), where all 
the records in the queue have already invoked `asyncInvoke` and are waiting for 
their results. At that moment a checkpoint barrier mail arrives, and it is able 
to trigger and finish the checkpoint normally. The key point is that the user 
code is stateless and only reads some external databases, so the checkpoint can 
proceed in the meantime.
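
To make the situation above concrete, here is a minimal toy model. It is not 
Flink code: the class {{MailboxSketch}} and all of its methods are hypothetical 
names made up for this illustration, and the mailbox is a single FIFO with no 
priorities, so its yield simply runs whichever mail is next. That is closer to 
the `yield` proposed in the ticket description than to anything I can claim 
about the real mailbox executor. It only shows a checkpoint mail running while 
every slot in the queue is held by a record that has already called 
`asyncInvoke`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: these are NOT Flink classes, and every name here is hypothetical.
class MailboxSketch {
    private final Deque<Runnable> mailbox = new ArrayDeque<>(); // pending mails, FIFO
    private final List<String> inFlight = new ArrayList<>();    // records whose async call has started
    private final int capacity;                                 // size of the element queue
    final List<String> log = new ArrayList<>();                 // what happened, in order

    MailboxSketch(int capacity) {
        this.capacity = capacity;
    }

    // A record enters the operator; if the queue is full, run mails until a slot frees up.
    void processElement(String record) {
        while (inFlight.size() >= capacity) {
            yieldOnce();
        }
        inFlight.add(record);
        log.add("asyncInvoke " + record);
    }

    // Stand-in for a yield: run exactly one pending mail on the task thread.
    private void yieldOnce() {
        Runnable mail = mailbox.poll();
        if (mail == null) {
            throw new IllegalStateException("nothing to run; a real task would block here");
        }
        mail.run();
    }

    // Mail enqueued when a checkpoint barrier arrives.
    void enqueueCheckpoint(long id) {
        mailbox.add(() -> log.add("checkpoint " + id + " with in-flight " + inFlight));
    }

    // Mail enqueued when the external system answers for a record.
    void enqueueCompletion(String record) {
        mailbox.add(() -> {
            inFlight.remove(record);
            log.add("complete " + record);
        });
    }

    static List<String> demo() {
        MailboxSketch op = new MailboxSketch(2);
        op.processElement("r1");
        op.processElement("r2");    // queue is now full
        op.enqueueCheckpoint(1L);   // barrier mail arrives first
        op.enqueueCompletion("r1"); // then the first async result
        op.processElement("r3");    // yields: runs the checkpoint, then the completion
        return op.log;
    }
}
```

In {{demo()}}, the checkpoint mail runs while both r1 and r2 are still waiting 
for their results, and r3 is only admitted after r1 completes. Because the 
simulated user code keeps no state of its own, nothing in the model stops the 
checkpoint from finishing at that point.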

I agree that the {{StreamTask}} should prioritize taking a checkpoint over 
processing mails that are already enqueued. But this ticket actually 
introduces a possible optimization for one very specific scenario only. As for 
the original problem described on the ML, I don't think FLINK-35051 would 
resolve it. Still, I do think FLINK-35051 is more important, and we could pay 
more attention to it.

> Process checkpoint barrier in AsyncWaitOperator when the element queue is full
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-34704
>                 URL: https://issues.apache.org/jira/browse/FLINK-34704
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>            Reporter: Zakelly Lan
>            Priority: Minor
>
> As discussed in 
> https://lists.apache.org/thread/4f7ywn29kdv4302j2rq3fkxc6pc8myr2 , it may 
> be better to provide a new `yield` that can process low-priority mail in 
> the mailbox executor. More discussion is needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
