[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-13698:
---
    Priority: Not a Priority (was: Minor)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Not a Priority
> Labels: auto-deprioritized-critical, auto-deprioritized-major, stale-minor
>
> Currently {{CheckpointCoordinator}} and {{CheckpointFailureManager}} code is
> executed by multiple different threads (mostly {{ioExecutor}}, but not only).
> It's causing multiple concurrency issues, for example:
> https://issues.apache.org/jira/browse/FLINK-13497
> A proper fix would be to rethink the threading model there. At first glance it
> doesn't seem that this code should be multi-threaded, except for the parts doing
> the actual IO operations, so it should be possible to run everything in one
> single ExecutionGraph's thread and just run the necessary IO operations
> asynchronously with some feedback loop ("mailbox style").
> I would strongly recommend fixing this issue before adding new features in
> the {{CheckpointCoordinator}} component.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
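The "mailbox style" idea in the description above can be illustrated with a minimal Java sketch. This is not Flink code and not the fix that was proposed or implemented for this ticket: the class `MailboxStyleSketch`, the nested `Coordinator`, and the methods `triggerCheckpoint`, `writeMetadata`, and `onMetadataWritten` are all hypothetical names chosen for illustration. The sketch assumes only what the description states: coordinator state is touched by one single thread, blocking IO runs on a separate executor, and results are fed back to the single thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only; none of these names exist in Flink.
final class MailboxStyleSketch {

    static final String MAIN_THREAD_NAME = "coordinator-main-thread";

    /** Toy coordinator whose mutable state is only ever touched by mainThread. */
    static final class Coordinator {
        private final Executor mainThread;
        private final Executor ioExecutor;
        // Deliberately not thread-safe: safety comes from single-threaded access.
        private final List<String> completed = new ArrayList<>();

        Coordinator(Executor mainThread, Executor ioExecutor) {
            this.mainThread = mainThread;
            this.ioExecutor = ioExecutor;
        }

        CompletableFuture<Void> triggerCheckpoint(long checkpointId) {
            return CompletableFuture
                    // 1. The (simulated) blocking IO runs on the IO pool and
                    //    touches no coordinator state.
                    .supplyAsync(() -> writeMetadata(checkpointId), ioExecutor)
                    // 2. Feedback loop: the result is handed back to the
                    //    single main thread, which applies the state change.
                    .thenAcceptAsync(this::onMetadataWritten, mainThread);
        }

        private static String writeMetadata(long checkpointId) {
            return "chk-" + checkpointId; // stand-in for a real IO operation
        }

        private void onMetadataWritten(String location) {
            completed.add(location + " handled on " + Thread.currentThread().getName());
        }

        /** Reads the state on the main thread as well, so no locking is needed. */
        CompletableFuture<List<String>> snapshot() {
            return CompletableFuture.<List<String>>supplyAsync(
                    () -> new ArrayList<>(completed), mainThread);
        }
    }

    static List<String> runDemo() {
        ExecutorService mainThread =
                Executors.newSingleThreadExecutor(r -> new Thread(r, MAIN_THREAD_NAME));
        ExecutorService ioExecutor = Executors.newFixedThreadPool(4);
        try {
            Coordinator coordinator = new Coordinator(mainThread, ioExecutor);
            CompletableFuture<?>[] pending = new CompletableFuture<?>[3];
            for (int i = 0; i < pending.length; i++) {
                pending[i] = coordinator.triggerCheckpoint(i + 1);
            }
            CompletableFuture.allOf(pending).join();
            return coordinator.snapshot().join();
        } finally {
            mainThread.shutdown();
            ioExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        runDemo().forEach(System.out::println);
    }
}
```

Because every read and write of `completed` is submitted to the same single-threaded executor, the list needs no synchronization even though the IO stage runs on four threads. The order in which the three checkpoints are recorded may vary between runs, since the IO stage is concurrent.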
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-13698:
---
    Labels: auto-deprioritized-critical auto-deprioritized-major stale-minor (was: auto-deprioritized-critical auto-deprioritized-major)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Minor but is unassigned and neither itself nor its Sub-Tasks have been updated for 180 days. I have gone ahead and marked it "stale-minor". If this ticket is still Minor, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Minor
> Labels: auto-deprioritized-critical, auto-deprioritized-major, stale-minor
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-13698:
---
    Fix Version/s: (was: 1.15.0)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Minor
> Labels: auto-deprioritized-critical, auto-deprioritized-major
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xintong Song updated FLINK-13698:
-
    Fix Version/s: 1.15.0 (was: 1.14.0)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Minor
> Labels: auto-deprioritized-critical, auto-deprioritized-major
> Fix For: 1.15.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-13698:
---
    Labels: auto-deprioritized-critical auto-deprioritized-major (was: auto-deprioritized-critical stale-major)
    Priority: Minor (was: Major)

This issue was labeled "stale-major" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Minor
> Labels: auto-deprioritized-critical, auto-deprioritized-major
> Fix For: 1.14.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-13698:
---
    Labels: auto-deprioritized-critical stale-major (was: auto-deprioritized-critical)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Major but is unassigned and neither itself nor its Sub-Tasks have been updated for 30 days. I have gone ahead and added the "stale-major" label to the issue. If this ticket is a Major, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Major
> Labels: auto-deprioritized-critical, stale-major
> Fix For: 1.14.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-13698:
---
    Labels: auto-deprioritized-critical (was: stale-critical)
    Priority: Major (was: Critical)

This issue was labeled "stale-critical" 7 days ago and has not received any updates so it is being deprioritized. If this ticket is actually Critical, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Major
> Labels: auto-deprioritized-critical
> Fix For: 1.14.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-13698:
---
    Labels: stale-critical (was: )

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Critical but is unassigned and neither itself nor its Sub-Tasks have been updated for 7 days. I have gone ahead and marked it "stale-critical". If this ticket is critical, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Critical
> Labels: stale-critical
> Fix For: 1.14.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Wysakowicz updated FLINK-13698:
-
    Fix Version/s: 1.14.0 (was: 1.13.0)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Critical
> Fix For: 1.14.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-13698:
--
    Priority: Critical (was: Major)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Priority: Critical
> Fix For: 1.13.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Metzger updated FLINK-13698:
---
    Fix Version/s: 1.13.0 (was: 1.12.0)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Major
> Fix For: 1.13.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-13698:
---
    Fix Version/s: 1.11.0 (was: 1.10.0)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Blocker
> Fix For: 1.11.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-13698:
---
    Priority: Major (was: Blocker)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Major
> Fix For: 1.11.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-13698:
--
    Priority: Blocker (was: Critical)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Blocker
> Fix For: 1.10.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-13698:
--
    Fix Version/s: 1.10.0

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Critical
> Fix For: 1.10.0
[jira] [Updated] (FLINK-13698) Rework threading model of CheckpointCoordinator
[ https://issues.apache.org/jira/browse/FLINK-13698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Nowojski updated FLINK-13698:
---
    Issue Type: Improvement (was: Bug)

> Rework threading model of CheckpointCoordinator
> ---
>
> Key: FLINK-13698
> URL: https://issues.apache.org/jira/browse/FLINK-13698
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Affects Versions: 1.10.0
> Reporter: Piotr Nowojski
> Assignee: Biao Liu
> Priority: Critical