[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id

2021-04-16 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17323277#comment-17323277
 ] 

Flink Jira Bot commented on FLINK-12514:


This issue is assigned but has not received an update in 7 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Refactor the failure checkpoint counting mechanism with ordered checkpoint id
> -
>
> Key: FLINK-12514
> URL: https://issues.apache.org/jira/browse/FLINK-12514
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Checkpointing
>Affects Versions: 1.9.0
>Reporter: vinoyang
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available, stale-assigned
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the checkpoint failure manager uses a simple counting mechanism 
> that does not track the checkpoint id sequence.
> However, a more graceful counting mechanism would be based on the ordered 
> checkpoint id sequence.
> It should be refactored after FLINK-12364 has been merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id

2019-08-16 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909089#comment-16909089
 ] 

vinoyang commented on FLINK-12514:
--

[~pnowojski] I will give a more detailed description. I called this a 
refactor because I have already implemented a simple counting mechanism based 
on {{AtomicInteger}}. The idea comes from the PR for FLINK-12364, where 
Stefan proposed it.
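
For readers of this thread, a counter of that kind could look roughly like the sketch below. It is only an illustration, not the code from the PR: the class name {{SimpleFailureCounter}}, its methods, and the tolerance parameter are hypothetical. It shows why such a counter ignores the checkpoint id sequence: any success resets the count, no matter which checkpoint id it belongs to.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of an id-agnostic failure counter; not Flink's actual class. */
final class SimpleFailureCounter {
    // Assumed configuration value: how many continuous failures are tolerated.
    private final int tolerableFailures;
    private final AtomicInteger continuousFailures = new AtomicInteger(0);

    SimpleFailureCounter(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    /** Records one failed checkpoint; returns true once the tolerance is exceeded. */
    boolean onCheckpointFailure() {
        return continuousFailures.incrementAndGet() > tolerableFailures;
    }

    /** Any success resets the counter, regardless of which checkpoint id succeeded. */
    void onCheckpointSuccess() {
        continuousFailures.set(0);
    }

    int currentFailures() {
        return continuousFailures.get();
    }
}
```

With concurrent checkpoints, a late success of an older checkpoint would reset this counter even if newer checkpoints have already failed.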

In any case, I totally agree with your comment and will rework the title and 
description.



[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id

2019-08-16 Thread Piotr Nowojski (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909072#comment-16909072
 ] 

Piotr Nowojski commented on FLINK-12514:


I've briefly checked your PR and it adds even more locking. I think that 
whatever we do about this feature has to be done after the FLINK-13698 
refactoring and after fixing bugs like FLINK-13497 caused by FLINK-12364.

Can you also add a better description to this ticket of what this change is 
about (what is it trying to fix or improve)? I also think the title is 
misleading, as this is not a refactor but a new feature.



[jira] [Commented] (FLINK-12514) Refactor the failure checkpoint counting mechanism with ordered checkpoint id

2019-06-19 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867498#comment-16867498
 ] 

vinoyang commented on FLINK-12514:
--

Hi [~pnowojski], over the last few days [~srichter] acted as my mentor and 
reviewed the PR for FLINK-12364. It has now been merged. Thanks to [~srichter] 
for the effort!

In that PR, we discussed and agreed that we need a more reasonable mechanism 
for counting concurrently failed checkpoints, one that takes the checkpoint id 
sequence into account.
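
One possible reading of that idea, sketched under assumptions: keep the failed checkpoint ids in sorted order and let a success clear only the failures with a smaller id. The class {{OrderedFailureCounter}} and its methods are hypothetical names, not an agreed design, and the {{synchronized}} methods are used only to keep the sketch short.

```java
import java.util.TreeSet;

/** Hypothetical sketch of failure counting that respects checkpoint id order. */
final class OrderedFailureCounter {
    // Assumed configuration value: how many failures since the last success are tolerated.
    private final int tolerableFailures;
    // Failed checkpoint ids newer than the last successful checkpoint, kept sorted.
    private final TreeSet<Long> failedIds = new TreeSet<>();
    private long lastSuccessfulId = -1L;

    OrderedFailureCounter(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    /** Records a failure; failures older than the last success are ignored. */
    synchronized boolean onCheckpointFailure(long checkpointId) {
        if (checkpointId > lastSuccessfulId) {
            failedIds.add(checkpointId);
        }
        return failedIds.size() > tolerableFailures;
    }

    /** A success only clears failures with a smaller checkpoint id. */
    synchronized void onCheckpointSuccess(long checkpointId) {
        if (checkpointId > lastSuccessfulId) {
            lastSuccessfulId = checkpointId;
            // headSet is a live view of ids strictly below checkpointId.
            failedIds.headSet(checkpointId).clear();
        }
    }

    synchronized int currentFailures() {
        return failedIds.size();
    }
}
```

In this sketch, if checkpoints 5 and 6 fail and checkpoint 4 then completes late, the two failures still count, whereas a plain counter would have been reset by that success.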

I would like to invite you to be my mentor on this issue. WDYT? cc [~till.rohrmann]
