yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-501997307
@StefanRRichter I have fixed most of the issues. Please review again.
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-501568073
Hi @StefanRRichter thanks for your suggestion. I have fixed some issues.
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-501212069
Hi @StefanRRichter I have fixed the conflicts and rebased the code. When you
have time,
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-496903602
Hi @StefanRRichter I think I have fixed the existing issues. I have triggered
the Travis
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-496101546
Hi @StefanRRichter In the last few days, I did further refactoring, including:
* remove
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-495915181
Hi @StefanRRichter I understand what you mean and have given a reply. I
think this PR has a
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-495483854
> @StefanRRichter After thinking seriously, my point of view has changed.
>
> There are
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-495481836
> About the problem with the SQL test, having the detailed logs from JM/TMs
would be helpful.
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-495081839
@StefanRRichter I happened to hit an exception in Kafka's end-to-end
test:
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-495066930
@klion26 Thanks for your review suggestion. What do you think about the new
change?
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-494781182
The essential difference between `failOnCheckpointingErrors` and
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-494394030
@StefanRRichter OK, if it is unrelated, we can ignore it. In addition, what
do you think
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493837160
Triggered again; it still hits the SQL deadlock:
https://travis-ci.org/apache/flink/jobs/534620910
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493826613
@StefanRRichter After thinking seriously, my point of view has changed.
There are two
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493817814
Retriggered once; the build failed. Reason: deadlock (SQL). Details:
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493402537
For a clearer analysis, I plan to do three steps:
* Rebase master branch
* Remove
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493345060
@StefanRRichter So far, the test problems caused by this PR have been fixed,
but Travis has
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-493080226
@StefanRRichter I have updated the PR. What do you think about the new commit and the
whole PR?
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-492885118
Now, it seems Travis's failure is not related to this PR.
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-492636770
I have fixed the conflicts. In addition, I have refactored
`CheckpointCoordinator`'s constructor
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-492620693
@StefanRRichter Another PR of mine, #8410, which was just merged, caused many conflicts.
The reason is
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-492471326
Unrelated test error. I have created an issue to report it and will trigger a
rebuild.
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-491497620
@StefanRRichter Sorry, I still have not provided a final implementation for this
issue. It's a
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-490365680
@StefanRRichter Really sorry for the late reply, I just took a holiday and
attended QCon
yanghua commented on issue #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#issuecomment-487947096
@StefanRRichter This is the PR for the second step of the checkpoint failure
process improvement.