Re: Flink not restoring from checkpoint when job manager fails even with HA through zookeeper

2020-06-06 Thread Vijay Bhaskar
The job is in the "FAILED" state, and hence Flink HA removed the job graph from ZooKeeper along with the state. Note that we can't completely rely on Flink HA for state restoring: it only works as long as the job hasn't FAILED. If you want to recover the job even after a failure, you should do the following: a) Use the

Re: [DISCUSS] (Document) Backwards Compatibility of Savepoints

2020-06-06 Thread Steven Wu
> Why do we want to restore from the savepoint taken with the new Flink version instead of the previous savepoint, is that we want to minimize the source rewind? You are exactly right. E.g., a user upgraded to the new version for a few days and decided to roll back to the old version due to some

Re: Flink not restoring from checkpoint when job manager fails even with HA through zookeeper

2020-06-06 Thread Teng Fei Liao
It seems like the JobManager is treating this as a job failure. A FAILED JobStatus is a globally terminal state so everything gets deleted with zookeeper HA. https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/JobStatus.java#L39

Flink not restoring from checkpoint when job manager fails even with HA through zookeeper

2020-06-06 Thread Kathula, Sandeep
Hi, We are running Flink 1.9 in K8S. We used https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/jobmanager_high_availability.html to set up high availability. We have a single master. We set the max number of retries for a task to 2. After a task fails twice and then the job manager

[jira] [Created] (FLINK-18167) Flink Job hangs there when one vertex is failed and another is cancelled.

2020-06-06 Thread Jeff Zhang (Jira)
Jeff Zhang created FLINK-18167: -- Summary: Flink Job hangs there when one vertex is failed and another is cancelled. Key: FLINK-18167 URL: https://issues.apache.org/jira/browse/FLINK-18167 Project:

[jira] [Created] (FLINK-18166) JAVA_HOME is not read from .bashrc when start flink

2020-06-06 Thread appleyuchi (Jira)
appleyuchi created FLINK-18166: -- Summary: JAVA_HOME is not read from .bashrc when start flink Key: FLINK-18166 URL: https://issues.apache.org/jira/browse/FLINK-18166 Project: Flink Issue Type:

[jira] [Created] (FLINK-18165) When savingpoint is restored, select the checkpoint directory and stateBackend

2020-06-06 Thread Xinyuan Liu (Jira)
Xinyuan Liu created FLINK-18165: --- Summary: When savingpoint is restored, select the checkpoint directory and stateBackend Key: FLINK-18165 URL: https://issues.apache.org/jira/browse/FLINK-18165

Re: [VOTE] Apache Flink Stateful Functions 2.1.0, release candidate #1

2020-06-06 Thread Matt Wang
+1 (non-binding) - signatures & hash, ok - mvn clean install -Prun-e2e-tests on 1.8.0_77, ok - source archives do not contain any binaries, ok - versions in POM files and Dockerfiles are correct, ok --- Best, Matt Wang On 06/5/2020 16:58, Robert Metzger wrote: Thanks a lot for creating this

Re: Interest registration for Documentation task relating Table API & SQL

2020-06-06 Thread Marta Paes Moreira
Hi, Abdul. Thank you for reaching out and for your interest in Google Season of Docs (GSoD)! We really appreciate that you took the time to walk us through your experience, as well as your previous contributions to open source. Even though we are accepting applications from people with any