Hi Parag,

I am not so familiar with the setup you are using, but did you check out [1]? Maybe the parameter [--fromSavepoint /path/to/savepoint [--allowNonRestoredState]] is what you are looking for?
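Based on the entrypoint command you posted, that would amount to something like the sketch below. This is only an illustration, not a tested command for your cluster: the savepoint path is a placeholder you would replace with the path returned by your savepoint trigger, and I kept your existing -D options and placeholders as-is.

```shell
# Standalone application mode, restoring from a savepoint (see [1]).
# "s3://<bucket>/savepoints/savepoint-xxxx" is a placeholder -- use the
# path printed when the savepoint was created.
/docker-entrypoint.sh "standalone-job" \
    "-Ds3.access-key=${AWS_ACCESS_KEY_ID}" \
    "-Ds3.secret-key=${AWS_SECRET_ACCESS_KEY}" \
    "-Ds3.endpoint=${AWS_S3_ENDPOINT}" \
    "-Dhigh-availability.zookeeper.quorum=${ZOOKEEPER_CLUSTER}" \
    "--job-classname" "<class-name>" \
    "--fromSavepoint" "s3://<bucket>/savepoints/savepoint-xxxx" \
    "--allowNonRestoredState" \
    ${args}
```

Note that --allowNonRestoredState is optional; it is only needed if the new job graph no longer contains some operator state present in the savepoint.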
Best regards,
Nico

[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/docker/#application-mode-on-docker

On Tue, Oct 5, 2021 at 12:37 PM Parag Somani <somanipa...@gmail.com> wrote:
> Hello,
>
> We are currently using Apache Flink 1.12.0, deployed on a k8s 1.18 cluster
> with ZooKeeper for HA. Due to certain vulnerabilities in the container
> related to a few jars (like netty-*, meso), we are forced to upgrade.
>
> While upgrading Flink to 1.14.0, we faced an NPE:
> https://issues.apache.org/jira/browse/FLINK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17402570#comment-17402570
>
> To address it, I have followed these steps:
>
> 1. Savepoint creation
> 2. Stop the job
> 3. Restore from savepoint, which is where I am facing a challenge.
>
> For step #3 above, the documented way to restore from a savepoint is
> "bin/flink run -s :savepointPath [:runArgs]", which is mainly about
> restarting an uploaded jar file. As our application is based on k8s and
> running using Docker, I was not able to restore it. Because of that, the
> state of the variables in the accumulator got corrupted and I lost the
> data in one of our environments.
>
> My query is: what is the preferred way to restore from a savepoint if the
> application is running on k8s using Docker?
>
> We are using the following command to run the job manager:
>
> /docker-entrypoint.sh "standalone-job" "-Ds3.access-key=${AWS_ACCESS_KEY_ID}"
> "-Ds3.secret-key=${AWS_SECRET_ACCESS_KEY}" "-Ds3.endpoint=${AWS_S3_ENDPOINT}"
> "-Dhigh-availability.zookeeper.quorum=${ZOOKEEPER_CLUSTER}"
> "--job-classname" "<class-name>" ${args}
>
> Thank you in advance...!
>
> --
> Regards,
> Parag Surajmal Somani.