So, would it delete all the files in HDFS associated with the cleared state?
On Thu, Jun 21, 2018 at 12:58 PM sihua zhou <summerle...@163.com> wrote:

> Hi Garvit,
>
> > Now, let's say, we clear the state. Would the state data be removed from
> > HDFS too?
>
> The state data would not be removed from HDFS immediately if you clear
> the state in your job. But after you clear the state in your job, later
> completed checkpoints won't contain that state any more.
>
> > How does Flink manage to clear the state data from state backend on
> > clearing the keyed state?
>
> 1. You can use {{state.checkpoints.num-retained}} to set the number of
>    completed checkpoints maintained on HDFS.
> 2. If you set
>    {{env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION)}}
>    then the checkpoints on HDFS will be removed once your job is finished
>    (or cancelled). And if you set
>    {{env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)}}
>    then the checkpoints will be retained.
>
> Please refer to
> https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/checkpoints.html
> to find more information.
>
> Additionally, I'd like to give a brief overview of the checkpoints on
> HDFS. In a nutshell, whatever you do with the state in your running job
> only affects the content of the local state backend. When checkpointing,
> Flink takes a snapshot of the local state backend and sends it to the
> checkpoint target directory (in your case, HDFS). The checkpoints on HDFS
> are like periodic snapshots of the state backend of your job; they can be
> created or deleted but never changed. Maybe Stefan (cc) could give you
> more professional information, and please correct me if I'm wrong.
>
> Best, Sihua
>
> On 06/21/2018 14:40, Garvit Sharma <garvit...@gmail.com> wrote:
>
> > Hi,
> >
> > Consider a managed keyed state backed by HDFS with checkpointing enabled.
> > Now, as the state grows the state data will be saved on HDFS.
> >
> > Now, let's say, we clear the state. Would the state data be removed from
> > HDFS too?
> >
> > How does Flink manage to clear the state data from state backend on
> > clearing the keyed state?
> >
> > --
> > Garvit Sharma
> > github.com/garvitlnmiit/
> >
> > No Body is a Scholar by birth, its only hard work and strong determination
> > that makes him master.

--
Garvit Sharma
github.com/garvitlnmiit/

No Body is a Scholar by birth, its only hard work and strong determination that makes him master.
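
The snapshot behaviour described in the quoted reply (clearing state changes only the local state backend, while checkpoints are snapshots that are created or deleted but never edited) can be sketched with a toy model. This is plain Java, not Flink code: the class CheckpointModel and all of its methods are hypothetical names invented for illustration, and the retention rule is an assumption that mirrors what {{state.checkpoints.num-retained}} is described as doing.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names, not Flink's implementation).
final class CheckpointModel {
    // Stands in for the local state backend of a running job.
    private final Map<String, Long> localState = new HashMap<>();
    // Stands in for the completed checkpoints kept in the target directory.
    private final Deque<Map<String, Long>> retained = new ArrayDeque<>();
    // Assumed analogue of state.checkpoints.num-retained.
    private final int numRetained;

    CheckpointModel(int numRetained) {
        if (numRetained < 1) {
            throw new IllegalArgumentException("numRetained must be >= 1");
        }
        this.numRetained = numRetained;
    }

    void put(String key, long value) {
        localState.put(key, value);
    }

    // Clearing touches only the local state; existing snapshots are untouched.
    void clear(String key) {
        localState.remove(key);
    }

    // Take an immutable copy of the local state, then drop the oldest
    // snapshots until at most numRetained remain.
    void checkpoint() {
        retained.addLast(Collections.unmodifiableMap(new HashMap<>(localState)));
        while (retained.size() > numRetained) {
            retained.removeFirst();
        }
    }

    boolean anyRetainedCheckpointContains(String key) {
        for (Map<String, Long> snapshot : retained) {
            if (snapshot.containsKey(key)) {
                return true;
            }
        }
        return false;
    }

    int retainedCount() {
        return retained.size();
    }

    public static void main(String[] args) {
        CheckpointModel job = new CheckpointModel(1);
        job.put("user-42", 7L);
        job.checkpoint();
        job.clear("user-42");
        // The older snapshot still holds the cleared key.
        System.out.println(job.anyRetainedCheckpointContains("user-42")); // true
        job.checkpoint();
        // The new snapshot lacks the key and the old one has been dropped.
        System.out.println(job.anyRetainedCheckpointContains("user-42")); // false
    }
}
```

In this model the cleared key disappears from the checkpoint directory only once a later checkpoint completes and the older snapshot that still held the key falls outside the retention limit, which matches the "not removed immediately" answer in the quoted reply.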