Thank you for the clarification.

On Thu, Jun 21, 2018 at 1:36 PM sihua zhou <summerle...@163.com> wrote:

> Yes, you can clear the state for a key (the currently active key). If you
> clear it, you have also removed it from the state backend, and future
> checkpoints won't contain the key anymore unless you add it again.
>
> Best, Sihua
>
> On 06/21/2018 16:04, Garvit Sharma <garvit...@gmail.com> wrote:
>
> Now, after clearing state for a key, I don't want that redundant data in
> the state backend. This is my concern.
>
> Please let me know if there are any gaps.
>
> Thanks,
>
> On Thu, Jun 21, 2018 at 1:31 PM Garvit Sharma <garvit...@gmail.com> wrote:
>
>> I am maintaining state data for a key in ValueState. As per [0] I can
>> clear() state for that key.
>>
>> [0]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/stream/state/state.html
>>
>> Please let me know.
>>
>> Thanks,
>>
>> On Thu, Jun 21, 2018 at 1:19 PM sihua zhou <summerle...@163.com> wrote:
>>
>>> Hi Garvit,
>>>
>>> Let's say you clear the state at timestamp t1. The checkpoints
>>> completed before t1 will still contain the data you cleared, but
>>> future checkpoints won't contain it. I'm not sure what you mean by
>>> "the cleared state", though: currently you can only clear a key-value
>>> pair of the state, you can't clear the whole state.
>>>
>>> Best, Sihua
>>>
>>> On 06/21/2018 15:41, Garvit Sharma <garvit...@gmail.com> wrote:
>>>
>>> So, would it delete all the files in HDFS associated with the cleared
>>> state?
>>>
>>> On Thu, Jun 21, 2018 at 12:58 PM sihua zhou <summerle...@163.com> wrote:
>>>
>>>> Hi Garvit,
>>>>
>>>> > Now, let's say, we clear the state. Would the state data be removed
>>>> > from HDFS too?
>>>>
>>>> The state data would not be removed from HDFS immediately when you
>>>> clear the state in your job. But after you clear the state in your
>>>> job, checkpoints completed later won't contain that state any more.
>>>>
>>>> > How does Flink manage to clear the state data from state backend on
>>>> > clearing the keyed state?
>>>>
>>>> 1. You can use {{state.checkpoints.num-retained}} to set the number
>>>> of completed checkpoints retained on HDFS.
>>>> 2. If you set
>>>> {{env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION)}}
>>>> then the checkpoints on HDFS will be removed once your job is
>>>> finished (or cancelled). If you set
>>>> {{env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION)}}
>>>> then the checkpoints will be retained.
>>>>
>>>> Please refer to
>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/checkpoints.html
>>>> for more information.
>>>>
>>>> Additionally, I'd like to give a brief overview of the checkpoints on
>>>> HDFS. In a nutshell, whatever you do with the state in your running
>>>> job only affects the content of the local state backend. When
>>>> checkpointing, Flink takes a snapshot of the local state backend and
>>>> sends it to the checkpoint target directory (in your case, HDFS). The
>>>> checkpoints on HDFS are like periodic snapshots of the state backend
>>>> of your job: they can be created or deleted, but never changed. Maybe
>>>> Stefan (cc) could give you more expert information, and please
>>>> correct me if I'm wrong.
>>>>
>>>> Best, Sihua
>>>>
>>>> On 06/21/2018 14:40, Garvit Sharma <garvit...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Consider a managed keyed state backed by HDFS with checkpointing
>>>> enabled. Now, as the state grows the state data will be saved on HDFS.
>>>>
>>>> Now, let's say, we clear the state. Would the state data be removed
>>>> from HDFS too?
>>>>
>>>> How does Flink manage to clear the state data from state backend on
>>>> clearing the keyed state?
>>>>
>>>> --
>>>>
>>>> Garvit Sharma
>>>> github.com/garvitlnmiit/
>>>>
>>>> No Body is a Scholar by birth, its only hard work and strong
>>>> determination that makes him master.
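
To make the snapshot behaviour described in this thread concrete, here is a small toy model in plain Java. It is only a sketch and not Flink code: the class CheckpointModel and all of its methods are hypothetical names invented for illustration, the "checkpoints" are in-memory copies rather than files on HDFS, and the retention limit merely imitates what state.checkpoints.num-retained is described as doing above. It shows that clearing a key changes only the local state, that checkpoints already completed still hold the key, and that the key disappears from storage once those older checkpoints are dropped.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the behaviour discussed in the thread. NOT Flink code. */
final class CheckpointModel {
    // Live keyed state held by the (hypothetical) local state backend.
    private final Map<String, Long> localState = new HashMap<>();
    // Completed checkpoints, oldest first; each is an immutable snapshot.
    private final Deque<Map<String, Long>> retained = new ArrayDeque<>();
    // Assumed stand-in for the number of retained completed checkpoints.
    private final int numRetained;

    CheckpointModel(int numRetained) {
        if (numRetained < 1) {
            throw new IllegalArgumentException("numRetained must be >= 1");
        }
        this.numRetained = numRetained;
    }

    void update(String key, long value) {
        localState.put(key, value);
    }

    /** Models clearing the state of one key: only the local state changes. */
    void clear(String key) {
        localState.remove(key);
    }

    /** Snapshots the local state, then drops checkpoints beyond the limit. */
    void checkpoint() {
        retained.addLast(Collections.unmodifiableMap(new HashMap<>(localState)));
        while (retained.size() > numRetained) {
            retained.removeFirst(); // old snapshots are deleted, never edited
        }
    }

    /** True if any retained checkpoint still holds data for the key. */
    boolean anyCheckpointContains(String key) {
        for (Map<String, Long> snapshot : retained) {
            if (snapshot.containsKey(key)) {
                return true;
            }
        }
        return false;
    }

    /** True if the most recent checkpoint holds data for the key. */
    boolean latestCheckpointContains(String key) {
        return !retained.isEmpty() && retained.peekLast().containsKey(key);
    }

    int retainedCount() {
        return retained.size();
    }

    public static void main(String[] args) {
        CheckpointModel job = new CheckpointModel(2);
        job.update("user-1", 42L);
        job.checkpoint();      // checkpoint 1 holds user-1
        job.clear("user-1");   // affects the local state only
        System.out.println(job.anyCheckpointContains("user-1"));    // true
        job.checkpoint();      // checkpoint 2 no longer holds user-1
        System.out.println(job.latestCheckpointContains("user-1")); // false
        job.checkpoint();      // checkpoint 3 evicts checkpoint 1
        System.out.println(job.anyCheckpointContains("user-1"));    // false
    }
}
```

With a retention limit of 2 in this model, the cleared key is still present in storage after the first post-clear checkpoint and gone after the second, which matches the "not removed immediately" answer above.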