Thanks for the response David. I'm using Flink 1.13.5.

>> For point 1 the behavior you are seeing is what is expected.

Great. That's what I concluded after digging into things a little more.
This helps me be sure I just didn't miss some other configuration. Thank you.

>> For point 2, I'm not sure.

Ok, it appears to be the path to the file named "metadata".

>> FWIW, I would urge you to use presto instead of hadoop for checkpointing on
>> S3. The performance of the hadoop "filesystem" is problematic when it's used
>> for checkpointing.

For sure, it's definitely on the list.

On Thu, May 19, 2022 at 7:06 AM David Anderson <dander...@apache.org> wrote:
>
> Aeden,
>
> I want to expand my answer after having re-read your question a bit more
> carefully.
>
> For point 1 the behavior you are seeing is what is expected. With hadoop the
> metadata written by the job manager will literally include "_entropy_" in its
> path, while this will be replaced in the paths of any and all checkpoint data
> files. With presto the metadata path won't include "_entropy_" at all (it
> will disappear, rather than being replaced by something specific).
>
> For point 2, I'm not sure.
>
> David
>
> On Thu, May 19, 2022 at 2:37 PM David Anderson <da...@nosredna.org> wrote:
>>
>> This sounds like it could be FLINK-17359 [1]. What version of Flink are you
>> using?
>>
>> Another likely explanation arises from the fact that only the checkpoint
>> data files (the ones created and written by the task managers) will have
>> the _entropy_ replaced. The job manager does not inject entropy into the
>> path of the checkpoint metadata, so that it remains at a predictable URI.
>> Since Flink only writes keyed state larger than
>> state.storage.fs.memory-threshold into the checkpoint data files, and only
>> those files have entropy injected into their paths, if all of your state is
>> small it will all end up in the metadata file and you won't see any entropy
>> injection happening. See the comments on [2] for more on this.
>>
>> FWIW, I would urge you to use presto instead of hadoop for checkpointing on
>> S3. The performance of the hadoop "filesystem" is problematic when it's
>> used for checkpointing.
>>
>> Regards,
>> David
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-17359
>> [2] https://issues.apache.org/jira/browse/FLINK-24878
>>
>> On Wed, May 18, 2022 at 7:48 PM Aeden Jameson <aeden.jame...@gmail.com>
>> wrote:
>>>
>>> I have checkpoints set up against S3 using the hadoop plugin. (I'll
>>> migrate to presto at some point.) I've set up entropy injection per the
>>> documentation with
>>>
>>> state.checkpoints.dir: s3://my-bucket/_entropy_/my-job/checkpoints
>>> s3.entropy.key: _entropy_
>>>
>>> I'm seeing some behavior that I don't quite understand.
>>>
>>> 1. The folder s3://my-bucket/_entropy_/my-job/checkpoints/...
>>> literally exists. Meaning that "_entropy_" has not been replaced. At
>>> the same time there are also a bunch of folders where "_entropy_" has
>>> been replaced. Is that to be expected? If so, would someone elaborate
>>> on why this is happening?
>>>
>>> 2. Should the paths in the checkpoints history tab in the Flink UI
>>> display the path with the entropy key? With the current setup it does not.
>>>
>>> Thanks,
>>> Aeden
>>>
>>> GitHub: https://github.com/aedenj
>>> Linked In: http://www.linkedin.com/in/aedenjameson

--
Cheers,
Aeden

GitHub: https://github.com/aedenj
Linked In: http://www.linkedin.com/in/aedenjameson
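[Editorial note for readers landing on this thread later: the settings being discussed can be collected into one flink-conf.yaml sketch. The bucket and job names are the placeholders from Aeden's original message, and the threshold value shown is only illustrative of the mechanism David describes; check the docs for your Flink version before relying on it.]

```yaml
# Entropy injection: every occurrence of the entropy key in the
# checkpoint path is replaced with random characters -- but only in
# the paths of checkpoint data files written by the task managers.
state.checkpoints.dir: s3://my-bucket/_entropy_/my-job/checkpoints
s3.entropy.key: _entropy_

# Keyed state smaller than this threshold is embedded directly in the
# job manager's metadata file, whose path keeps the literal key (hadoop)
# or drops it (presto) rather than getting entropy injected. If all of
# your state is below this size, no entropy-substituted paths appear.
state.storage.fs.memory-threshold: 20kb  # illustrative value
```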