Thanks for the response David. I'm using Flink 1.13.5.

>> For point 1 the behavior you are seeing is what is expected.

Great. That's what I concluded after digging into things a little
more. This helps me be sure I just didn't miss some other
configuration. Thank you.

>> For point 2, I'm not sure.

Ok. It appears to be the path to the file named "metadata".

>> FWIW, I would urge you to use presto instead of hadoop for checkpointing on 
>> S3. The performance of the hadoop "filesystem" is problematic when it's used 
>> for checkpointing.

For sure, it's definitely on the list.
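For anyone else landing on this thread, my understanding of what the switch would look like (a sketch based on the Flink S3 plugin docs, not something I've deployed yet) is: drop the flink-s3-fs-presto jar from opt/ into its own plugins/ subdirectory and use the s3p:// scheme, which forces the presto filesystem for checkpoints even if the hadoop plugin is still loaded for other paths.

```yaml
# Sketch, assuming Flink 1.13.5 and the standard distribution layout:
#   mkdir plugins/s3-fs-presto
#   cp opt/flink-s3-fs-presto-1.13.5.jar plugins/s3-fs-presto/
# s3p:// pins the presto filesystem; plain s3:// is ambiguous when both
# the hadoop and presto plugins are on the classpath.
state.checkpoints.dir: s3p://my-bucket/_entropy_/my-job/checkpoints
s3.entropy.key: _entropy_
```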

On Thu, May 19, 2022 at 7:06 AM David Anderson <dander...@apache.org> wrote:
>
> Aeden,
>
> I want to expand my answer after having re-read your question a bit more 
> carefully.
>
> For point 1 the behavior you are seeing is what is expected. With hadoop the 
> metadata written by the job manager will literally include "_entropy_" in its 
> path, while this will be replaced in paths of any and all checkpoint data 
> files. With presto the metadata path won't include "_entropy_" at all (it 
> will disappear, rather than being replaced by something specific).
>
> For point 2, I'm not sure.
>
> David
>
> On Thu, May 19, 2022 at 2:37 PM David Anderson <da...@nosredna.org> wrote:
>>
>> This sounds like it could be FLINK-17359 [1]. What version of Flink are you 
>> using?
>>
>> Another likely explanation arises from the fact that only the checkpoint 
>> data files (the ones created and written by the task managers) will have the 
>> _entropy_ replaced. The job manager does not inject entropy into the path of 
>> the checkpoint metadata, so that it remains at a predictable URI. Since 
>> Flink only writes keyed state larger than state.storage.fs.memory-threshold 
>> into the checkpoint data files, and only those files have entropy injected 
>> into their paths, if all of your state is small it will all end up in the 
>> metadata file and you don't see any entropy injection happening. See the 
>> comments on [2] for more on this.
>>
>> FWIW, I would urge you to use presto instead of hadoop for checkpointing on 
>> S3. The performance of the hadoop "filesystem" is problematic when it's used 
>> for checkpointing.
>>
>> Regards,
>> David
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-17359
>> [2] https://issues.apache.org/jira/browse/FLINK-24878
>>
>> On Wed, May 18, 2022 at 7:48 PM Aeden Jameson <aeden.jame...@gmail.com> 
>> wrote:
>>>
>>> I have checkpoints setup against s3 using the hadoop plugin. (I'll
>>> migrate to presto at some point) I've setup entropy injection per the
>>> documentation with
>>>
>>> state.checkpoints.dir: s3://my-bucket/_entropy_/my-job/checkpoints
>>> s3.entropy.key: _entropy_
>>>
>>> I'm seeing some behavior that I don't quite understand.
>>>
>>> 1. The folder s3://my-bucket/_entropy_/my-job/checkpoints/...
>>> literally exists. Meaning that "_entropy_" has not been replaced. At
>>> the same time there are also a bunch of folders where "_entropy_" has
>>> been replaced. Is that to be expected? If so, would someone elaborate
>>> on why this is happening?
>>>
>>> 2. Should the paths in the checkpoints history tab in the Flink UI
>>> display the entropy key in the path? With the current setup they do not.
>>>
>>> Thanks,
>>> Aeden
>>>
>>> GitHub: https://github.com/aedenj
>>> Linked In: http://www.linkedin.com/in/aedenjameson



-- 
Cheers,
Aeden

GitHub: https://github.com/aedenj
Linked In: http://www.linkedin.com/in/aedenjameson
