## What is the purpose of the change

This pull request adds hooks that optionally inject entropy into the checkpoint 
path, based on a user-defined pattern in the configuration, so that checkpoint 
files spread across more S3 key prefixes and scale better.

This is a revised version of #6302 by @indrc (which was based on Flink 1.4). 
It ports those changes to the new FileStateStorage code introduced in Flink 1.5.

## Brief change log

  - Adds an optional `CheckpointPathFilter` that is applied to data and 
metadata paths before the files are created
  - Adds an implementation of that filter that replaces the entropy key in the 
path with entropy (data files) or removes the entropy key (metadata files)
  - Adds functionality to read the entropy pattern from the config, or to set it 
on the state backend, and to configure the `CheckpointPathFilter` accordingly
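
The path rewriting described in the change log can be sketched roughly as follows. This is an illustration only, not the code in this pull request: the class `EntropyPathSketch`, its method names, the `_entropy_` marker, the hex alphabet, and the use of plain strings instead of Flink's path type are all assumptions made for the example.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of entropy injection; not the PR's actual CheckpointPathFilter. */
final class EntropyPathSketch {

    // Assumed alphabet for the injected characters.
    private static final char[] ALPHABET = "0123456789abcdef".toCharArray();

    private final String entropyKey;  // marker in the configured path, e.g. "_entropy_" (assumed)
    private final int entropyLength;  // number of random characters to inject (assumed)

    public EntropyPathSketch(String entropyKey, int entropyLength) {
        this.entropyKey = entropyKey;
        this.entropyLength = entropyLength;
    }

    /** Data files: replace the entropy key with random characters. */
    public String forDataFile(String path) {
        return path.replace(entropyKey, randomEntropy());
    }

    /** Metadata files: remove the entropy key so the path stays predictable. */
    public String forMetadataFile(String path) {
        // Drop the key together with its trailing slash first to avoid leaving "//" behind.
        return path.replace(entropyKey + "/", "").replace(entropyKey, "");
    }

    private String randomEntropy() {
        StringBuilder sb = new StringBuilder(entropyLength);
        for (int i = 0; i < entropyLength; i++) {
            sb.append(ALPHABET[ThreadLocalRandom.current().nextInt(ALPHABET.length)]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        EntropyPathSketch filter = new EntropyPathSketch("_entropy_", 4);
        // The data path gets four random hex characters in place of the marker.
        System.out.println(filter.forDataFile("s3://bucket/_entropy_/checkpoints/chk-1/abc"));
        // The metadata path has the marker stripped.
        System.out.println(filter.forMetadataFile("s3://bucket/_entropy_/checkpoints/chk-1/_metadata"));
    }
}
```

In this sketch, metadata paths stay deterministic so that a checkpoint can still be located from its configured directory, while data files land under varying prefixes. A path that does not contain the marker is returned unchanged, which matches the feature being optional.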

## Verifying this change

  - Added unit tests under `org.apache.flink.runtime.state.filesystem` 

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **no**
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
  - The S3 file system connector: **no**

## Documentation

  - Does this pull request introduce a new feature? **yes**
  - If yes, how is the feature documented? **docs**


[ Full content available at: https://github.com/apache/flink/pull/6604 ]