cnauroth opened a new pull request, #26102:
URL: https://github.com/apache/flink/pull/26102
## What is the purpose of the change
Increase test coverage for GCS integration by providing integration tests
implemented from the common flink-hadoop-fs test suites. These optional tests
run only if the environment is configured for access to GCS.
## Brief change log
* Create JUnit extension `RequireGCSConfiguration`. The extension checks for
environment variables `GOOGLE_APPLICATION_CREDENTIALS` and `GCS_BASE_PATH`. If
these are not defined, then the tests are skipped.
* Create GCS subclasses of the suites defined in flink-hadoop-fs.
* Override and skip two tests in the subclasses. One isn't meaningful for GCS
and triggers a false failure. The other would need much more intrusive changes
to work on GCS, because it assumes that state is held in exactly one file,
which isn't true for the GCS implementation.
* Allow `RecoverableWriter` tests to return `null` from `getLocalTmpDir()`
to indicate no allocation/release of local storage is required.
* Add a new ArchUnit exception, because these integration tests can't
meaningfully use the mini-cluster extensions. Other file systems are handled
the same way.
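
The skip rule for `RequireGCSConfiguration` described in the change log above can be illustrated with a minimal plain-Java sketch. This is not the code from the PR: the class `GcsConfigurationCheck` and its methods are hypothetical names used only for illustration, and only the two environment variable names come from the description. In an actual JUnit 5 extension, a check like this would presumably be called from an execution condition rather than from `main`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not the PR's RequireGCSConfiguration. */
public class GcsConfigurationCheck {

    // Variable names taken from the change log; everything else is assumed.
    static final String CREDENTIALS_VAR = "GOOGLE_APPLICATION_CREDENTIALS";
    static final String BASE_PATH_VAR = "GCS_BASE_PATH";

    /** Returns the required variables that are not defined in the given environment. */
    static List<String> missingVariables(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : List.of(CREDENTIALS_VAR, BASE_PATH_VAR)) {
            if (env.get(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** The GCS tests should run only when both variables are defined. */
    static boolean shouldRun(Map<String, String> env) {
        return missingVariables(env).isEmpty();
    }

    public static void main(String[] args) {
        // Evaluate against the real process environment.
        List<String> missing = missingVariables(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("GCS tests enabled");
        } else {
            System.out.println("GCS tests skipped; missing: " + missing);
        }
    }
}
```

Passing the environment in as a `Map` keeps the decision testable without modifying the process environment.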
## Verifying this change
This change added tests and can be verified as follows:
The new integration tests are skipped when the configuration is not set, so
existing developer workflows are unaffected:
```
mvn -Pfast -pl flink-filesystems/flink-gs-fs-hadoop clean verify
...
[INFO] Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
...
```
With the configuration in place, all tests pass except the two that are
skipped:
```
export GOOGLE_APPLICATION_CREDENTIALS=/tmp/credentials.json
export GCS_BASE_PATH=gs://<BUCKET>/flink-tests
mvn -Pfast -pl flink-filesystems/flink-gs-fs-hadoop clean verify
...
[WARNING] Tests run: 35, Failures: 0, Errors: 0, Skipped: 2
...
```
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable