MartijnVisser opened a new pull request, #28286: URL: https://github.com/apache/flink/pull/28286
## What is the purpose of the change

`flink-gs-fs-hadoop` bundled `google-cloud-storage` 2.29.1, which throws a `NullPointerException` instead of retrying certain GCS `503 Service Unavailable` errors during resumable uploads. This breaks checkpointing for jobs that write to `gs://` through a `RecoverableWriter` (e.g. the `FileSink`). The upstream fix is in [googleapis/java-storage#2987](https://github.com/googleapis/java-storage/pull/2987), which is included in newer releases of the library.

This PR takes over the stale #27679 (thanks @jonchase) and completes it by also regenerating the bundled-dependency `NOTICE`, which is required because this module shades all of its dependencies into the plugin jar.

## Brief change log

- Bump `google-cloud-storage` 2.29.1 -> 2.68.0 (latest stable) and the matching grpc artifacts 1.59.1 -> 1.81.0 in `flink-gs-fs-hadoop/pom.xml`.
- Regenerate `META-INF/NOTICE` to match the bundled dependency set produced by the shade plugin (verified to match exactly via `tools/ci/license_check.sh`).
- Add the bundled license file for the newly bundled `org.codehaus.woodstox:stax2-api`.
- Update the `google-cloud-storage` version link in the GCS filesystem docs (English + Chinese).

## Verifying this change

This change is covered by the existing `flink-gs-fs-hadoop` tests (236 tests pass, including `GSRecoverableWriterTest`, the committer/serializer tests, and the `LocalStorageHelper`-based `GSBlobStorageImplTest`, confirming the pinned `google-cloud-nio` test dependency stays compatible).

In addition, the license/NOTICE was validated locally with `tools/ci/license_check.sh` (no severe issues; the `NOTICE` matches the 98 bundled dependencies exactly) and the `dependency-convergence` and `ban-unsafe-jackson` enforcers pass under `-Pcheck-convergence`.
Finally, the upgraded SDK was manually verified against a real GCS bucket: a `RecoverableWriter` wrote a multi-chunk object, took a checkpoint via `persist()`, recovered from that checkpoint via `recover()`, committed, and the committed object was read back and asserted byte-for-byte (exactly-once), with no NPE/503 failure.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): **yes** (upgrades `google-cloud-storage` and grpc, with corresponding `NOTICE`/license updates)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: **yes** (improves reliability of GCS `RecoverableWriter` checkpoint/recovery by fixing 503 retry handling)
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes (please specify the tool below)

Generated-by: Claude Code (Opus 4.8)
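For readers unfamiliar with the write, `persist()`, `recover()`, commit sequence exercised in the manual verification, the following is a minimal in-memory sketch of the semantics that check relies on. It is a toy model under stated assumptions, not Flink's `RecoverableWriter` API or the GCS client: `RecoverableWriteSketch`, `Store`, `Checkpoint`, and `Stream` are hypothetical names invented for illustration, and the "bucket" is a plain map.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of a recoverable write. All names are hypothetical, not Flink's API. */
public class RecoverableWriteSketch {

    /** Stand-in for the bucket: an object is visible only after commit(). */
    static final class Store {
        private final Map<String, byte[]> committed = new HashMap<>();

        byte[] read(String name) {
            return committed.get(name);
        }
    }

    /** Stand-in for a checkpoint handle: the bytes that were durable when persist() ran. */
    static final class Checkpoint {
        final String name;
        final byte[] durableBytes;

        Checkpoint(String name, byte[] durableBytes) {
            this.name = name;
            this.durableBytes = durableBytes;
        }
    }

    /** Stand-in for a recoverable output stream. */
    static final class Stream {
        private final Store store;
        private final String name;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        Stream(Store store, String name) {
            this.store = store;
            this.name = name;
        }

        /** Resume from a checkpoint: only bytes captured by persist() survive. */
        static Stream recover(Store store, Checkpoint checkpoint) {
            Stream resumed = new Stream(store, checkpoint.name);
            resumed.write(checkpoint.durableBytes);
            return resumed;
        }

        void write(byte[] data) {
            buffer.write(data, 0, data.length);
        }

        /** Snapshot the bytes written so far without publishing the object. */
        Checkpoint persist() {
            return new Checkpoint(name, buffer.toByteArray());
        }

        /** Publish the object so readers can see it. */
        void commit() {
            store.committed.put(name, buffer.toByteArray());
        }
    }

    public static void main(String[] args) {
        Store store = new Store();

        Stream first = new Stream(store, "part-0");
        first.write("chunk-1|".getBytes(StandardCharsets.UTF_8));
        Checkpoint checkpoint = first.persist();
        // Written after the checkpoint; the first writer then "fails" and is abandoned.
        first.write("uncheckpointed|".getBytes(StandardCharsets.UTF_8));

        Stream resumed = Stream.recover(store, checkpoint);
        resumed.write("chunk-2".getBytes(StandardCharsets.UTF_8));
        resumed.commit();

        // Prints "chunk-1|chunk-2": the uncheckpointed bytes are not in the committed object.
        System.out.println(new String(store.read("part-0"), StandardCharsets.UTF_8));
    }
}
```

The byte-for-byte assertion in the manual check corresponds to comparing the committed object against the checkpointed bytes plus whatever was written after recovery. The real test additionally exercises multi-chunk resumable uploads against GCS, which this sketch does not model.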
