Rishabh Daim created OAK-12241:
----------------------------------

             Summary: run S3/GCP integration tests with S3Mock emulator in CI
                 Key: OAK-12241
                 URL: https://issues.apache.org/jira/browse/OAK-12241
             Project: Jackrabbit Oak
          Issue Type: Task
          Components: blob-cloud, blob-cloud-gcp
            Reporter: Rishabh Daim
            Assignee: Rishabh Daim


h2. Summary

Integration tests in {{oak-blob-cloud}} require real AWS or GCP credentials
({{aws.properties}}) and are skipped in Apache CI because secrets cannot be
committed. This issue adds a local S3 emulator (Adobe S3Mock via 
Testcontainers) so existing integration tests run in CI without cloud 
credentials, in both *S3* and *GCP* storage modes.

h2. Background

* Most {{oak-blob-cloud}} integration tests gate on
{{S3DataStoreUtils.isS3Configured()}} and skip when
{{accessKey}}/{{secretKey}} are empty.
* Azure blob tests already use {{AzuriteDockerRule}} + Testcontainers
({{oak-blob-cloud-azure}}): same pattern, no secrets in CI.
* AWS does not provide an official S3 emulator (unlike Azurite for Azure).
* Oak *GCP mode* uses the AWS S3 SDK against GCS S3 interoperability — not the 
native GCS client — so an S3-compatible emulator covers both modes with 
different config.
* Detailed implementation plan:
{{oak-blob-cloud/docs/S3-GCP-EMULATOR-TESTS-PLAN.md}}
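
The credential gate from the first bullet can be pictured roughly as below. This is a minimal sketch, not the actual {{S3DataStoreUtils}} code: the class {{CredentialGateSketch}} and the method {{isConfigured}} are hypothetical names, and only the {{accessKey}}/{{secretKey}} property names come from this issue.

```java
import java.util.Properties;

// Hypothetical illustration of a credential gate; not the real Oak utility.
public class CredentialGateSketch {

    /** Returns true only when both credential properties are present and non-blank. */
    public static boolean isConfigured(Properties props) {
        return hasText(props.getProperty("accessKey"))
            && hasText(props.getProperty("secretKey"));
    }

    private static boolean hasText(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        Properties empty = new Properties();
        // With no credentials, a test using this gate would skip itself.
        System.out.println("configured without credentials: " + isConfigured(empty));

        Properties filled = new Properties();
        filled.setProperty("accessKey", "placeholder-access-key");
        filled.setProperty("secretKey", "placeholder-secret-key");
        System.out.println("configured with credentials: " + isConfigured(filled));
    }
}
```

In a JUnit 4 test, a gate like this would typically feed {{Assume.assumeTrue(...)}}, which is why unconfigured CI runs report the tests as skipped rather than failed.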

h2. Proposed solution

# Add Adobe S3Mock 5.x ({{com.adobe.testing:s3mock-testcontainers}}) +
Testcontainers to {{oak-blob-cloud}}.
# Introduce {{S3MockRule}} and {{S3EmulatorSupport}} (mirror
{{AzuriteDockerRule}}): lazy-start Docker container, skip tests if Docker is
unavailable.
# Wire emulator config into {{S3DataStoreUtils.getS3Config()}}:
#* Real credentials in {{aws.properties}} / {{-Ds3.config}} take precedence
(manual cloud testing unchanged).
#* Otherwise use emulator properties when S3Mock is available.
# Run existing test classes in *both* modes via two Maven Surefire executions
({{-Ds3.test.mode=S3}} and {{-Ds3.test.mode=GCP}}), with no duplicate test
classes.
# Small production change: opt-in {{pathStyleAccess}} property (default
{{false}}) for S3Mock path-style URLs; GCP mode already enables path-style.
# Test-only fix: support {{http://}} presigned URLs in
{{S3DataStoreUtils.getHttpsConnection()}} for direct-access tests.
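
The last step (accepting {{http://}} presigned URLs) could look something like the sketch below. It is an assumption about the shape of the fix, not the actual change to {{S3DataStoreUtils.getHttpsConnection()}}: the class {{PresignedUrlConnectionSketch}} and the method {{open}} are hypothetical. It relies on the fact that {{HttpsURLConnection}} is a subclass of {{HttpURLConnection}}, so one return type can serve both schemes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: opens a connection for an http or https presigned URL.
public class PresignedUrlConnectionSketch {

    public static HttpURLConnection open(URI presigned) {
        String scheme = presigned.getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
        try {
            // openConnection() only creates the connection object; nothing is
            // sent until connect() is called or a stream is requested.
            return (HttpURLConnection) presigned.toURL().openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL standing in for an emulator-issued presigned URL.
        HttpURLConnection conn = open(URI.create("http://localhost:9090/bucket/key"));
        System.out.println(conn.getURL().getProtocol());
    }
}
```

Returning the common supertype keeps callers unchanged for real cloud endpoints, which still yield an {{HttpsURLConnection}} instance at runtime.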

h2. Test tiers (after change)

|| Tier || When || Purpose ||
| Unit | Always | Mock/no-network tests (unchanged) |
| Emulator | CI / local Docker, no credentials | S3Mock, in S3 and GCP mode |
| Real cloud | Populated {{aws.properties}} | Optional manual validation |
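
The Emulator and Real cloud rows amount to a small decision, sketched below under stated assumptions. The class {{TestTierSketch}}, its enums, and both methods are hypothetical. The property names {{accessKey}}, {{secretKey}} and {{s3.test.mode}} come from this issue, while defaulting to S3 mode when {{s3.test.mode}} is unset is an assumption made only for the sketch.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical illustration of how integration tests pick a tier and a mode.
public class TestTierSketch {

    public enum Tier { REAL_CLOUD, EMULATOR, SKIPPED }

    public enum Mode { S3, GCP }

    /** Real credentials win; otherwise use the emulator if Docker is usable. */
    public static Tier selectTier(Properties awsProps, boolean dockerAvailable) {
        if (hasText(awsProps.getProperty("accessKey"))
                && hasText(awsProps.getProperty("secretKey"))) {
            return Tier.REAL_CLOUD;
        }
        return dockerAvailable ? Tier.EMULATOR : Tier.SKIPPED;
    }

    /** Parses the value of -Ds3.test.mode; S3 as the default is an assumption. */
    public static Mode parseMode(String value) {
        if (!hasText(value)) {
            return Mode.S3;
        }
        return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    private static boolean hasText(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        Mode mode = parseMode(System.getProperty("s3.test.mode"));
        Tier tier = selectTier(new Properties(), true);
        System.out.println(tier + " in " + mode + " mode");
    }
}
```

The Unit tier is not modelled here because it runs unconditionally; {{SKIPPED}} only describes integration tests when neither credentials nor Docker are present.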

h2. Scope

*In scope:*
* {{TestS3Ds}} and subclasses (cache, small cache)
* {{TestS3DataStore}}
* {{S3DataRecordAccessProviderTest}} / {{S3DataRecordAccessProviderIT}}
* {{S3DataStoreServiceTest}}
* SSE subclasses: triage; KMS-with-key / SSE-C remain cloud-only via existing 
skip logic

*Out of scope:*
* {{oak-it}}, {{oak-jcr}}, {{oak-upgrade}} S3 tests
* Migrating {{oak-segment-aws}} findify S3Mock to Adobe S3Mock
* fake-gcs-server / native GCS client testing

h2. Acceptance criteria

* {{mvn test -pl oak-blob-cloud}} passes with Docker and *no* cloud
credentials.
* Integration tests that currently skip run under the emulator (S3 mode).
* Same tests run again under GCP mode (second Surefire execution).
* Populated real {{aws.properties}} still takes precedence.
* No secrets in repo or CI.
* Production behavior unchanged unless {{pathStyleAccess=true}} is explicitly
set.
* Unit tests pass without Docker.

h2. References

* Azure precedent: {{oak-blob-cloud-azure/.../AzuriteDockerRule.java}}
* S3Mock: https://github.com/adobe/S3Mock
* Plan doc: {{oak-blob-cloud/docs/S3-GCP-EMULATOR-TESTS-PLAN.md}}



