seropian commented on PR #2815:
URL: https://github.com/apache/jackrabbit-oak/pull/2815#issuecomment-4214105686
# OAK-12164: Azure V8/V12 SDK Isolation — Summary
| | |
|---|---|
| **Root Cause** | OAK-11267 (V8→V12 migration) changed V8 upload constants
— `MAX_MULTIPART_UPLOAD_PART_SIZE` jumped from 100 MiB to 4000 MiB, causing DAM
archive OOM in production |
| **Fix Strategy** | Structurally isolate V8 and V12 code into separate
subpackages so constants, types, and behavior can never cross-contaminate |
| **Scope** | `oak-blob-cloud-azure` + consumers (`oak-run-commons`,
`oak-jcr`, `oak-run`) |
| **Impact** | 46 files changed, ~33 new test infrastructure files |
---
## What Changed and Why
### BLOCKER — Root Cause Fix + Structural Isolation (Phases 1–3)
V8 and V12 each own their constants in self-contained classes
(`AzureConstantsV8` / `AzureConstantsV12`). The shared `AzureConstants` retains
only config property name strings and the V12 toggle flag — no runtime
constants. **Key value restored:** V8 `MAX_MULTIPART_UPLOAD_PART_SIZE` = 100
MiB (was incorrectly 4000 MiB).
All V12 code moved into `blobstorage/v12/` with `V12` suffix. V8 code lives
in `blobstorage/v8/`. The parent `blobstorage/` package contains only
SDK-neutral shared code. V8 backend files are rebuilt from the known-good
1.90.0 source — only structural changes allowed (package/class rename,
visibility widening). **One behavioral change permitted:** proxy bug fix in
`UtilsV8.setProxyIfNeeded()`.
```
src/main/java/.../blobstorage/ ← SDK-neutral (no Azure
SDK imports)
├── AbstractAzureBlobStoreBackend.java shared base class
for V8/V12 backends
├── AbstractAzureDataStoreService.java OSGi DS base
├── AzureBlobContainer.java SDK-neutral
container interface
├── AzureBlobContainers.java factory (routes to
v8/v12)
├── AzureConstants.java config keys + toggle
only
├── AzureDataStore.java dispatches via
AzureSdkVersion
├── AzureDataStoreService.java OSGi DS component
├── AzureSdkVersion.java enum {V8,V12} — sole
selection point
│
├── v8/ ← com.microsoft.azure.*
only
│ ├── AzureBlobContainerProviderV8.java V8 container client
builder
│ ├── AzureBlobContainerV8.java AzureBlobContainer →
CloudBlobContainer
│ ├── AzureBlobStoreBackendV8.java rebuilt from 1.90.0
source
│ ├── AzureConstantsV8.java 1.90.0 values,
self-contained
│ └── UtilsV8.java V8 config reading +
proxy
│
└── v12/ ← com.azure.* only
├── AzureBlobContainerProviderV12.java V12 container
client builder
├── AzureBlobContainerV12.java AzureBlobContainer
→ BlobContainerClient
├── AzureBlobStoreBackendV12.java V12 backend
├── AzureConstantsV12.java V12 values,
self-contained
├── AzureHttpRequestLoggingPolicyV12.java request logging
policy
├── BlobSasHeadersV12.java SAS header wrapper
└── UtilsV12.java V12 config reading
+ proxy
src/test/java/.../blobstorage/ ← shared / cross-SDK
tests
├── AzureBlobStoreBackendCompatibilityTest.java unit:
direct-download/upload cache config setters
├── AzureConstantsTest.java unit: V8+V12
constant value guards, cross-contamination checks
├── AzureDataStoreIT.java Azurite IT:
parameterized {V8,V12} × 60 DataStore tests
├── AzureDataStoreUtils.java helper: Azure
config loading, container cleanup
├── AzureDataStoreVersionSelectionIT.java Azurite IT: SDK
selection via property/sysprop, startup logging
├── AzureIsolationTest.java unit: source-scan
enforcing v8/v12 import boundaries
├── AzuriteDockerRule.java infra: JUnit rule
managing Azurite Docker container lifecycle
│
├── v8/ ← V8 tests
│ ├── AzureBlobContainerProviderV8AuthenticationTest.java unit: auth
method priority (key, SAS, service principal)
│ ├── AzureBlobContainerProviderV8BuilderTest.java unit:
builder pattern, property initialization
│ ├── AzureBlobContainerProviderV8ComprehensiveTest.java unit: token
refresh with expiring tokens
│ ├── AzureBlobContainerProviderV8ContainerOperationsIT.java Azurite IT:
create/delete/exists container ops
│ ├── AzureBlobContainerProviderV8ErrorConditionsTest.java unit: error
handling edge cases
│ ├── AzureBlobContainerProviderV8HeaderManagementTest.java unit:
SharedAccessBlobHeaders handling
│ ├── AzureBlobContainerProviderV8IT.java Azurite IT:
end-to-end V8 provider tests
│ ├── AzureBlobContainerProviderV8SasGenerationTest.java unit: SAS
token generation and signatures
│ ├── AzureBlobContainerProviderV8TokenManagementTest.java unit: token
refresh, validation, lifecycle
│ ├── AzureBlobStoreBackendV8IT.java Azurite IT:
backend ops + CSO reproduction tests
│ └── UtilsV8Test.java unit:
connection string construction
│
└── v12/ ← V12 tests
├── AzureBlobContainerProviderV12IT.java Azurite IT:
V12 builder and auth flows
├── AzureBlobStoreBackendV12IT.java Azurite IT:
backend ops + CSO prevention tests
├── AzureDataRecordAccessProviderCDNV12Test.java live-Azure:
CDN domain override for direct access
├── AzureDataRecordAccessProviderV12IT.java live-Azure:
direct access with large blobs
├── AzureDataRecordAccessProviderV12Test.java live-Azure:
presigned URL generation and access
├── AzureHttpRequestLoggingPolicyV12Test.java unit: HTTP
request/response logging
├── TestAzureDSV12.java Azurite IT:
DataStore with local cache on
├── TestAzureDSWithSmallCacheV12.java Azurite IT:
DataStore with small cache
├── TestAzureDsCacheOffV12.java Azurite IT:
DataStore with cache disabled
└── UtilsV12Test.java unit:
connection string and config utilities
```
### CRITICAL — SDK Selection + Consumer Decoupling + Enforcement (Phases 4–6)
`AzureSdkVersion.resolve(Properties)` is the **only** place that interprets
`blob.azure.v12.enabled`. All callers (DataStore, factory, tests) dispatch
through it.
`AzureBlobContainer` is an SDK-neutral interface. Consumer modules
(`oak-run-commons`, `oak-jcr`, `oak-run`) now use
`AzureBlobContainers.create(props)` instead of raw `CloudBlobContainer` /
`BlobContainerClient` types. Zero Azure SDK storage imports in consumer code.
Isolation rules enforced at build time by `AzureIsolationTest`
(source-scanning unit test):
- `v8/` must not import `com.azure.storage.*` or reference
`blobstorage.v12.*`
- `v12/` must not import `com.microsoft.azure.*` or reference
`blobstorage.v8.*`
- Parent `blobstorage/` must not import any SDK storage types
### MAJOR — Tests, Consumer Migration, Dual Infrastructure (Phases 7–11)
V12 tests relocated to `v12/` package. CSO prevention tests added (constant
guards, behavioral upload tests, cross-contamination checks). Consumer modules
migrated to `AzureBlobContainer`. `AzureDataStoreAzuriteIT` parameterized over
`{V8, V12}` — 60 tests x 2 SDKs = 120 tests. Dual test infrastructure (abstract
base + `*AzureTest` / `*AzuriteIT` variants) for all integration tests.
### V12 Bug Fixes Discovered During Implementation
| Bug | Fix |
|-----|-----|
| `blockSize=0` crash on empty files | `Math.max(1, Math.min(len, MAX))` |
| `getMetadataRecord` missing null/empty validation | Added
`IllegalArgumentException` |
| `uploadBlob` wrong exception chain | Unwrap `UncheckedIOException` |
| `getAllIdentifiers` NPE on blobs without dashes | `.filter()` before
`.map()` |
| Service principal not cached (new client per call) | Double-checked
locking cache |
| `MIN_MULTIPART_UPLOAD_PART_SIZE` = 256 KiB (too small) | Corrected to 4
MiB |
---
## Test Strategy
### CSO Reproduction Test
`AzureBlobStoreBackendV8IT.testV8InitiateHttpUpload_1GiB_MaxPart100MiB` is
the direct reproduction of the OOM incident. It calls `initiateHttpUpload()`
with a 1 GiB file on the V8 backend and asserts that the upload is split into
multiple 100 MiB parts. Before this fix, V8 used V12's 4000 MiB constant, so
the same call would request a single 1 GiB part — the allocation pattern that
caused the production OOM. A symmetric test exists on V12
(`AzureBlobStoreBackendV12IT`) asserting its own correct part sizes.
A 40 GiB variant (`testV8InitiateHttpUpload_40GiB_UnlimitedURIs` /
`testV12InitiateHttpUpload_40GiB_CorrectURIs`) exercises the realistic DAM
archive size that triggered the production incident. V8 asserts `ceil(40 GiB /
10 MiB) = 4096` parts; V12 asserts `ceil(40 GiB / 4 MiB) = 10240` parts. Both
confirm the upload is split into thousands of manageable parts rather than a
handful of multi-GiB allocations.
### CSO Prevention (4 layers)
| Layer | What it catches |
|-------|-----------------|
| Constant value guards | Wrong values assigned to constants (11 assertions)
|
| Behavioral upload tests | `initiateHttpUpload()` produces wrong part sizes
(5 tests, incl. direct CSO reproduction above) |
| Cross-contamination checks | V8 using V12 values or vice versa (2 tests) |
| Source isolation | Wrong SDK imported in wrong package
(`AzureIsolationTest`) |
### Dual Test Infrastructure
Every integration test has two variants via abstract base classes:
| Suffix | Runner | Environment | Credentials |
|--------|--------|-------------|-------------|
| `*AzureTest` | Surefire (`mvn test`) | Live Azure | Required (skips via
`Assume` if absent) |
| `*AzuriteIT` | Failsafe (`mvn verify -PintegrationTesting`) | Azurite
(Docker) | None needed |
`AzureDataStoreAzuriteIT` is parameterized over `{V8, V12}` — 60 tests x 2
SDKs = 120 tests.
### Tests That Remain Live-Azure Only
`AzureDataRecordAccessProviderV12IT/Test` and `CDN` variant require HTTPS
presigned URIs or real CDN — deferred to Part 2.
---
## Critical Constraints
| # | Constraint | How It's Enforced |
|---|-----------|-------------------|
| 1 | V8 = 1.90.0 behavior | Diff gate against 1.90.0 source |
| 2 | V8/V12 constants self-contained | `AzureIsolationTest` + grep |
| 3 | No cross-package SDK imports | `AzureIsolationTest` (build-time) |
| 4 | No Azure SDK types in consumers | Grep verification |
| 5 | Single SDK selection point | `AzureSdkVersion.resolve()` only |
| 6 | CSO regression caught | 4-layer test coverage |
---
## Deferred to Part 2
- Azurite variants for presigned-URI tests (requires Azurite HTTPS support)
- ~400 lines of code duplication refactor (risky during active bug-fix)
- `throws Exception` cleanup on `AzureBlobContainer` interface
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]