RussellSpitzer opened a new pull request, #16841: URL: https://github.com/apache/iceberg/pull/16841
## Why Some test harnesses (and other host runtimes) build multiple `InMemoryCatalog` objects that should logically refer to the same catalog — for example, a runtime that constructs a separate validation catalog alongside its primary one, or that clones a session and rebuilds its catalog manager. With the existing per-instance state, those objects don't see each other's writes. This PR adds a way for two instances to share a single in-memory backing store within the JVM. ## What - Add catalog property `in-memory-catalog.instance-id`. When two `InMemoryCatalog` instances are initialized with the same value, they share namespaces, tables, views, and the synchronization monitor used for writes. When unset, each instance keeps a private store as before. - Move per-instance maps into a private `Storage` holder. Hold a `volatile Storage` reference so re-initialization is safe to publish. - Lock on the `Storage` instance instead of `this`, so all instances pointing at the same shared store serialize through the same monitor (previously each instance synchronized on its own monitor — a real bug for any shared-state usage). - `initialize()` is now idempotent: each call selects storage based on whether `INSTANCE_ID` is present. - `close()` clears the maps only when the instance owns a private store; for shared stores it leaves state intact so other instances continue to observe it. The new behavior is documented on `close()`. - Add a package-private `clearInstanceState(String)` (annotated `@VisibleForTesting`) for explicit shared-store cleanup. It is intentionally not part of the public surface. ## Tests New tests in `TestInMemoryCatalog`: - `instanceIdSharesNamespacesAndTablesAcrossInstances` — bidirectional sharing of namespaces and tables. - `instanceIdSharesViewsAcrossInstances` — bidirectional sharing of views. - `distinctInstanceIdsKeepStateIsolated` — different ids stay isolated. - `instancesWithoutInstanceIdKeepStateIsolated` — baseline; two catalogs with no id remain isolated. - `closeClearsPrivateStateButPreservesSharedState` — pins down the new `close()` semantics in both modes. - `clearInstanceStateRemovesSharedStore` — contract test for the test hook. - `clearInstanceStateIsNoOpForUnknownId` — clearing an unregistered id is safe. - `initializeIsIdempotentWhenInstanceIdRemoved` — re-initializing without an id switches to a fresh private store. - `concurrentInitializeWithSameInstanceIdSharesStorage` — racing initializers converge on a single shared `Storage`. The existing `CatalogTests` suite (run via `TestInMemoryCatalog`) is unchanged and continues to pass. ## Compatibility - API: only additions (one public property constant, no method signature changes). `revapi` is clean. - Behavior: `close()` semantics change is gated on the new property — instances that don't set `INSTANCE_ID` behave exactly as before. ## Validation - `./gradlew :iceberg-core:test` passes (full module suite, ~20 min) - `./gradlew :iceberg-core:revapi` clean - `./gradlew :iceberg-core:spotlessCheck` clean ## Follow-up A separate PR will use this property in the Spark v4.1 test base to share a backing store between Spark's catalog and the validation catalog (current draft: [#16839](https://github.com/apache/iceberg/pull/16839); will be rebased on this once it merges). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
