He-Pin opened a new pull request, #3008: URL: https://github.com/apache/pekko/pull/3008
### Motivation The migration from `sun.misc.Unsafe` to `VarHandle` (#1990, and #1894 for `Mailbox`) replaced the producer writes correctly (`compareAndSwapObject` → `compareAndSet`, `putOrderedObject` → `setRelease`) but silently **downgraded the corresponding reads** (and two writes) from volatile to plain access: - `Unsafe.getObjectVolatile` / `getIntVolatile` / `getLongVolatile` → `VarHandle.get` (plain) - `putObjectVolatile` / `putIntVolatile` → `VarHandle.set` (plain) `VarHandle.get` / `set` have **plain** memory semantics even when the field is declared `volatile`, so these accesses lost their happens-before guarantees with the concurrent `compareAndSet` / `setRelease` publications. On weakly-ordered hardware (AArch64) a reader can observe a stale value, and inside tight loops the plain read may be hoisted by the JIT. On x86-64 plain loads happen to have acquire semantics, which is why this largely went unnoticed. ### Modification Restore the original ordered semantics on every downgraded access: - **Mailbox**: `currentStatus` / `setStatus` and `systemQueueGet` (mailbox status and system-message queue head). - **ActorCell**: `mailbox` (Dispatch), `childrenRefs` / `functionRefs` reads and the `setTerminated` write (Children). - **RepointableActorRef**: `underlying` / `lookup` cell reads. - **CircuitBreaker**: `currentState` / `currentResetTimeout` reads. - **PromiseActorRef** (AskSupport): `state` / `watchedBy` reads and the `setState` write. - **MessageDispatcher**: `inhabitants` / `shutdownSchedule` reads. - **Artery Association**: `associationState` read. CAS-published fields use `getVolatile` / `setVolatile`, restoring the exact `getObjectVolatile` / `putObjectVolatile` semantics. For a load, `getVolatile` compiles to the same instruction as a plain load on x86-64 (`MOV`) and to `LDAR` on AArch64 — i.e. exactly what the original code emitted — so this restores the prior semantics at no extra cost on the read side. The two restored `setVolatile` writes carry a StoreLoad fence as they did before the migration. The lock-free node-queue spin reads (`AbstractNodeQueue` / `AbstractBoundedNodeQueue`) are handled in a separate PR, as they pair with release stores and use `getAcquire`. ### Result All concurrently-accessed fields that were volatile before the migration are volatile again, closing latent visibility races — most importantly the mailbox status / system-message queue and the actor cell/mailbox references. Method signatures are unchanged, so MiMa is unaffected. ### Tests - `sbt actor/compile remote/compile` succeed. ### References - https://github.com/apache/pekko/issues/2870 - https://github.com/apache/pekko/issues/2573 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
