He-Pin commented on code in PR #3035:
URL: https://github.com/apache/pekko/pull/3035#discussion_r3343514000


##########
stream/src/main/scala/org/apache/pekko/stream/stage/GraphStage.scala:
##########
@@ -275,6 +308,110 @@ object GraphStageLogic {
     type Receive = ((ActorRef, Any)) => Unit
   }
 
+  private object StageActor {
+    def localCell(ref: ActorRef, description: String): ActorCell =
+      ref match {
+        case ref: LocalActorRef       => ref.underlying
+        case ref: RepointableActorRef =>
+          ref.underlying match {
+            case cell: ActorCell => cell
+            case unknown         =>
+              throw new IllegalStateException(s"$description must be a local actor, was [${unknown.getClass.getName}]")
+          }
+        case unknown =>
+          throw new IllegalStateException(s"$description must be a local actor, was [${unknown.getClass.getName}]")
+      }
+
+    /**
+     * Reads `pekko.stream.materializer.stage-actor-drain-batch` from the materializer's ActorSystem config.
+     * Called once per lazy StageActor construction (never on the hot path). Bounded to `>= 1`.
+     */
+    def drainBatchSize(materializer: Materializer): Int =
+      Math.max(1, materializer.system.settings.config.getInt("pekko.stream.materializer.stage-actor-drain-batch"))
+
+    private final val SchedStateIdle: Int = 0
+    private final val SchedStateScheduled: Int = 1
+
+    /**
+     * Lazy-path dispatch: producers enqueue into a Vyukov MPSC queue and elect a single drain via
+     * IDLE -> SCHEDULED CAS; only the elected producer pays a mailbox enqueue. The drain runs on the
+     * interpreter thread, polls in a tight loop bounded by `drainBatchSize`, then either publishes IDLE
+     * (with a recheck for the publish-window race) or re-schedules another envelope to yield to other
+     * BoundaryEvents.
+     *
+     * JIT/GC notes:
+     *  - `final class` + monomorphic per-StageActor instance → JIT devirtualizes the apply at the
+     *    FunctionRef call site.
+     *  - Extends `AbstractNodeQueue` directly so the queue head atomic and the dispatch function share one
+     *    object (one allocation per StageActor, one fewer field deref on the producer hot path).
+     *  - All hot-path state is `private[this]` → direct field access, no accessor methods.
+     *  - `drainBatchSize` is read once into a stack-local at the top of `drain` so the JIT can treat the loop
+     *    bound as a constant.
+     *  - Per-tell allocation = 1 Node (`AbstractNodeQueue.Node`, ~24 bytes) + 1 Tuple2 (~24 bytes). The
+     *    Tuple2 is forced by the public `StageActorRef.Receive` type. No AsyncInput / Envelope per tell —
+     *    those are amortized across the batch.
+     */
+    // Not marked `private` so that `class StageActor`'s aux constructor (compiled outside of the companion
+    // object on Scala 3) can reference it; the enclosing `object StageActor` is itself private.
+    final class LazyDispatch(
+        interpreter: GraphInterpreter,
+        logic: GraphStageLogic,
+        handler: Any => Unit,
+        drainBatchSize: Int)
+        extends AbstractNodeQueue[(ActorRef, Any)]
+        with (((ActorRef, Any)) => Unit) {
+
+      // IDLE/SCHEDULED election state. AtomicInteger gives us volatile read + CAS without the cross-Scala
+      // VarHandle / field-updater access fuss; the wrapper costs one extra reference per StageActor, which
+      // is negligible against the per-tell mailbox traffic we are saving.
+      private[this] val state = new AtomicInteger(SchedStateIdle)
+
+      // Reused across all drain batches; allocated once at construction.
+      private[this] val drainCallback: Any => Unit = (_: Any) => drain()
+
+      override def apply(pair: (ActorRef, Any)): Unit = {
+        add(pair) // Vyukov producer path: getAndSet + release-store, no CAS spin
+        // Double-checked CAS: uncontended fast path is one volatile read; only the IDLE->SCHEDULED winner
+        // pays a CAS + mailbox push.
+        if (state.get() == SchedStateIdle && state.compareAndSet(SchedStateIdle, SchedStateScheduled))
+          scheduleDrain()
+      }

Review Comment:
   Fixed in da487e3c74: a producer-side guard was added in
   `LazyDispatch.apply`. It now pre-checks `interpreter.isStageCompleted(logic)`
   and drops the message if the stage has already completed, matching the
   original per-tell behaviour, where `runAsyncInput` silently ignored
   post-completion sends. It also re-checks after winning the IDLE→SCHEDULED
   CAS and resets the state if completion landed in the race window. This
   bounds any leak to messages enqueued before completion becomes visible to
   the producer thread.
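
The guard described above can be modelled outside of Pekko roughly as follows. This is a simplified sketch, not the code from the commit: `GuardedDispatch`, `isCompleted` (standing in for `interpreter.isStageCompleted(logic)`) and `schedule` (standing in for the mailbox push) are hypothetical names, and a `ConcurrentLinkedQueue` replaces the Vyukov MPSC queue so the example is self-contained.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import java.util.concurrent.atomic.AtomicInteger

// Simplified model of the guarded producer path; NOT the Pekko implementation.
// `isCompleted` and `schedule` are hypothetical stand-ins (see lead-in).
final class GuardedDispatch[A <: AnyRef](
    isCompleted: () => Boolean,
    schedule: Runnable => Unit,
    handler: A => Unit,
    drainBatchSize: Int)
    extends (A => Unit) {

  private val Idle = 0
  private val Scheduled = 1

  // Assumption: ConcurrentLinkedQueue instead of AbstractNodeQueue.
  private val queue = new ConcurrentLinkedQueue[A]()
  private val state = new AtomicInteger(Idle)
  private val drainTask: Runnable = () => drain()

  override def apply(msg: A): Unit =
    // Pre-check: drop sends once completion is visible to this producer.
    if (!isCompleted()) {
      queue.add(msg)
      if (state.get() == Idle && state.compareAndSet(Idle, Scheduled)) {
        // Re-check after winning the election: completion may have landed
        // in the race window. Reset the state instead of scheduling; the
        // messages already enqueued stay behind (the bounded leak).
        if (isCompleted()) state.set(Idle)
        else schedule(drainTask)
      }
    }

  private def drain(): Unit = {
    var remaining = drainBatchSize
    while (remaining > 0) {
      val next = queue.poll()
      if (next == null) remaining = 0
      else { handler(next); remaining -= 1 }
    }
    if (queue.isEmpty) {
      state.set(Idle)
      // Publish-window recheck: a producer may have enqueued while the
      // state was still Scheduled and therefore skipped the election.
      if (!queue.isEmpty && state.compareAndSet(Idle, Scheduled)) schedule(drainTask)
    } else schedule(drainTask) // batch exhausted: yield, then continue
  }
}
```

In this model a send after completion costs one `isCompleted()` call and allocates nothing, and the second check only runs for the producer that wins the election.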



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

