cloud-fan commented on code in PR #56291:
URL: https://github.com/apache/spark/pull/56291#discussion_r3377978562


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/SegmentTreeWindowFunctionFrame.scala:
##########
@@ -100,8 +110,11 @@ private[window] final class SegmentTreeWindowFunctionFrame(
 
   /**
    * Runtime dispatch flag: when `true`, `write()`, `currentLowerBound()`, and
-   * `currentUpperBound()` delegate to the wrapped 
[[SlidingWindowFunctionFrame]]
-   * (small-partition path). Set by `prepare()` based on partition size vs.
+   * `currentUpperBound()` delegate to the wrapped fallback frame produced by
+   * `fallbackFactory` (small-partition path). The fallback type is sliding-

Review Comment:
   Nit: "sliding-dependent" is the wrong term — the fallback type depends on 
the frame *shape* (moving vs shrinking), not on "sliding".
   ```suggestion
      * `fallbackFactory` (small-partition path). The fallback type is shape-
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowEvaluatorFactoryBase.scala:
##########
@@ -281,11 +281,52 @@ trait WindowEvaluatorFactoryBase {
 
           // Shrinking Frame.
           case ("AGGREGATE", frameType, lower, UnboundedFollowing, _) =>
-            target: InternalRow => {
-              new UnboundedFollowingWindowFunctionFrame(
-                target,
-                processor,
-                createBoundOrdering(frameType, lower, timeZone))
+            if (eligibleForSegTree(functions, aggFilters, frameType, conf)) {
+              val segFns = functions.map(_.asInstanceOf[DeclarativeAggregate])
+              // Shrinking frames touch the LRU only for one partial block at

Review Comment:
   Nit (comment accuracy): the value `Some(2)` is correct, but the stated 
reason isn't. This comment says shrinking frames "touch the LRU only for one 
partial block at the lower edge", but `WindowSegmentTree.query` also fetches 
the partition's **last** block through the LRU on every multi-block `[lower, 
n)` query — the right-partial `mergeBlockRange(bhi, 0, ...)` calls 
`ensureBlockLevels(bhi)` just like the lower-edge partial does. So the real 
reason 2 suffices is: 1 slot for the monotonically-advancing lower-edge block + 
1 for the always-resident last block (which stays hot because every query 
touches it). As written, the comment would mislead someone into tuning the hint 
down to 1, which would thrash — rebuilding the last block on every query. 
Suggest rewording the rationale to reflect the two distinct blocks the LRU must 
hold.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/SegmentTreeWindowFunctionFrame.scala:
##########
@@ -206,17 +231,20 @@ private[window] final class 
SegmentTreeWindowFunctionFrame(
   private def writeRow(index: Int, current: InternalRow): Unit = {
     var boundsChanged = index == 0
 
-    // admit loop: extend upperBound; if a candidate is already below the
-    // lower bound, advance lowerBound in lock-step to preserve invariant
-    // (0 <= lowerBound <= upperBound <= tree.size).
-    while (nextRow != null &&
-        ubound.compare(nextRow, upperBound, current, index) <= 0) {
-      if (lbound.compare(nextRow, lowerBound, current, index) < 0) {
-        lowerBound += 1
+    if (!shrinking) {

Review Comment:
   Nit (stale comment): the `writeRow`/`writeRange` header comment just above 
(lines 224-230) now governs both shapes but still describes only the sliding 
path. Two points: (1) it says the loops "run admit-then-drop", but the 
shrinking path is drop-only — there is no admit loop (this `if (!shrinking)` 
block is exactly what's skipped); (2) it says equivalence "is guarded by 
`SegmentTreeWindowFunctionSuite` ... against the Sliding baseline" — for the 
shrinking shape, equivalence is guarded by the new 
`UnboundedFollowingSegmentTreeSuite` against the 
`UnboundedFollowingWindowFunctionFrame` baseline. Worth extending the comment 
to cover both shapes so a future maintainer of the shrinking path isn't 
misdirected.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to