cloud-fan commented on code in PR #56291:
URL: https://github.com/apache/spark/pull/56291#discussion_r3377978562
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/SegmentTreeWindowFunctionFrame.scala:
##########
@@ -100,8 +110,11 @@ private[window] final class SegmentTreeWindowFunctionFrame(
/**
* Runtime dispatch flag: when `true`, `write()`, `currentLowerBound()`, and
- * `currentUpperBound()` delegate to the wrapped
[[SlidingWindowFunctionFrame]]
- * (small-partition path). Set by `prepare()` based on partition size vs.
+ * `currentUpperBound()` delegate to the wrapped fallback frame produced by
+ * `fallbackFactory` (small-partition path). The fallback type is sliding-
Review Comment:
Nit: "sliding-dependent" is the wrong term — the fallback type depends on
the frame *shape* (moving vs shrinking), not on "sliding".
```suggestion
* `fallbackFactory` (small-partition path). The fallback type is shape-
```
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowEvaluatorFactoryBase.scala:
##########
@@ -281,11 +281,52 @@ trait WindowEvaluatorFactoryBase {
// Shrinking Frame.
case ("AGGREGATE", frameType, lower, UnboundedFollowing, _) =>
- target: InternalRow => {
- new UnboundedFollowingWindowFunctionFrame(
- target,
- processor,
- createBoundOrdering(frameType, lower, timeZone))
+ if (eligibleForSegTree(functions, aggFilters, frameType, conf)) {
+ val segFns = functions.map(_.asInstanceOf[DeclarativeAggregate])
+ // Shrinking frames touch the LRU only for one partial block at
Review Comment:
Nit (comment accuracy): the value `Some(2)` is correct, but the stated
reason isn't. This comment says shrinking frames "touch the LRU only for one
partial block at the lower edge", but `WindowSegmentTree.query` also fetches
the partition's **last** block through the LRU on every multi-block `[lower,
n)` query — the right-partial `mergeBlockRange(bhi, 0, ...)` calls
`ensureBlockLevels(bhi)` just like the lower-edge partial does. So the real
reason 2 suffices is: 1 slot for the monotonically-advancing lower-edge block +
1 for the always-resident last block (which stays hot because every query
touches it). As written, the comment would mislead someone into tuning the hint
down to 1, which would thrash — rebuilding the last block on every query.
Suggest rewording the rationale to reflect the two distinct blocks the LRU must
hold.
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/window/SegmentTreeWindowFunctionFrame.scala:
##########
@@ -206,17 +231,20 @@ private[window] final class
SegmentTreeWindowFunctionFrame(
private def writeRow(index: Int, current: InternalRow): Unit = {
var boundsChanged = index == 0
- // admit loop: extend upperBound; if a candidate is already below the
- // lower bound, advance lowerBound in lock-step to preserve invariant
- // (0 <= lowerBound <= upperBound <= tree.size).
- while (nextRow != null &&
- ubound.compare(nextRow, upperBound, current, index) <= 0) {
- if (lbound.compare(nextRow, lowerBound, current, index) < 0) {
- lowerBound += 1
+ if (!shrinking) {
Review Comment:
Nit (stale comment): the `writeRow`/`writeRange` header comment just above
(lines 224-230) now governs both shapes but still describes only the sliding
path. Two points: (1) it says the loops "run admit-then-drop", but the
shrinking path is drop-only — there is no admit loop (this `if (!shrinking)`
block is exactly what's skipped); (2) it says equivalence "is guarded by
`SegmentTreeWindowFunctionSuite` ... against the Sliding baseline" — for the
shrinking shape, equivalence is guarded by the new
`UnboundedFollowingSegmentTreeSuite` against the
`UnboundedFollowingWindowFunctionFrame` baseline. Worth extending the comment
to cover both shapes so a future maintainer of the shrinking path isn't
misdirected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]