yadavay-amzn commented on PR #56291:
URL: https://github.com/apache/spark/pull/56291#issuecomment-4626361610

   cc @cloud-fan @yaooqinn — this is a direct follow-up to SPARK-56546, would 
appreciate a look when convenient. The change extends the existing 
`SegmentTreeWindowFunctionFrame` to also handle shrinking frames (`... BETWEEN 
<lower> AND UNBOUNDED FOLLOWING`) by parameterizing it with `ubound: 
Option[BoundOrdering]` and a `fallbackFactory`; same eligibility gate, same 
memory accounting, same metrics. The benchmark numbers in the description show 
the algorithmic gap: 8.5× at N=5K, growing to 314× at N=50K, with the legacy 
O(N²) path becoming infeasible at N≥100K.
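
For readers unfamiliar with why the gap widens with N, here is a minimal Python sketch of the idea. It is an illustration only, not the Spark code. It assumes a `ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING` frame and an associative aggregate such as `min`. The names `SegmentTree`, `shrinking_frame_naive`, and `shrinking_frame_segtree` are hypothetical and do not appear in the PR.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


class SegmentTree:
    """Iterative segment tree over an associative combine function (illustrative only)."""

    def __init__(self, values: List[T], combine: Callable[[T, T], T]) -> None:
        self.n = len(values)
        self.combine = combine
        # Leaves live at indices n..2n-1; internal node i combines children 2i and 2i+1.
        self.tree = [None] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo: int, hi: int) -> T:
        """Aggregate values[lo:hi] in O(log n); requires 0 <= lo < hi <= n."""
        left = right = None
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:  # lo is a right child: take it and step past it
                left = self.tree[lo] if left is None else self.combine(left, self.tree[lo])
                lo += 1
            if hi & 1:  # hi - 1 is a left child: take it
                hi -= 1
                right = self.tree[hi] if right is None else self.combine(self.tree[hi], right)
            lo >>= 1
            hi >>= 1
        if left is None:
            return right
        if right is None:
            return left
        return self.combine(left, right)


def shrinking_frame_naive(values: List[T], combine: Callable[[T, T], T]) -> List[T]:
    """Re-aggregate the whole frame [i, n) for every row: O(n^2) combines."""
    out = []
    for i in range(len(values)):
        acc = values[i]
        for v in values[i + 1:]:
            acc = combine(acc, v)
        out.append(acc)
    return out


def shrinking_frame_segtree(values: List[T], combine: Callable[[T, T], T]) -> List[T]:
    """Build the tree once, then answer each frame [i, n) with one range query."""
    if not values:
        return []
    tree = SegmentTree(values, combine)
    n = len(values)
    return [tree.query(i, n) for i in range(n)]
```

Both functions return the same per-row results. The naive version performs on the order of N² combines across a partition, while the tree version performs on the order of N log N, which is the kind of gap the benchmark figures reflect.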
   
   Note: the fork-side CI's `pyspark-pandas` job has an MLflow doctest failure 
unrelated to this change (filesystem-backend deprecation in the installed 
mlflow). The relevant SQL / scalastyle / build matrices ran clean locally — 172 
tests pass across the new shrinking suite plus all pre-existing segtree and 
high-level window suites.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

