kecookier commented on code in PR #7168:
URL: https://github.com/apache/incubator-gluten/pull/7168#discussion_r1751145852
##########
gluten-substrait/src/main/scala/org/apache/gluten/expression/WindowFunctionsBuilder.scala:
##########
@@ -31,7 +31,7 @@ object WindowFunctionsBuilder {
val substraitFunc = windowFunc match {
// Handle lag with negative inputOffset, e.g., converts lag(c1, -1) to lead(c1, 1).
// Spark uses `-inputOffset` as `offset` for Lag function.
- case lag: Lag if lag.offset.eval(EmptyRow).asInstanceOf[Int] > 0 =>
Review Comment:
@PHILO-HE This PR fixes the lag result mismatch I see with my current
version of Spark. Thank you for your input. I have reviewed the Spark code and
now understand the difference.
In Spark 3.0, the Lag function calculates the frame bound from both the offset
and the direction. In versions after Spark 3.1, the function no longer
considers the direction. Instead, it calculates the bound from a single literal
expression that already carries the signed offset relative to the current row.
For example, in lag() the input offset is wrapped as UnaryMinus(offset).
Lag in Spark 3.0:
https://github.com/apache/spark/blob/branch-3.0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala#L474
Lag in Spark 3.2:
https://github.com/apache/spark/blob/branch-3.2/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala#L538
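To make the difference concrete, here is a minimal sketch of the two conventions described above. It does not use Spark: `Lag30`, `LagNew`, `bound`, and `rewrite` are hypothetical stand-ins that model the offsets as plain Ints, and `rewrite` only illustrates the kind of check this PR's guard performs, assuming the stored offset is already negated.

```scala
object LagOffsetSketch {
  // Simplified stand-ins for illustration only; these are NOT Spark classes.
  sealed trait Direction
  case object Ascending extends Direction
  case object Descending extends Direction

  // Spark 3.0 style: the offset is stored as given, and a separate
  // direction decides the sign of the frame bound.
  final case class Lag30(offset: Int) {
    val direction: Direction = Descending
    def bound: Int = direction match {
      case Ascending  => offset
      case Descending => -offset
    }
  }

  // Newer style: no direction. The stored offset is already negated,
  // mirroring UnaryMinus(inputOffset), and is used as the bound directly.
  final case class LagNew(inputOffset: Int) {
    val offset: Int = -inputOffset
    def bound: Int = offset
  }

  // Hypothetical rewrite: a positive stored offset means lag looks
  // forward, so it can be expressed as lead with that offset.
  def rewrite(lag: LagNew): String =
    if (lag.offset > 0) s"lead(c1, ${lag.offset})"
    else s"lag(c1, ${lag.inputOffset})"
}
```

Under these assumptions both styles produce the same bound for lag(c1, 1), but only the newer style exposes the negated value through `offset`, so `rewrite(LagNew(-1))` yields `lead(c1, 1)` while `rewrite(LagNew(1))` stays `lag(c1, 1)`.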
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.