HeartSaVioR commented on a change in pull request #35680:
URL: https://github.com/apache/spark/pull/35680#discussion_r836229369
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -4061,10 +4061,23 @@ object SessionWindowing extends Rule[LogicalPlan] {
case s: SessionWindow => sessionAttr
}
+ val filterTimeSize = gapDuration.child.dataType match {
Review comment:
I guess we can simplify this a bit more like below:
```
val filterByTimeRange = session.gapDuration match {
case Literal(interval: CalendarInterval, CalendarIntervalType) =>
interval == null || interval.months + interval.days +
interval.microseconds <= 0
case _ => true
}
```
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/DataFrameSessionWindowingSuite.scala
##########
@@ -495,4 +495,65 @@ class DataFrameSessionWindowingSuite extends QueryTest
with SharedSparkSession
validateWindowColumnInSchema(schema2, "session")
}
}
+
+ test("SPARK-38349: No need to filter events when gapDuration greater than
0") {
+ // negative value
Review comment:
Logic for verification can be deduplicated via adding inner-method in
this test like below:
```
def checkFilterCondition(
logicalPlan: LogicalPlan,
expectTimeRange: Boolean,
assertHintMsg: String): Unit = {
val filter = logicalPlan.find { plan =>
plan.isInstanceOf[Filter] && plan.children.head.isInstanceOf[Project]
}
assert(filter.isDefined)
val exist = filter.get.expressions.flatMap { expr =>
expr.collect { case gt: GreaterThan => gt }
}
if (expectTimeRange) {
assert(exist.nonEmpty, assertHintMsg)
} else {
assert(exist.isEmpty, assertHintMsg)
}
}
```
and call the method per case.
Let's use logicalPlan (`df.queryExecution.logical`) instead of optimizedPlan
which Spark can optimize the plan and may have changed the plan out of the
expectation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]