nandorKollar commented on a change in pull request #23855: [SPARK-26930][SQL]
Tests in ParquetFilterSuite don't verify filter class
URL: https://github.com/apache/spark/pull/23855#discussion_r258880928
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
##########
@@ -114,12 +114,24 @@ class ParquetFilterSuite extends QueryTest with
ParquetTest with SharedSQLContex
new SparkToParquetSchemaConverter(conf).convert(df.schema), pred)
assert(maybeFilter.isDefined, s"Couldn't generate filter predicate
for $pred")
// Doesn't bother checking type parameters here (e.g. `Eq[Integer]`)
- maybeFilter.exists(_.getClass === filterClass)
+ assert(flattenPredicateTree(maybeFilter.get).exists(_.getClass ===
filterClass))
}
checker(stripSparkFilter(query), expected)
}
}
+ private def flattenPredicateTree(filterPredicate: FilterPredicate):
Seq[Object] = {
Review comment:
Ok, we can do this too, I like this idea. Actually after flattening the
tree, it is easy to check for the entire predicate being pushed to Parquet: the
tests should pass all predicates as a flattened tree as expected value in a
sequence, and compare this with the flattened actual predicate tree. Most of
the cases we are interested in only one predicate, so null filter rule just
produces some noise in this case, however in cases like
[this](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala#L182)
it might provide some more value to check for the entire predicate tree, not
just for the presence of 'or'.
Anyway, this requires a lot more change though, I'm fine with your proposed
solution!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]