nandorKollar commented on a change in pull request #23855: [SPARK-26930][SQL] 
Tests in ParquetFilterSuite don't verify filter class
URL: https://github.com/apache/spark/pull/23855#discussion_r258880928
 
 

 ##########
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala
 ##########
 @@ -114,12 +114,24 @@ class ParquetFilterSuite extends QueryTest with 
ParquetTest with SharedSQLContex
             new SparkToParquetSchemaConverter(conf).convert(df.schema), pred)
           assert(maybeFilter.isDefined, s"Couldn't generate filter predicate 
for $pred")
           // Doesn't bother checking type parameters here (e.g. `Eq[Integer]`)
-          maybeFilter.exists(_.getClass === filterClass)
+          assert(flattenPredicateTree(maybeFilter.get).exists(_.getClass === 
filterClass))
         }
         checker(stripSparkFilter(query), expected)
     }
   }
 
+  private def flattenPredicateTree(filterPredicate: FilterPredicate): 
Seq[Object] = {
 
 Review comment:
   Ok, we can do this too, I like this idea. Actually after flattening the 
tree, it is easy to check for the entire predicate being pushed to Parquet: the 
tests should pass all predicates as a flattened tree as expected value in a 
sequence, and compare this with the flattened actual predicate tree. Most of 
the cases we are interested in only one predicate, so null filter rule just 
produces some noise in this case, however in cases like 
[this](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala#L182)
 it might provide some more value to check for the entire predicate tree, not 
just for the presence of 'or'.
   
   Anyway, this requires a lot more change though, I'm fine with your proposed 
solution!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to