cloud-fan commented on a change in pull request #24068: [SPARK-27105][SQL] Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r282045720
##########
File path: sql/core/v2.3.4/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
##########
@@ -111,38 +111,73 @@ private[sql] object OrcFilters extends OrcFiltersBase {
case _ => value
}
+ /**
+ * A TrimmedFilter is a Filter that has been trimmed such that all the remaining nodes
+ * are convertible to ORC predicates.
+ *
+ * Since nothing in the underlying representation of the Filter is actually different from a
+ * regular Filter (the only difference is that we might remove some subtrees), this class is just
+ * a wrapper around a `Filter` value. The main benefits of using this class are readability
+ * and type safety (to signal that the respective functions only work with already trimmed
+ * filters).
+ *
+ * @param filter The underlying filter representation.
+ */
+ private case class TrimmedFilter(filter: Filter) extends AnyVal
Review comment:
hmm I feel this one is not very useful. We just wrap the trimmed filter with
it and then throw it away when building the ORC filter. I think we can add an
assert when building the ORC filter, to make sure the passed filter is
already trimmed.
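For illustration, a minimal Scala sketch of the assert-based alternative
suggested above. The helper `isTrimmed` and the entry point
`buildSearchArgument` are names assumed for this sketch, not the actual
methods in the PR:

```scala
import org.apache.spark.sql.sources.{And, Filter, Not, Or}

object OrcFilterSketch {
  // Hypothetical check: returns true only if every node in the filter tree
  // would be convertible to an ORC predicate. The real convertibility rules
  // live in the PR's builder; leaves are assumed convertible in this sketch.
  private def isTrimmed(filter: Filter): Boolean = filter match {
    case And(left, right) => isTrimmed(left) && isTrimmed(right)
    case Or(left, right)  => isTrimmed(left) && isTrimmed(right)
    case Not(child)       => isTrimmed(child)
    case _                => true
  }

  // Instead of wrapping the filter in a TrimmedFilter value class, fail fast
  // if a caller passes a filter that still contains non-convertible subtrees.
  def buildSearchArgument(filter: Filter): Unit = {
    assert(isTrimmed(filter), s"Filter has not been trimmed: $filter")
    // ... build the ORC SearchArgument from `filter` here ...
  }
}
```

This trades the compile-time signal of the wrapper type for a runtime check,
which is the trade-off the comment is pointing at.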
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]