IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL]
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r288509968
##########
File path:
sql/core/v1.2.1/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
##########
@@ -63,55 +64,28 @@ private[sql] object OrcFilters extends OrcFiltersBase {
*/
   def createFilter(schema: StructType, filters: Seq[Filter]): Option[SearchArgument] = {
     val dataTypeMap = schema.map(f => f.name -> f.dataType).toMap
+    val orcFilterConverter = new OrcFilterConverter(dataTypeMap)
     for {
       // Combines all convertible filters using `And` to produce a single conjunction
       conjunction <- buildTree(convertibleFilters(schema, dataTypeMap, filters))
Review comment:
Changed. However, we still need to keep the `convertibleFilters` method
because `OrcScanBuilder` now uses it.
The right fix is probably to make `OrcFilters` a class with `schema` and
`filters` passed in as constructor parameters. However, that is an interface
change that would require updating roughly 30 call sites, so I don't want to
fold it into this PR. If you think it's a good idea, I'd be happy to open a
follow-up PR with that change; a rough sketch of the shape is below.
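
For illustration only, not code from this PR: a minimal sketch of the
suggested refactor, assuming `schema` and `filters` move into the
constructor so `dataTypeMap` is computed once per instance. The
zero-argument `createFilter()` signature and the placeholder body are
hypothetical; the real method would reuse the existing conversion logic
(`buildTree`, `convertibleFilters`, ...).

```scala
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument
import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.{DataType, StructType}

// Sketch only: OrcFilters as a class rather than an object, so helper
// methods no longer need `schema` and `dataTypeMap` threaded through
// every call site.
private[sql] class OrcFilters(schema: StructType, filters: Seq[Filter]) {

  // Built once per instance instead of on every createFilter call.
  private val dataTypeMap: Map[String, DataType] =
    schema.map(f => f.name -> f.dataType).toMap

  // Hypothetical zero-argument signature; the body here is a placeholder
  // standing in for the existing conversion logic.
  def createFilter(): Option[SearchArgument] = {
    None
  }
}
```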