IvanVergiliev commented on a change in pull request #24068: [SPARK-27105][SQL]
Optimize away exponential complexity in ORC predicate conversion
URL: https://github.com/apache/spark/pull/24068#discussion_r288509968
##########
File path:
sql/core/v1.2.1/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
##########
@@ -63,55 +64,28 @@ private[sql] object OrcFilters extends OrcFiltersBase {
*/
   def createFilter(schema: StructType, filters: Seq[Filter]): Option[SearchArgument] = {
     val dataTypeMap = schema.map(f => f.name -> f.dataType).toMap
+    val orcFilterConverter = new OrcFilterConverter(dataTypeMap)
     for {
       // Combines all convertible filters using `And` to produce a single conjunction
       conjunction <- buildTree(convertibleFilters(schema, dataTypeMap, filters))
Review comment:
Changed. However, we still need to keep the `convertibleFilters` method
because `OrcScanBuilder` now uses it.
The right fix is probably to make `OrcFilters` a class with `schema` and
`filters` passed in as constructor parameters. However, that is an interface
change that would require updating roughly 30 call sites, so I don't want to
fold it into this PR. If you think it's a good idea, I'd be happy to open a
follow-up PR with that change; a rough sketch of the shape is below.
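
For illustration only, not code from this PR: a minimal sketch of the
suggested refactor, assuming `schema` and `filters` move into the
constructor so `dataTypeMap` is computed once per instance. The
zero-argument `createFilter()` signature and the placeholder body are
hypothetical; the real method would reuse the existing conversion logic
(`buildTree`, `convertibleFilters`, ...).

```scala
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument
import org.apache.spark.sql.sources.Filter
import org.apache.spark.sql.types.{DataType, StructType}

// Sketch only: OrcFilters as a class rather than an object, so helper
// methods no longer need `schema` and `dataTypeMap` threaded through
// every call site.
private[sql] class OrcFilters(schema: StructType, filters: Seq[Filter]) {

  // Built once per instance instead of on every createFilter call.
  private val dataTypeMap: Map[String, DataType] =
    schema.map(f => f.name -> f.dataType).toMap

  // Hypothetical zero-argument signature; the body here is a placeholder
  // standing in for the existing conversion logic.
  def createFilter(): Option[SearchArgument] = {
    None
  }
}
```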