[ https://issues.apache.org/jira/browse/SPARK-10978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-10978:
------------------------------------

    Assignee: Apache Spark

> Allow PrunedFilterScan to eliminate predicates from further evaluation
> ----------------------------------------------------------------------
>
>                 Key: SPARK-10978
>                 URL: https://issues.apache.org/jira/browse/SPARK-10978
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.3.0, 1.4.0, 1.5.0
>            Reporter: Russell Alexander Spitzer
>            Assignee: Apache Spark
>            Priority: Critical
>
> Currently PrunedFilterScan allows implementors to push down predicates to an
> underlying data source. This is done solely as an optimization, since the
> predicate is reapplied on the Spark side as well. This allows for
> bloom-filter-like operations but ends up doing a redundant scan for those
> sources which can perform accurate pushdowns.
> In addition, it makes it difficult for underlying sources to accept queries
> which reference non-existent columns in order to provide ancillary
> functionality. In our case we allow a Solr query to be passed in via a
> non-existent solr_query column. Since this column is not returned when Spark
> does a filter on "solr_query", nothing passes.
> Suggestion on the ML from [~marmbrus]:
> {quote}
> We have to try and maintain binary compatibility here, so probably the
> easiest thing to do here would be to add a method to the class. Perhaps
> something like:
> def unhandledFilters(filters: Array[Filter]): Array[Filter] = filters
> By default, this could return all filters so behavior would remain the same,
> but specific implementations could override it. There is still a chance that
> this would conflict with existing methods, but hopefully that would not be a
> problem in practice.
> {quote}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
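To illustrate the proposed contract, here is a minimal self-contained sketch of how a source with exact pushdown might override `unhandledFilters`. The `Filter` hierarchy below uses simplified stand-ins, not Spark's real `org.apache.spark.sql.sources` classes, and `ExactPushdownRelation` is a hypothetical relation name; the point is only the split between filters the source evaluates exactly (which Spark could then skip) and filters Spark must still re-check.

```scala
// Simplified stand-ins for Spark's Filter classes (hypothetical, for
// illustration only -- not the real org.apache.spark.sql.sources API).
sealed trait Filter
case class EqualTo(attribute: String, value: Any) extends Filter
case class GreaterThan(attribute: String, value: Any) extends Filter

// A hypothetical relation that evaluates equality predicates exactly in
// the underlying store, so Spark would not need to re-apply them.
class ExactPushdownRelation {
  // The default (returning all filters) preserves current behavior:
  // Spark re-applies everything. Overriding lets a source declare which
  // filters it handles exactly.
  def unhandledFilters(filters: Array[Filter]): Array[Filter] =
    filters.filterNot {
      case _: EqualTo => true  // handled exactly by the source; drop it
      case _          => false // anything else must be re-checked by Spark
    }
}

object UnhandledFiltersDemo extends App {
  val rel       = new ExactPushdownRelation
  val pushed    = Array[Filter](EqualTo("solr_query", "text:spark"),
                                GreaterThan("age", 21))
  val remaining = rel.unhandledFilters(pushed)
  // Only the GreaterThan filter remains for Spark-side evaluation.
  println(remaining.mkString(", "))
}
```

Under this scheme, a source like the Solr integration described above could also claim a filter on the synthetic solr_query column as handled, so Spark's re-evaluation would no longer filter out every row.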