[ https://issues.apache.org/jira/browse/SPARK-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991538#comment-14991538 ]
Vishesh Garg commented on SPARK-11390:
--------------------------------------

I was under the impression that this is just a plan tree presentation issue, and that the filter was indeed getting pushed by calling the *PrunedFilteredScan.buildScan()* method. However, now I'm not sure that's the case, because the internal plan structure also seems to suggest otherwise:

{noformat}
== Physical Plan ==
TungstenAggregate(key=[], functions=[(count(1),mode=Final,isDistinct=false)], output=[count#3L])
 TungstenExchange SinglePartition
  TungstenAggregate(key=[], functions=[(count(1),mode=Partial,isDistinct=false)], output=[currentCount#6L])
   Project
    Filter (age#1 < 15)
     Scan OrcRelation[hdfs://localhost:9000/user/spec/people][age#1]

Code Generation: true
{noformat}

Am I missing something here?

> Query plan with/without filterPushdown indistinguishable
> --------------------------------------------------------
>
>                 Key: SPARK-11390
>                 URL: https://issues.apache.org/jira/browse/SPARK-11390
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.1
>         Environment: All
>            Reporter: Vishesh Garg
>            Priority: Minor
>
> The execution plan of a query remains the same regardless of whether the
> filterPushdown flag has been set to "true" or "false", as can be seen below:
> ======
> scala> sqlContext.setConf("spark.sql.orc.filterPushdown", "false")
> scala> sqlContext.sql("SELECT name FROM people WHERE age = 15").explain()
> == Physical Plan ==
> Project [name#6]
>  Filter (age#7 = 15)
>   Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]
> scala> sqlContext.setConf("spark.sql.orc.filterPushdown", "true")
> scala> sqlContext.sql("SELECT name FROM people WHERE age = 15").explain()
> == Physical Plan ==
> Project [name#6]
>  Filter (age#7 = 15)
>   Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]
> ======
> Ideally, when the filterPushdown flag is set to "true", both the scan and the
> filter nodes should be merged together to make it clear that the filtering is
> being done by the data source itself.
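
The comparison in the issue description can be made mechanical. The following is a minimal sketch in plain Scala with no Spark dependency. It assumes the plan text has already been captured as a string, for example by copying the explain() output. PlanDiff and all of its members are hypothetical names used for illustration and are not part of Spark. The sketch strips expression IDs such as #6 or #3L, in case they differ between runs, and then checks whether the two plans list the same operators and whether a separate Filter operator is still present.

```scala
// Hypothetical helper, not part of Spark: compares two explain() outputs
// after stripping expression IDs such as "#6" or "#3L".
object PlanDiff {
  private val ExprId = "#\\d+L?".r

  // Normalize one plan: drop expression IDs, trim each line, drop blank lines.
  def normalize(plan: String): List[String] =
    plan.split("\n").toList
      .map(line => ExprId.replaceAllIn(line, "").trim)
      .filter(_.nonEmpty)

  // True when both plans list the same operators in the same order.
  def samePlan(a: String, b: String): Boolean =
    normalize(a) == normalize(b)

  // True when the plan still contains a separate Filter operator line.
  def hasStandaloneFilter(plan: String): Boolean =
    normalize(plan).exists(_.startsWith("Filter"))

  // Plan text copied from the issue description (filterPushdown = "false").
  val pushdownOff: String =
    """== Physical Plan ==
      |Project [name#6]
      | Filter (age#7 = 15)
      |  Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]""".stripMargin

  // Plan text copied from the issue description (filterPushdown = "true").
  val pushdownOn: String =
    """== Physical Plan ==
      |Project [name#6]
      | Filter (age#7 = 15)
      |  Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]""".stripMargin

  def main(args: Array[String]): Unit = {
    println(s"plans identical: ${samePlan(pushdownOff, pushdownOn)}")
    println(s"Filter still present with pushdown on: ${hasStandaloneFilter(pushdownOn)}")
  }
}
```

With the two plans from the description, both checks report true, which restates the problem: the plan text alone does not show whether the data source applied the filter. A check like this only inspects the printed plan and says nothing about what the ORC reader does at run time.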