[ https://issues.apache.org/jira/browse/DRILL-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978120#comment-16978120 ]

Paul Rogers commented on DRILL-7451:
------------------------------------

It appears that the actual behavior is a bit more complex. Run the same test as above, with the same query, but now mark the plugin as *not* supporting projection push-down. In this case we get two projects. This suggests that the project above is added for a different reason, but it is still trivial and should be removed.

Logical plan with scan project push-down disabled:

{code:json}
  "graph" : [ {
    "pop" : "DummyGroupScan",
    "@id" : 3,
    "columns" : [ "`**`" ],
    "userName" : "progers",
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 2,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 3,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 2,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  } ]
{code}

> Planner inserts project node even if scan handles project push-down
> -------------------------------------------------------------------
>
>                 Key: DRILL-7451
>                 URL: https://issues.apache.org/jira/browse/DRILL-7451
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Priority: Minor
>
> I created a "dummy" storage plugin for testing. The test does a simple query:
> {code:sql}
> SELECT a, b, c FROM dummy.myTable
> {code}
> The first test is to mark the plugin's group scan as supporting projection
> push-down. However, Drill still creates a projection node in the logical plan:
> {code:json}
>   "graph" : [ {
>     "pop" : "DummyGroupScan",
>     "@id" : 2,
>     "columns" : [ "`**`" ],
>     "userName" : "progers",
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`a`",
>       "expr" : "`a`"
>     }, {
>       "ref" : "`b`",
>       "expr" : "`b`"
>     }, {
>       "ref" : "`c`",
>       "expr" : "`c`"
>     } ],
>     "child" : 2,
>     "outputProj" : true,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   } ]
> {code}
> There is [a comment in the
> code|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java#L109]
> that suggests the project should be removed:
> {code:java}
> // project above scan may be removed in ProjectRemoveRule for
> // the case when it is trivial
> {code}
> As shown in the example, the project is trivial. There is a subtlety: it may
> be that the scan, unknown to the planner, produces additional columns, say
> {{d}} and {{e}}, which the project operator is needed to remove.
> If this is the reason the project remains, perhaps we can add a flag of some
> kind with which the group scan can assert that it not only handles
> projection but also will not insert additional columns. At that point, the
> project is completely unnecessary in this case.
> This is not a functional bug, just a performance issue: we exercise the
> machinery of the project operator to do exactly nothing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
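
The "trivial project" condition discussed in the issue can be sketched in a few lines. This is only an illustration under stated assumptions, not Drill's planner code: {{NamedExpr}}, {{isIdentity}}, {{isRemovable}}, and the {{scanEmitsOnlyRequested}} flag are hypothetical names standing in for the project's {{exprs}} list and for the proposed "no extra columns" guarantee from the group scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names exist in Drill.
class TrivialProjectSketch {

    /** Stand-in for one entry of a project's "exprs" list (ref = output name, expr = input). */
    public static final class NamedExpr {
        final String ref;
        final String expr;

        public NamedExpr(String ref, String expr) {
            this.ref = ref;
            this.expr = expr;
        }
    }

    /** True when every expression passes a column through under the same name. */
    public static boolean isIdentity(List<NamedExpr> exprs) {
        for (NamedExpr e : exprs) {
            if (!e.ref.equals(e.expr)) {
                return false;
            }
        }
        return true;
    }

    /**
     * A project is removable only if it is an identity, the scan promises to
     * emit nothing beyond the requested columns (the flag proposed in the
     * issue, assumed here), and the scan's column list matches the project's
     * output columns in the same order.
     */
    public static boolean isRemovable(List<NamedExpr> exprs,
                                      List<String> scanColumns,
                                      boolean scanEmitsOnlyRequested) {
        if (!scanEmitsOnlyRequested || !isIdentity(exprs)) {
            return false;
        }
        List<String> projected = new ArrayList<>();
        for (NamedExpr e : exprs) {
            projected.add(e.ref);
        }
        return projected.equals(scanColumns);
    }

    public static void main(String[] args) {
        List<NamedExpr> exprs = Arrays.asList(
            new NamedExpr("`a`", "`a`"),
            new NamedExpr("`b`", "`b`"),
            new NamedExpr("`c`", "`c`"));

        // Scan was asked for exactly a, b, c and guarantees no extras.
        System.out.println(isRemovable(exprs, Arrays.asList("`a`", "`b`", "`c`"), true));
        // Scan still reads the wildcard, as in the plans in this issue.
        System.out.println(isRemovable(exprs, Arrays.asList("`**`"), true));
        // Scan might add columns such as d and e, so the project must stay.
        System.out.println(isRemovable(exprs, Arrays.asList("`a`", "`b`", "`c`"), false));
    }
}
```

Under these assumptions the check deliberately treats the wildcard scan in the plans above as not removable: with {{columns}} still set to {{`**`}}, the planner cannot know the scan returns only {{a}}, {{b}}, and {{c}}.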