[ 
https://issues.apache.org/jira/browse/DRILL-7451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978120#comment-16978120
 ] 

Paul Rogers commented on DRILL-7451:
------------------------------------

It appears that the actual behavior is a bit more complex. Run the same test as 
above, with the same query, but now mark the plugin as *not* supporting 
projection push-down. In this case we get two projects instead of one. This 
suggests that the project shown above is added for a different reason, but it 
is still trivial and should be removed.

Logical plan with scan project pushdown disabled:

{code:json}
  "graph" : [ {
    "pop" : "DummyGroupScan",
    "@id" : 3,
    "columns" : [ "`**`" ],
    "userName" : "progers",
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 2,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 3,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "project",
    "@id" : 1,
    "exprs" : [ {
      "ref" : "`a`",
      "expr" : "`a`"
    }, {
      "ref" : "`b`",
      "expr" : "`b`"
    }, {
      "ref" : "`c`",
      "expr" : "`c`"
    } ],
    "child" : 2,
    "outputProj" : true,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10000.0
    }
  } ]
{code}
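
To make "trivial" concrete for the plan above, here is a minimal sketch of the check. It is not Drill code: {{NamedExpr}}, {{TrivialProjectSketch}} and its methods are hypothetical names, and expressions are compared as plain strings rather than as Drill expression objects. The idea is that a project is an identity if every {{ref}} equals its {{expr}}, and it is redundant if, in addition, its child is known to produce exactly those columns.

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one entry of a project's "exprs" list. */
final class NamedExpr {
    final String ref;   // output column name, e.g. "`a`"
    final String expr;  // expression that produces it, e.g. "`a`"

    NamedExpr(String ref, String expr) {
        this.ref = ref;
        this.expr = expr;
    }
}

/** Hypothetical helper; string comparison is a simplifying assumption. */
final class TrivialProjectSketch {
    static final String WILDCARD = "`**`";

    /** True if every output column is just the same-named input column. */
    static boolean isIdentity(List<NamedExpr> exprs) {
        for (NamedExpr e : exprs) {
            if (!e.ref.equals(e.expr)) {
                return false;
            }
        }
        return true;
    }

    /** Output column names of a project, in order. */
    static List<String> outputColumns(List<NamedExpr> exprs) {
        List<String> names = new ArrayList<>();
        for (NamedExpr e : exprs) {
            names.add(e.ref);
        }
        return names;
    }

    /**
     * True only if the project is an identity AND the child is known to
     * produce exactly the same columns in the same order. A wildcard child
     * has an unknown schema, so names alone cannot prove redundancy.
     */
    static boolean isRedundant(List<NamedExpr> exprs, List<String> childColumns) {
        if (childColumns.contains(WILDCARD)) {
            return false;
        }
        return isIdentity(exprs) && outputColumns(exprs).equals(childColumns);
    }
}
{code}

Applied to the plan above, project {{@id 1}} sits on project {{@id 2}}, which outputs exactly {{a}}, {{b}}, {{c}}, so this check reports it as redundant. Project {{@id 2}} sits on a scan whose columns are {{`**`}}, so the name-based check alone cannot decide whether it is needed.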


> Planner inserts project node even if scan handles project push-down
> -------------------------------------------------------------------
>
>                 Key: DRILL-7451
>                 URL: https://issues.apache.org/jira/browse/DRILL-7451
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Priority: Minor
>
> I created a "dummy" storage plugin for testing. The test does a simple query:
> {code:sql}
> SELECT a, b, c FROM dummy.myTable
> {code}
> The first test is to mark the plugin's group scan as supporting projection 
> push down. However, Drill still creates a projection node in the logical plan:
> {code:json}
>   "graph" : [ {
>     "pop" : "DummyGroupScan",
>     "@id" : 2,
>     "columns" : [ "`**`" ],
>     "userName" : "progers",
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "project",
>     "@id" : 1,
>     "exprs" : [ {
>       "ref" : "`a`",
>       "expr" : "`a`"
>     }, {
>       "ref" : "`b`",
>       "expr" : "`b`"
>     }, {
>       "ref" : "`c`",
>       "expr" : "`c`"
>     } ],
>     "child" : 2,
>     "outputProj" : true,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   }, {
>     "pop" : "screen",
>     "@id" : 0,
>     "child" : 1,
>     "initialAllocation" : 1000000,
>     "maxAllocation" : 10000000000,
>     "cost" : {
>       "memoryCost" : 1.6777216E7,
>       "outputRowCount" : 10000.0
>     }
>   } ]
> {code}
> There is [a comment in the 
> code|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillPushProjectIntoScanRule.java#L109]
>  that suggests the project should be removed:
> {code:java}
>         // project above scan may be removed in ProjectRemoveRule for
>         // the case when it is trivial
> {code}
> As shown in the example, the project is trivial. There is a subtlety: it may 
> be that the scan, unknown to the planner, produces additional columns, say 
> {{d}} and {{e}}, which the project operator is needed to remove.
> If this is the reason the project remains, perhaps we can add a flag of some 
> kind with which the group scan can declare that it not only handles 
> projection, but also will not insert additional columns. With that 
> guarantee, the project is completely unnecessary in this case.
> This is not a functional bug; just a performance issue: we exercise the 
> machinery of the project operator to do exactly nothing.
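
The flag suggested in the description could look roughly like the sketch below. Everything here is hypothetical: {{ScanSketch}}, {{producesOnlyProjectedColumns()}}, {{FixedScan}} and {{ProjectElisionSketch}} are illustrative names, not part of Drill's group scan API, and columns are modeled as plain strings.

{code:java}
import java.util.List;

/** Hypothetical stand-in for a group scan; not Drill's actual interface. */
interface ScanSketch {
    /** The scan accepts a pushed-down column list. */
    boolean handlesProjection();

    /** Hypothetical flag: the scan promises to emit no extra columns. */
    boolean producesOnlyProjectedColumns();

    /** Columns the scan was asked to produce. */
    List<String> projectedColumns();
}

/** Trivial implementation with fixed answers, for illustration only. */
final class FixedScan implements ScanSketch {
    private final boolean handlesProjection;
    private final boolean exactColumns;
    private final List<String> columns;

    FixedScan(boolean handlesProjection, boolean exactColumns, List<String> columns) {
        this.handlesProjection = handlesProjection;
        this.exactColumns = exactColumns;
        this.columns = columns;
    }

    @Override public boolean handlesProjection() { return handlesProjection; }
    @Override public boolean producesOnlyProjectedColumns() { return exactColumns; }
    @Override public List<String> projectedColumns() { return columns; }
}

final class ProjectElisionSketch {
    /**
     * The project above the scan can be dropped only if the scan handles
     * projection, guarantees it adds nothing, and was asked for exactly
     * the columns the project would output.
     */
    static boolean canDropProject(ScanSketch scan, List<String> projectColumns) {
        return scan.handlesProjection()
            && scan.producesOnlyProjectedColumns()
            && scan.projectedColumns().equals(projectColumns);
    }
}
{code}

With such a guarantee, a scan asked for {{a}}, {{b}}, {{c}} would let the planner drop the project for the query above. Without it, the project stays in case the scan emits extra columns such as {{d}} and {{e}}.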



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
