Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20387
@cloud-fan, to your point about push-down order: I'm not saying that order
doesn't matter at all. I'm saying that the push-down rule can run more than once
and should push the operators closest to the relation first. That way, if you
have a situation where operators can't be reordered but can all be pushed, they
all get pushed through multiple runs of the rule, each run further refining the
relation.
If we do it this way, we don't need to traverse the logical plan to find out
what to push down. We simply continue pushing projections until the plan stops
changing. This is how the rest of the optimizer works, so I think it is a better
approach from a design standpoint.
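To make the idea concrete, here is a minimal, hypothetical sketch (not Spark's actual `Rule`/`RuleExecutor` API) of a push-down rule driven to fixed point: each pass pushes only an operator sitting directly on the relation, and the driver re-runs the rule until the plan stops changing. The `Plan`, `Relation`, `Filter`, and `Project` types are stand-ins invented for this illustration.

```scala
// Toy plan tree standing in for Spark's logical plan (assumed names,
// not the real catalyst classes).
sealed trait Plan
case class Relation(pushed: List[String]) extends Plan
case class Filter(cond: String, child: Plan) extends Plan
case class Project(cols: String, child: Plan) extends Plan

object PushDown {
  // One pass of the rule: push only an operator that sits directly
  // on the relation, leaving everything above it untouched.
  def once(plan: Plan): Plan = plan match {
    case Filter(c, Relation(p))  => Relation(p :+ s"filter:$c")
    case Project(c, Relation(p)) => Relation(p :+ s"project:$c")
    case Filter(c, child)        => Filter(c, once(child))
    case Project(c, child)       => Project(c, once(child))
    case r: Relation             => r
  }

  // Re-run the rule until the plan stops changing (fixed point),
  // the way the optimizer batches already work.
  def toFixedPoint(plan: Plan): Plan = {
    val next = once(plan)
    if (next == plan) plan else toFixedPoint(next)
  }
}
```

For example, `PushDown.toFixedPoint(Project("a", Filter("x > 1", Relation(Nil))))` needs two passes: the first pushes the filter (the operator closest to the relation), the second pushes the projection that now sits directly on it, yielding `Relation(List("filter:x > 1", "project:a"))` without ever reordering the operators themselves.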
My implementation also reuses more of the existing code, which we have higher
confidence in, and that is a good thing. We can add features like limit
push-down later by integrating them properly into that existing code. I don't
see a compelling reason to toss out the existing implementation, especially
without the same level of testing.