Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/20476
Yeah, I did review it, but at the time I wasn't familiar with how the other
code paths worked and assumed it was necessary to introduce this. Since I
wasn't sure how it *should* work, I didn't +1 it.
There are a few telling comments though:
> How do we know that there aren't more cases that need to be supported?
> What are the guarantees made by the previous batches in the optimizer?
The work done by FilterAndProject seems redundant to me because the optimizer
should already push filters below projections. Is that not guaranteed by the
time this runs?
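To illustrate what I mean, here's a minimal, standalone sketch (not code from
this PR; the object name, column names, and local-session setup are only for
illustration): write a filter above a projection and print the optimized plan,
and the Filter should already appear below the Project, pushed there by the
operator-optimization batch (PushDownPredicate, if I recall the rule name
correctly).

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: check that Catalyst pushes a Filter written above a Project
// below it before physical planning runs.
object FilterPushdownSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-pushdown-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = spark.range(100).toDF("id")
      .select($"id", ($"id" * 2).as("doubled")) // projection with a derived column
      .filter($"id" > 10)                       // filter written above the projection

    // In the optimized plan, the Filter should sit below the Project,
    // courtesy of the predicate push-down rule in the optimizer.
    println(df.queryExecution.optimizedPlan.treeString)

    spark.stop()
  }
}
```

If that holds, then re-splitting filters and projections at the data source
level is duplicated work rather than something the new push-down path needs to
own.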
In any case, I now think that we should not introduce a new push-down
design in conjunction with DSv2. Let's get DSv2 working properly and redesign
push-down separately. In parallel is fine by me.