Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/20387

> For safety, I wanna keep it unchanged, and start something new for data source v2 only.

I disagree.

* **#20476 addresses a bug caused by the new implementation that is not a problem if we reuse the current push-down code.** Using an entirely new implementation to push filters and projection (the mixins involved are sketched below) is going to introduce bugs, and that bug demonstrates that the risk is real.
* **Using unreliable push-down code is going to make it more difficult for anyone to use the v2 API.**
* **This approach throws away work, accumulated over the past few years, that gives us confidence in the current push-down code.** The other code paths have push-down tests that will help us catch bugs in the new push-down logic. If we limit the scope of this change to v2, we will not be able to reuse those tests and will have to write entirely new ones that cover all cases.

Lastly, I think it is clear that we need a design for a new push-down mechanism. **Adding this to DataSourceV2 as feature creep is not a good way to redesign it.** I'd like to see a design document that addresses some of the open questions. I'd also prefer that this new implementation be removed from the v2 code path for 2.3.0.

@marmbrus, what do you think?
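For context, here is a minimal sketch of the 2.3-era DataSourceV2 push-down mixins this thread is debating. The interface names and signatures follow `org.apache.spark.sql.sources.v2.reader` as it stood on the 2.3 branch; the `ExampleReader` class and its `isSupported` policy are hypothetical, illustrative only, and not code from this PR:

```scala
import java.util.Collections

import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.{EqualTo, Filter, IsNotNull}
import org.apache.spark.sql.sources.v2.reader.{
  DataReaderFactory, DataSourceReader,
  SupportsPushDownFilters, SupportsPushDownRequiredColumns}
import org.apache.spark.sql.types.StructType

// Hypothetical reader implementing the filter and projection push-down mixins.
class ExampleReader(fullSchema: StructType)
    extends DataSourceReader
    with SupportsPushDownFilters
    with SupportsPushDownRequiredColumns {

  private var requiredSchema: StructType = fullSchema
  private var pushed: Array[Filter] = Array.empty

  // Spark offers candidate filters; the source keeps the ones it can
  // evaluate and returns the rest so Spark re-applies them after the scan.
  override def pushFilters(filters: Array[Filter]): Array[Filter] = {
    val (supported, unsupported) = filters.partition(isSupported)
    pushed = supported
    unsupported
  }

  // The filters this source accepted in pushFilters.
  override def pushedFilters(): Array[Filter] = pushed

  // Projection push-down: Spark narrows the schema to the needed columns.
  override def pruneColumns(required: StructType): Unit = {
    requiredSchema = required
  }

  override def readSchema(): StructType = requiredSchema

  // Scan stub; a real source would return factories that honor
  // requiredSchema and pushed.
  override def createDataReaderFactories(): java.util.List[DataReaderFactory[Row]] =
    Collections.emptyList()

  // Illustrative policy: pretend only equality and not-null are supported.
  private def isSupported(f: Filter): Boolean = f match {
    case _: EqualTo | _: IsNotNull => true
    case _ => false
  }
}
```

The point of contention above is not these interfaces themselves but the planner-side logic that decides which filters and columns to offer them, which this PR reimplements separately from the existing v1 push-down path.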