Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/20387

> For safety, I wanna keep it unchanged, and start something new for data source v2 only.

I disagree.

* **#20476 addresses a bug caused by the new implementation that is not a problem if we reuse the current push-down code.** Using an entirely new implementation to push filters and projection (the mixins involved are sketched below) is going to introduce bugs, and that bug demonstrates that the risk is real.
* **Using unreliable push-down code is going to make it more difficult for anyone to use the v2 API.**
* **This approach throws away work, accumulated over the past few years, that gives us confidence in the current push-down code.** The other code paths have push-down tests that will help us catch bugs in the new push-down logic. If we limit the scope of this change to v2, we will not be able to reuse those tests and will have to write entirely new ones that cover all cases.

Lastly, I think it is clear that we need a design for a new push-down mechanism. **Adding this to DataSourceV2 as feature creep is not a good way to redesign it.** I'd like to see a design document that addresses some of the open questions. I'd also prefer that this new implementation be removed from the v2 code path for 2.3.0.

@marmbrus, what do you think?
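For context, here is a minimal sketch of the 2.3-era DataSourceV2 push-down mixins this thread is debating. The interface names and signatures follow `org.apache.spark.sql.sources.v2.reader` as it stood on the 2.3 branch; the `ExampleReader` class and its `isSupported` policy are hypothetical, illustrative only, and not code from this PR:

```scala
import java.util.Collections

import org.apache.spark.sql.Row
import org.apache.spark.sql.sources.{EqualTo, Filter, IsNotNull}
import org.apache.spark.sql.sources.v2.reader.{
  DataReaderFactory, DataSourceReader,
  SupportsPushDownFilters, SupportsPushDownRequiredColumns}
import org.apache.spark.sql.types.StructType

// Hypothetical reader implementing the filter and projection push-down mixins.
class ExampleReader(fullSchema: StructType)
    extends DataSourceReader
    with SupportsPushDownFilters
    with SupportsPushDownRequiredColumns {

  private var requiredSchema: StructType = fullSchema
  private var pushed: Array[Filter] = Array.empty

  // Spark offers candidate filters; the source keeps the ones it can
  // evaluate and returns the rest so Spark re-applies them after the scan.
  override def pushFilters(filters: Array[Filter]): Array[Filter] = {
    val (supported, unsupported) = filters.partition(isSupported)
    pushed = supported
    unsupported
  }

  // The filters this source accepted in pushFilters.
  override def pushedFilters(): Array[Filter] = pushed

  // Projection push-down: Spark narrows the schema to the needed columns.
  override def pruneColumns(required: StructType): Unit = {
    requiredSchema = required
  }

  override def readSchema(): StructType = requiredSchema

  // Scan stub; a real source would return factories that honor
  // requiredSchema and pushed.
  override def createDataReaderFactories(): java.util.List[DataReaderFactory[Row]] =
    Collections.emptyList()

  // Illustrative policy: pretend only equality and not-null are supported.
  private def isSupported(f: Filter): Boolean = f match {
    case _: EqualTo | _: IsNotNull => true
    case _ => false
  }
}
```

The point of contention above is not these interfaces themselves but the planner-side logic that decides which filters and columns to offer them, which this PR reimplements separately from the existing v1 push-down path.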