JoshRosen commented on issue #25902: [SPARK-29213][SQL] Make it consistent when get notnull output and generate null checks in FilterExec
URL: https://github.com/apache/spark/pull/25902#issuecomment-534953656

> Sorry, @JoshRosen , do you mind I use [your code](https://github.com/JoshRosen/spark/commit/d1658bbffb07d2b0c5f3aabc9362b397a2c0aeb8) to move test from sql/hive to sql/core ?

Yes, please feel free to use that!

This bug is somewhat hard to reproduce in unit tests because filter pushdown to data source scans will prevent the `FilterExec` code from processing the `null`-containing input rows. In d1658bbffb07d2b0c5f3aabc9362b397a2c0aeb8 I worked around that by using typed Dataset operations to create black-box operators through which the optimizer cannot push filters.

(I'm following this PR because I'm interested in other aspects of `IsNotNull` optimizations; I have some notes on this particular code in #24765, which is something of a backburner project as I learn more about this code.)
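
For anyone unfamiliar with the black-box trick, here is a rough sketch of the idea. It is not the code from the commit. The object name, the column name `x`, the sample values, and the `x >= 0` predicate are all made up for illustration. It assumes a local `SparkSession` and that a typed `map` with an opaque Scala lambda is enough to keep the optimizer from moving the filter below it.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical, self-contained sketch; none of these names come from the PR.
object BlackBoxFilterSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("black-box-filter-sketch")
      .getOrCreate()
    import spark.implicits._

    // A small input that contains a null value.
    val input = Seq[java.lang.Long](java.lang.Long.valueOf(0L), null).toDS()

    // The typed map runs an arbitrary Scala closure that the optimizer
    // cannot inspect, so the filter applied afterwards is expected to stay
    // above it in the plan instead of being pushed toward the source.
    val blackBox = input.map(x => x).toDF("x")

    val filtered = blackBox.filter("x >= 0")

    // Print the plans to check where the Filter node ends up.
    filtered.explain(true)
    filtered.collect()

    spark.stop()
  }
}
```

Inspecting the `explain(true)` output is the quickest way to confirm that the `Filter` still sits above the map operator in the physical plan, which is what makes `FilterExec` see the `null`-containing rows.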
