HyukjinKwon edited a comment on issue #24675: [SPARK-27803][SQL][PYTHON] Fix column pruning for Python UDF
URL: https://github.com/apache/spark/pull/24675#issuecomment-495591310

BTW, just to stay in sync with you too, @BryanCutler, @viirya and @icexelloss: I am planning to add a bunch of tests specific to regular Python UDFs and Pandas scalar UDFs, which could possibly be reused for Scala UDFs too. I am trying to find a way to reuse as many of them as possible. I hope this makes sense to you guys.

The special rules `ExtractPythonUDF[s|FromAggregate]` involve unevaluable expressions that always have to be wrapped in special plans. It seems we have removed some hacks now, but I think we are not sure about the coverage. I think we started to observe these issues when we turned the Python ones from physical plans into logical plans, which was (I think) the right fix, but the existing tests couldn't catch many cases like this one.

My idea is basically to share (or partially duplicate) the *.sql files across Python / Pandas / Scala UDFs. I hope this prevents such issues in the future.
