[GitHub] [spark] HyukjinKwon edited a comment on issue #24675: [SPARK-27803][SQL][PYTHON] Fix column pruning for Python UDF

GitBox Fri, 24 May 2019 04:53:37 -0700

HyukjinKwon edited a comment on issue #24675: [SPARK-27803][SQL][PYTHON] Fix 
column pruning for Python UDF
URL: https://github.com/apache/spark/pull/24675#issuecomment-495591310
 
 
   BTW, just to sync up with you too, @BryanCutler, @viirya and @icexelloss: 
I am planning to add a bunch of tests specific to regular Python UDFs and Pandas 
Scalar UDFs, which can possibly be reused for Scala UDFs too. I am trying to 
find a way to reuse as much of them as possible. I hope this makes sense to you 
guys.
   
   The special rules `ExtractPythonUDF[s|FromAggregate]` deal with unevaluable 
expressions that always have to be wrapped in special plans. It seems we are 
removing some hacks now, but I think we are not sure about the coverage.
   
   I think we started to observe these issues after we moved those Python rules 
from physical plans to logical plans, which was (I think) the right fix, but we 
could not catch many cases like this one. My idea is basically to share (or 
partially duplicate) the *.sql files across Python / Pandas / Scala UDFs. I hope 
this prevents such issues in the future.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

