[GitHub] [spark] JoshRosen commented on issue #24515: [SPARK-14083][WIP] Basic bytecode analyzer to speed up Datasets

GitBox Sun, 12 May 2019 14:10:23 -0700

JoshRosen commented on issue #24515: [SPARK-14083][WIP] Basic bytecode analyzer 
to speed up Datasets
URL: https://github.com/apache/spark/pull/24515#issuecomment-491629506
 
 
   In addition to the ideas discussed here, I think we should also benchmark 
the raw constant-factor overheads of UDF / UDAF / typed operations to see 
whether there are any straightforward optimizations that will speed up existing 
workloads without the added complexity of bytecode analysis / closure 
conversion.
   
   For example, it looks like there's room for improvement in how we invoke 
UDFs with primitive input type arguments 
(https://issues.apache.org/jira/browse/SPARK-27684). Through careful 
benchmarking we might be able to uncover other low-hanging wins.
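
   As a rough illustration of the kind of constant-factor overhead in 
question, here is a minimal standalone sketch. It is not Spark's UDF 
invocation code: the class `UdfCallOverhead` and its methods are hypothetical 
names made up for this example. It compares calling a function through a 
generic `Object`-typed interface, which boxes each primitive argument and 
result, against calling it through a primitive-specialized interface. Timings 
from a hand-rolled loop like this are only indicative, since the JIT may 
remove some of the boxing; a proper harness such as JMH would be needed for 
trustworthy numbers.

```java
import java.util.function.Function;
import java.util.function.IntUnaryOperator;

public class UdfCallOverhead {
    // Generic path: the argument and the result travel as Object,
    // so every call boxes the int input and unboxes the Integer result.
    static long sumBoxed(Function<Object, Object> f, int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (Integer) f.apply(i);
        }
        return total;
    }

    // Primitive path: the argument and the result stay as int throughout.
    static long sumPrimitive(IntUnaryOperator f, int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += f.applyAsInt(i);
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // arbitrary iteration count chosen for this sketch
        Function<Object, Object> boxed = x -> ((Integer) x) + 1;
        IntUnaryOperator primitive = x -> x + 1;

        // Warm up both paths so the JIT has compiled them before timing.
        for (int round = 0; round < 5; round++) {
            sumBoxed(boxed, n);
            sumPrimitive(primitive, n);
        }

        long t0 = System.nanoTime();
        long boxedSum = sumBoxed(boxed, n);
        long t1 = System.nanoTime();
        long primitiveSum = sumPrimitive(primitive, n);
        long t2 = System.nanoTime();

        // Both paths must compute the same value; only the call cost differs.
        System.out.printf("boxed:     sum=%d, %.1f ms%n", boxedSum, (t1 - t0) / 1e6);
        System.out.printf("primitive: sum=%d, %.1f ms%n", primitiveSum, (t2 - t1) / 1e6);
    }
}
```

   Both methods return the same sum, so any difference in the measured time 
comes from how the function is invoked rather than from the work it does.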

