zml1206 commented on PR #46143: URL: https://github.com/apache/spark/pull/46143#issuecomment-3645788499
> Just wondering if we have a consensus on the best way to go about this @zml1206 / @cloud-fan? I'm thinking, based on the > 1 year since this change, it might be more complicated than we originally thought. I can re-explore as well if @zml1206 is busy, but we could also go for the simpler solution in the meantime, since double UDF evaluation is bad.

Sorry for the late reply. The first issue we encountered was nested expansion of filter expressions, which caused the plan to become too large and resulted in driver out-of-memory errors. Performance was also an issue. This PR seems to address only part of the performance problem.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
