ahshahid commented on PR #45446: URL: https://github.com/apache/spark/pull/45446#issuecomment-2002310326
@peter-toth @cloud-fan, IMHO the current approach, where Spark resolves an attribute against a dataframe below the top-level dataframe(s) and in the process adds the missing attribute to the projections in between, can hurt performance without the user being aware of the cause. The scenario I have in mind: the user has cached the lower dataframes. Implicitly adding the missing attribute to those projections may make the cached plans unusable, again without the user being aware of it.
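
To make the caching concern concrete, here is a rough toy model in plain Python. It is not Spark code: `Scan`, `Project`, `PlanCache`, and `add_missing_attribute` are hypothetical names invented for illustration, and the cache here matches plans by simple equality, which is only an assumption standing in for however Spark actually compares a query plan against cached plans. The sketch only shows why widening a projection after the user has cached the original plan could cause a cache miss.

```python
from dataclasses import dataclass

# Toy model only: these classes are hypothetical and are not Spark's API.


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Project:
    columns: tuple
    child: object


class PlanCache:
    """Maps a cached plan to a marker for its materialized data."""

    def __init__(self):
        self._entries = {}

    def cache(self, plan):
        self._entries[plan] = f"cached data for {plan}"

    def lookup(self, plan):
        # Assumption: a hit requires the plan to equal a cached plan exactly.
        return self._entries.get(plan)


def add_missing_attribute(plan, column):
    """Return a copy of `plan` whose projection also emits `column`."""
    if column in plan.columns:
        return plan
    return Project(plan.columns + (column,), plan.child)


# The user caches a "lower" dataframe that projects only column a.
lower = Project(("a",), Scan("t"))
cache = PlanCache()
cache.cache(lower)

# The plan as the user wrote it matches the cached entry.
hit_as_written = cache.lookup(lower) is not None

# Resolving a reference to column b against the lower dataframe
# widens its projection, so the plan no longer equals the cached one.
widened = add_missing_attribute(lower, "b")
hit_after_widening = cache.lookup(widened) is not None

print(hit_as_written, hit_after_widening)  # prints: True False
```

In this model the user never changed the cached dataframe, yet the lookup fails once the projection is widened on their behalf, which is the silent performance regression described above.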
