holdenk commented on pull request #29121: URL: https://github.com/apache/spark/pull/29121#issuecomment-664604888
So admittedly this is from StackOverflow, but it sounds like unused imports in Python tend to have a negligible on-load impact: https://stackoverflow.com/questions/8724045/does-unused-import-and-objects-have-an-performance-impact

That being said, if we're in a situation where we've got a Python UDF and we're just starting the Python process briefly on each executor, I could see the aggregate of the "one time negligible load" being noticeable.

Personally, regardless of the performance, I'm in favour of cleaning up unused imports and cleaning up our code, even if it makes backporting a bit more painful. I think, given that we only backport bug fixes, we should focus on having a code base that is easy to develop in. And being clearer about what the code uses certainly makes it easier for me to form a mental model of a file.
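
For anyone who wants to check the on-load cost on their own machine, here is a rough sketch, not something taken from the PR. It times a fresh interpreter with and without a single import, which loosely mirrors a short-lived Python worker process. The helper names and the use of `json` as a stand-in for an unused import are assumptions made purely for illustration, and the numbers will be noisy.

```python
import subprocess
import sys
import time


def fresh_interpreter_seconds(statement: str, repeats: int = 3) -> float:
    """Best wall-clock time to start a new Python process and run `statement`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # A new process each time, so nothing is cached in sys.modules.
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def import_overhead_seconds(module: str, repeats: int = 3) -> float:
    """Rough estimate of the extra startup time from importing `module`.

    The result can come out slightly negative when the import is cheaper
    than the measurement noise.
    """
    baseline = fresh_interpreter_seconds("pass", repeats)
    with_import = fresh_interpreter_seconds(f"import {module}", repeats)
    return with_import - baseline


if __name__ == "__main__":
    # `json` is only a placeholder; substitute whichever module is of interest.
    overhead_ms = import_overhead_seconds("json") * 1000
    print(f"json import overhead: {overhead_ms:.2f} ms per process start")
```

Multiplying the per-process figure by the number of worker starts gives a crude upper bound on the aggregate cost. For a per-module breakdown, CPython 3.7+ also offers `python -X importtime`.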
