potiuk commented on PR #41217: URL: https://github.com/apache/airflow/pull/41217#issuecomment-2268878951
Yeah. No strong concerns here, I just wondered how much extra overhead retrieving that information will add. Generally speaking, the problem with such debugging information is that it might change the behaviour of the system (for example by adding or avoiding race conditions), introducing so-called "heisenbugs": the more you look at the problem, the less likely it is to occur. So, generally speaking, the lower the overhead of this kind of debugging facility, the better. In this sense, connecting it to the Airflow "debug" log level is not necessarily a good idea, because Airflow at debug level generates a loooooot of debugging information, and this **might** impact how Airflow behaves.

So my concerns here are:

a) Limit the overhead when it is enabled. This can also be done by caching the information, so maybe caching the facet will be enough. JUST retrieving all installed package information is pretty heavy, and ideally we should do it once per interpreter run.

b) I **think** there should be a way to enable this logging independently from Airflow logs, because it might well be that just enabling debug logs will have other side effects.
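To make points a) and b) concrete, here is a minimal sketch, not the PR's actual implementation. It assumes the package list can be gathered with the standard-library `importlib.metadata` and cached per interpreter with `functools.lru_cache`. The environment variable `EXAMPLE_PACKAGE_FACET` and the function names are hypothetical, chosen only for illustration; they are not existing Airflow or OpenLineage settings.

```python
import functools
import os
from importlib import metadata
from typing import Dict, Optional

# Hypothetical switch, deliberately separate from the Airflow log level.
FACET_ENV_VAR = "EXAMPLE_PACKAGE_FACET"


@functools.lru_cache(maxsize=None)
def installed_packages() -> Dict[str, str]:
    """Scan installed distributions; the scan runs once per interpreter."""
    packages: Dict[str, str] = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            # Skip distributions with unreadable metadata.
            continue
        if name:
            packages[name] = version
    return packages


def facet_enabled() -> bool:
    """Check the dedicated (hypothetical) switch, not the logging level."""
    return os.environ.get(FACET_ENV_VAR, "").lower() in {"1", "true", "yes"}


def build_debug_facet() -> Optional[Dict[str, str]]:
    """Return a copy of the cached package map, or None when disabled."""
    if not facet_enabled():
        return None
    # Copy so callers cannot mutate the cached dictionary.
    return dict(installed_packages())
```

With this shape, the expensive scan happens on the first call only, and later calls cost one dictionary copy. The facet can also be switched on without raising the Airflow log level to debug.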
