o-nikolas commented on a change in pull request #16264:
URL: https://github.com/apache/airflow/pull/16264#discussion_r653908548
##########
File path: airflow/providers/microsoft/azure/log/wasb_task_handler.py
##########
@@ -100,8 +100,8 @@ def close(self) -> None:
             with open(local_loc) as logfile:
                 log = logfile.read()
             self.wasb_write(log, remote_loc, append=True)
-
-            if self.delete_local_copy:
+            keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS')
+            if self.delete_local_copy or not keep_local:
Review comment:
I wonder if `delete_local_copy` is still needed now that you have
introduced this global behaviour?
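For illustration, if the per-handler flag were dropped, the handler could derive its behaviour from the global option alone. A minimal sketch, reusing the `KEEP_LOCAL_LOGS` option this PR adds (the `fallback` default here is my assumption, not something the PR specifies):
```python
import os
import shutil

from airflow.configuration import conf

# Inside the handler's close(), where `local_loc` is the local log path.
# KEEP_LOCAL_LOGS is the option introduced by this PR; fallback=True
# (keep local logs unless told otherwise) is an assumed default.
keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS', fallback=True)
if not keep_local:
    shutil.rmtree(os.path.dirname(local_loc))
```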
##########
File path: airflow/providers/google/cloud/log/gcs_task_handler.py
##########
@@ -132,7 +134,10 @@ def close(self):
             # read log and remove old logs to get just the latest additions
             with open(local_loc) as logfile:
                 log = logfile.read()
-            self.gcs_write(log, remote_loc)
+            success = self.gcs_write(log, remote_loc)
+            keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS')
+            if success and not keep_local:
+                shutil.rmtree(os.path.dirname(local_loc))
Review comment:
You're implementing the same cleanup recipe several times, driven by a
global config option, and both of those are indicators that this logic is a
good candidate to live in a super class. Doing it ad hoc like this leaves the
door open for future developers of remote logging classes to forget or
mis-implement the cleanup. The individual remote logging classes should only
be responsible for the upload to their respective service; they shouldn't
have to re-implement the cleanup.
It is possible to teach `FileTaskHandler` to do this, but it would be tricky
to make it work for both local-only and remote handlers, and it is a bit
smelly. Maybe it's time for a new super class, `RemoteTaskHandler`?
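To make that concrete, here is a rough sketch of how such a class could
look. The class name follows the suggestion above, but the `write_to_remote`
hook, the `upload_and_cleanup` method, and the `fallback` default are all
hypothetical, not existing Airflow API:
```python
import os
import shutil

from airflow.configuration import conf
from airflow.utils.log.file_task_handler import FileTaskHandler


class RemoteTaskHandler(FileTaskHandler):
    """Sketch of a base class that owns the local-copy cleanup.

    Subclasses implement only the upload; the hook name
    `write_to_remote` is hypothetical.
    """

    def write_to_remote(self, log: str, remote_loc: str) -> bool:
        """Upload `log` to `remote_loc`; return True on success."""
        raise NotImplementedError

    def upload_and_cleanup(self, local_loc: str, remote_loc: str) -> None:
        with open(local_loc) as logfile:
            log = logfile.read()
        success = self.write_to_remote(log, remote_loc)

        # The cleanup lives here exactly once, so individual remote
        # logging classes can't forget it or implement it differently.
        keep_local = conf.getboolean('logging', 'KEEP_LOCAL_LOGS', fallback=True)
        if success and not keep_local:
            shutil.rmtree(os.path.dirname(local_loc))
```
A handler like `GCSTaskHandler` would then just return the result of its
`gcs_write` call from `write_to_remote` and drop its own `rmtree`.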
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]