jedcunningham commented on code in PR #31955:
URL: https://github.com/apache/airflow/pull/31955#discussion_r1237039054


##########
scripts/docker/clean-logs.sh:
##########
@@ -21,19 +21,30 @@ set -euo pipefail
 
 readonly DIRECTORY="${AIRFLOW_HOME:-/usr/local/airflow}"
 readonly RETENTION="${AIRFLOW__LOG_RETENTION_DAYS:-15}"
+readonly DELETE_LOGS_FOLDERS="${AIRFLOW__LOG_DELETE_FOLDERS:-true}"
 
 trap "exit" INT TERM
 
 readonly EVERY=$((15*60))
 
+echo "Running the improved cleaning logs..."

Review Comment:
   ```suggestion
   ```
   
   Leftover from development? I don't think we need this :)



##########
scripts/docker/clean-logs.sh:
##########
@@ -21,19 +21,30 @@ set -euo pipefail
 
 readonly DIRECTORY="${AIRFLOW_HOME:-/usr/local/airflow}"
 readonly RETENTION="${AIRFLOW__LOG_RETENTION_DAYS:-15}"
+readonly DELETE_LOGS_FOLDERS="${AIRFLOW__LOG_DELETE_FOLDERS:-true}"

Review Comment:
   I almost feel like this should just happen regardless and not be able to be 
turned off. @potiuk, thoughts?



##########
scripts/docker/clean-logs.sh:
##########
@@ -21,19 +21,30 @@ set -euo pipefail
 
 readonly DIRECTORY="${AIRFLOW_HOME:-/usr/local/airflow}"
 readonly RETENTION="${AIRFLOW__LOG_RETENTION_DAYS:-15}"
+readonly DELETE_LOGS_FOLDERS="${AIRFLOW__LOG_DELETE_FOLDERS:-true}"
 
 trap "exit" INT TERM
 
 readonly EVERY=$((15*60))
 
+echo "Running the improved cleaning logs..."
 echo "Cleaning logs every $EVERY seconds"
 
 while true; do
-  echo "Trimming airflow logs to ${RETENTION} days."
-  find "${DIRECTORY}"/logs \
-    -type d -name 'lost+found' -prune -o \
-    -type f -mtime +"${RETENTION}" -name '*.log' -print0 | \
-    xargs -0 rm -f
+  if [[ "${DELETE_LOGS_FOLDERS}" == "true" ]]
+  then
+    echo "Trimming airflow logs folders to ${RETENTION} days."
+    find "${DIRECTORY}"/logs \
+      -type d -name 'lost+found' -prune -o \
+      -type d -mtime +"${RETENTION}" -print0 | \

Review Comment:
   This won't work as written. mtime updates don't propogate up, so you'd end 
up deleting all of a dags logs after the root folder has been around for 
$retention_days.
   
   Instead, we could have a second command that goes through and deletes old 
empty folders? I think the "empty" aspect is critical.



##########
chart/values.yaml:
##########
@@ -650,6 +650,7 @@ workers:
     args: ["bash", "/clean-logs"]
     # Number of days to retain logs
     retentionDays: 15
+    deleteLogFolders: true

Review Comment:
   If we do keep it confugurable, I think `deleteFolders` is better. Like with 
`retentionDays`, we know we are dealing with logs already.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to