asf-tooling commented on issue #1136:
URL:
https://github.com/apache/tooling-trusted-releases/issues/1136#issuecomment-4407485565
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@751c2146`
**Type:** `refactor` • **Classification:** `actionable` •
**Confidence:** `high`
**Application domain(s):** `infrastructure`, `admin_operations`
### Summary
The issue requests moving expensive permission-scanning logic from
synchronous server startup to a background maintenance task. @sbp confirmed
`_app_dirs_setup` takes ~11 seconds due to recursive permission enforcement on
archives/unfinished directories. @alitheg made a start by (a) adding the daily
maintenance task infrastructure (the `_storage_maintenance` stub) and (b)
deferring a heavy cyclonedx import chain, cutting startup from 15s to 5s. The
remaining work is to move the actual permission enforcement logic into the
`_storage_maintenance` function, which currently contains only `pass`.
### Where this lives in the code today
#### `atr/server.py` — `_app_dirs_setup` (lines 168-207)
_needs modification_
This is the main bottleneck (~11s) because it recursively walks and chmod's
all files in archives and unfinished directories during synchronous app
creation.
```python
def _app_dirs_setup(state_dir_str: str, hot_reload: bool) -> None:
"""Setup application directories."""
if not os.path.isdir(state_dir_str):
raise RuntimeError(f"State directory not found: {state_dir_str}")
os.chdir(state_dir_str)
if hot_reload is False:
print(f"Working directory changed to: {os.getcwd()}")
# Note that the hypercorn directories are not managed by ATR
directories_to_ensure = [
pathlib.Path(state_dir_str) / "audit",
pathlib.Path(state_dir_str) / "cache",
pathlib.Path(state_dir_str) / "database",
pathlib.Path(state_dir_str) / "hypercorn" / "logs",
pathlib.Path(state_dir_str) / "hypercorn" / "secrets",
pathlib.Path(state_dir_str) / "logs",
pathlib.Path(state_dir_str) / "runtime",
pathlib.Path(state_dir_str) / "secrets" / "curated",
pathlib.Path(state_dir_str) / "secrets" / "generated",
pathlib.Path(paths.get_archives_dir()),
pathlib.Path(paths.get_downloads_dir()),
pathlib.Path(paths.get_finished_dir()),
pathlib.Path(paths.get_quarantined_dir()),
pathlib.Path(paths.get_tmp_dir()),
pathlib.Path(paths.get_unfinished_dir()),
]
archives_dir = pathlib.Path(paths.get_archives_dir())
unfinished_dir = pathlib.Path(paths.get_unfinished_dir())
enforce_permissions = not config.is_dev_environment()
for directory in directories_to_ensure:
directory.mkdir(parents=True, exist_ok=True)
if not enforce_permissions:
continue
# Some directories need custom permissions
if directory == archives_dir:
_enforce_archives_permissions(archives_dir)
elif directory == unfinished_dir:
_enforce_unfinished_permissions(unfinished_dir)
else:
util.chmod_directories(directory, permissions=0o755)
```
#### `atr/server.py` — `_enforce_archives_permissions` (lines 711-742)
_needs modification_
This function does expensive recursive directory traversal. It should be
moved to run as a background maintenance task rather than blocking startup.
```python
def _enforce_archives_permissions(archives_dir: pathlib.Path) -> None:
if not archives_dir.exists():
return
fixed_files = 0
fixed_dirs = 0
# Set ancestor directories of archive files to 755
for dirpath, _, _ in os.walk(archives_dir, topdown=True):
path = pathlib.Path(dirpath)
depth = len(path.relative_to(archives_dir).parts)
if depth < 3:
os.chmod(path, 0o755)
# Set archive files to 444
for file_path in archives_dir.rglob("*"):
if not file_path.is_file():
continue
depth = len(file_path.relative_to(archives_dir).parts)
if (depth >= 3) and (stat.S_IMODE(file_path.stat().st_mode) !=
0o444):
os.chmod(file_path, 0o444)
fixed_files += 1
# Set archive directories to 555
for dirpath, _, _ in os.walk(archives_dir, topdown=False):
path = pathlib.Path(dirpath)
depth = len(path.relative_to(archives_dir).parts)
if (depth >= 3) and (stat.S_IMODE(path.stat().st_mode) != 0o555):
os.chmod(path, 0o555)
fixed_dirs += 1
if (fixed_files > 0) or (fixed_dirs > 0):
log.info(f"Fixed archive permissions: {fixed_files} files to 0o444,
{fixed_dirs} directories to 0o555")
```
#### `atr/server.py` — `_enforce_unfinished_permissions` (lines 745-758)
_needs modification_
Another expensive recursive directory traversal that should be moved to the
maintenance task.
```python
def _enforce_unfinished_permissions(unfinished_dir: pathlib.Path) -> None:
# Set ancestor directories of revisions to 755
for dirpath, _dirnames, _filenames in os.walk(unfinished_dir,
topdown=True):
path = pathlib.Path(dirpath)
depth = len(path.relative_to(unfinished_dir).parts)
if depth < 3:
os.chmod(path, 0o755)
# Set revision directories and their descendants to 555
for dirpath, _dirnames, _filenames in os.walk(unfinished_dir,
topdown=False):
path = pathlib.Path(dirpath)
depth = len(path.relative_to(unfinished_dir).parts)
if depth >= 3:
os.chmod(path, 0o555)
```
#### `atr/tasks/maintenance.py` — `_storage_maintenance` (lines 89-90)
_extension point_
@alitheg added this stub as the designated location for the deferred
permission enforcement logic.
```python
async def _storage_maintenance():
pass
```
#### `atr/tasks/maintenance.py` — `run` (lines 40-65)
_currently does this_
The daily maintenance orchestrator already calls `_storage_maintenance()` —
it just needs the permission logic filled in.
```python
@checks.with_model(args.MaintenanceArgs)
async def run(task_args: args.MaintenanceArgs) -> results.Results | None:
"""Run maintenance."""
log.info(f"Starting maintenance (user: {task_args.asf_uid})")
try:
# We schedule first to keep the start time approximately consistent
# (and knowing this task will finish before it's re-scheduled)
await tasks.schedule_next(task_args.asf_uid,
task_args.next_schedule_seconds, tasks.run_maintenance)
await _expired_pats_maintenance()
await _session_data_maintenance()
await _storage_maintenance()
await _workflow_ssh_keys_maintenance()
log.info(
"Storage maintenance completed successfully",
)
return results.Maintenance(
kind="maintenance",
)
except Exception as e:
error_msg = f"Unexpected error during maintenance: {e!s}"
log.exception("Maintenance failed with unexpected error")
raise MaintenanceError(error_msg) from e
```
### Where new code would go
- `atr/tasks/maintenance.py` — after symbol _storage_maintenance
The permission enforcement logic needs to be implemented here, replacing
the `pass` statement.
### Proposed approach
The fix has two parts: (1) Remove the expensive permission enforcement from
`_app_dirs_setup` in production so it no longer blocks startup. Keep only the
cheap `mkdir` calls. (2) Implement the permission enforcement in
`_storage_maintenance()` which already runs daily via the maintenance task
system. The permission functions (`_enforce_archives_permissions` and
`_enforce_unfinished_permissions`) can be moved from `atr/server.py` to
`atr/tasks/maintenance.py` (or kept in server.py and imported), and called from
within `_storage_maintenance` using `asyncio.to_thread` since they are blocking
I/O operations. The `util.chmod_directories` call for standard directories
(0o755) is fast and can remain in startup, or also be deferred — the key cost
is the recursive walks of archives and unfinished dirs.
### Suggested patches
#### `atr/server.py`
Remove expensive permission enforcement from startup; keep only directory
creation.
````diff
--- a/atr/server.py
+++ b/atr/server.py
@@ -134,16 +134,10 @@
archives_dir = pathlib.Path(paths.get_archives_dir())
unfinished_dir = pathlib.Path(paths.get_unfinished_dir())
- enforce_permissions = not config.is_dev_environment()
for directory in directories_to_ensure:
directory.mkdir(parents=True, exist_ok=True)
- if not enforce_permissions:
- continue
- # Some directories need custom permissions
- if directory == archives_dir:
- _enforce_archives_permissions(archives_dir)
- elif directory == unfinished_dir:
- _enforce_unfinished_permissions(unfinished_dir)
- else:
- util.chmod_directories(directory, permissions=0o755)
+ # Permission enforcement is now handled by the daily maintenance
task
+ # (see atr/tasks/maintenance.py :: _storage_maintenance)
+ # This avoids blocking server startup for 10+ seconds on large
state dirs
````
#### `atr/tasks/maintenance.py`
Implement permission enforcement in the daily maintenance task, replacing
the empty stub.
````diff
--- a/atr/tasks/maintenance.py
+++ b/atr/tasks/maintenance.py
@@ -16,11 +16,16 @@
import datetime
+import os
+import pathlib
+import stat
import time
from typing import Final
import sqlmodel
import atr.db as db
+import atr.config as config
import atr.log as log
import atr.models.args as args
import atr.models.results as results
import atr.models.sql as sql
+import atr.paths as paths
import atr.tasks as tasks
import atr.tasks.checks as checks
+import atr.util as util
_EXPIRED_TOKEN_RETENTION_DAYS: Final[int] = 30
_FORM_ERROR_TTL_SECONDS: Final[int] = 24 * 60 * 60
@@ -88,5 +93,62 @@
await data.commit()
-async def _storage_maintenance():
- pass
+async def _storage_maintenance() -> None:
+ """Enforce file and directory permissions on storage directories."""
+ if config.is_dev_environment():
+ return
+
+ import asyncio
+
+ archives_dir = pathlib.Path(paths.get_archives_dir())
+ unfinished_dir = pathlib.Path(paths.get_unfinished_dir())
+
+ await asyncio.to_thread(_enforce_archives_permissions, archives_dir)
+ await asyncio.to_thread(_enforce_unfinished_permissions, unfinished_dir)
+
+
+def _enforce_archives_permissions(archives_dir: pathlib.Path) -> None:
+ if not archives_dir.exists():
+ return
+ fixed_files = 0
+ fixed_dirs = 0
+
+ # Set ancestor directories of archive files to 755
+ for dirpath, _, _ in os.walk(archives_dir, topdown=True):
+ path = pathlib.Path(dirpath)
+ depth = len(path.relative_to(archives_dir).parts)
+ if depth < 3:
+ os.chmod(path, 0o755)
+
+ # Set archive files to 444
+ for file_path in archives_dir.rglob("*"):
+ if not file_path.is_file():
+ continue
+ depth = len(file_path.relative_to(archives_dir).parts)
+ if (depth >= 3) and (stat.S_IMODE(file_path.stat().st_mode) !=
0o444):
+ os.chmod(file_path, 0o444)
+ fixed_files += 1
+
+ # Set archive directories to 555
+ for dirpath, _, _ in os.walk(archives_dir, topdown=False):
+ path = pathlib.Path(dirpath)
+ depth = len(path.relative_to(archives_dir).parts)
+ if (depth >= 3) and (stat.S_IMODE(path.stat().st_mode) != 0o555):
+ os.chmod(path, 0o555)
+ fixed_dirs += 1
+
+ if (fixed_files > 0) or (fixed_dirs > 0):
+ log.info(f"Fixed archive permissions: {fixed_files} files to 0o444,
{fixed_dirs} directories to 0o555")
+
+
+def _enforce_unfinished_permissions(unfinished_dir: pathlib.Path) -> None:
+ if not unfinished_dir.exists():
+ return
+
+ # Set ancestor directories of revisions to 755
+ for dirpath, _dirnames, _filenames in os.walk(unfinished_dir,
topdown=True):
+ path = pathlib.Path(dirpath)
+ depth = len(path.relative_to(unfinished_dir).parts)
+ if depth < 3:
+ os.chmod(path, 0o755)
+
+ # Set revision directories and their descendants to 555
+ for dirpath, _dirnames, _filenames in os.walk(unfinished_dir,
topdown=False):
+ path = pathlib.Path(dirpath)
+ depth = len(path.relative_to(unfinished_dir).parts)
+ if depth >= 3:
+ os.chmod(path, 0o555)
````
### Open questions
- Should the permission enforcement also run at startup (once, in
background) via `before_serving` to ensure security posture immediately, or is
daily maintenance sufficient?
- Should `_enforce_archives_permissions` and
`_enforce_unfinished_permissions` be removed from `atr/server.py` entirely, or
kept for potential manual/admin invocation?
- The `util.chmod_directories(directory, permissions=0o755)` call for
standard directories (audit, cache, logs, etc.) is likely fast — should it
remain in startup or also be deferred?
- Is there additional work needed for the cyclonedx import chain deferral
that @alitheg mentioned, or was that already committed?
### Files examined
- `atr/server.py`
- `atr/manager.py`
- `atr/worker.py`
- `scripts/check-perms`
- `start-atr.sh`
- `start-dev.sh`
- `atr/tasks/__init__.py`
- `atr/tasks/maintenance.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]