potiuk commented on issue #24:
URL: https://github.com/apache/tooling-agents/issues/24#issuecomment-4399547701

   Thanks @andrewmusselman — went through the L3 task-sdk report at `95bbf6a` 
(122 findings). Triage below, same structure as the airflow-core L3 thread.
   
   The auditor noted AGENTS.md context wasn't loaded for this run, which shows 
in the volume — many of the by-design categories I flagged in the L1 triage 
above re-surfaced under new IDs.
   
   ## Theme overlap with prior L1 triage (no further action)
   
   | Theme | Old L1 ID | New L3 ID | Status |
   |---|---|---|---|
   | `_match_regexp` p.match → fullmatch | old F-005 | not in new report | ✅ 
merged in apache/airflow#66499 |
   | HTTP scheme on `base_url` (admin-configured) | old F-001 | F-003 | by 
design |
   | `_NullFernet` silent dev fallback | old F-004 | F-018, F-019 | documented; 
warning already emitted |
   | IPC frame size — same trust domain | old F-006 | F-008 (DoS), F-010 
(string fields), F-027 (rate limit), F-028 (FD exhaustion) | by design (worker 
is the trust domain) |
   | Worker / DAG code execution boundary | old F-011 | F-024, F-025 | by 
design (DAG authors execute arbitrary Python by definition) |
   | Terminal state guards / 401-403 backoff | old F-008/009/014/015 | F-006, 
F-007 (specific scenarios — see below) | partially extends prior triage |
   | Multi-team task-level isolation | old F-012/016 | F-014, F-016, F-040, 
F-053 | known limitation per security model |
   | URL encoding in Execution API path construction | old F-007 | F-039, F-046 
| hygiene only — admin-configured `allowed_run_id_pattern` |
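For anyone skimming: the `_match_regexp` fix merged in apache/airflow#66499 is the classic match-vs-fullmatch pitfall. A minimal standalone illustration (the pattern below is hypothetical, not Airflow's actual `allowed_run_id_pattern`):

```python
import re

# Illustrative pattern only; not the real allowed_run_id_pattern.
pattern = re.compile(r"[A-Za-z0-9_.-]+")

run_id = "manual__2024-01-01/../secrets"

# match() anchors only at the start, so a valid-looking prefix is enough:
assert pattern.match(run_id) is not None      # accepted, which is unsafe
# fullmatch() requires the entire string to conform:
assert pattern.fullmatch(run_id) is None      # rejected, as intended
```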
   
   ## New actionable findings — investigating / fixing
   
   | # | Finding | Verdict | Plan |
   |---|---|---|---|
   | **F-005** | `httpx.ConnectError`/`TimeoutException` not caught by `except 
ServerResponseError` in `supervisor.py:754` — generator crashes, IPC channel 
permanently broken | **Real reliability bug.** Network exceptions during a 
transient API outage shouldn't kill the worker's IPC channel. | Add catch-all 
in `handle_requests` returning `ErrorResponse` |
   | **F-006** | Failed task marked SUCCESS when terminal-state IPC `send` 
fails in `task_runner.py` `finally` block — process exits 0, supervisor's 
`final_state` returns `SUCCESS` | **Real correctness bug** (downstream pipeline 
runs on bad upstream). | `sys.exit(1)` on send failure |
   | **F-007** | `_terminal_state` set BEFORE `client.task_instances.succeed()` 
succeeds; on transient API failure task gets stuck RUNNING on server forever | 
**Real reliability bug** affecting all `STATES_SENT_DIRECTLY`. | Set 
`_terminal_state` only after API call succeeds, or implement retry-on-failure |
   | **F-004** | Remote-log-handler load failures silently return; operators 
have no signal that logs aren't reaching the remote system | **Logging 
blind-spot fix.** Small structured-warning addition. | Add 
`log.warning("remote_log_handler_unavailable", …)` |
   | **F-017** | Both `get_connection()` and `get_variable()` treat all 
`ErrorResponse` types identically; an authz-denied response falls through to 
`EnvironmentVariablesBackend` which doesn't check authz, defeating the explicit 
deny | **Real authz fallthrough bug.** Distinguish "not found" from "denied" in 
the response and raise on denied to prevent backend fallback. | Differentiate 
error types + raise on authz failure |
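To make the F-017 fix concrete, here is a minimal sketch of the fallthrough guard. The `ErrorType` enum, `AccessDeniedError`, and the backend callables are hypothetical stand-ins for the real SDK types, not the actual API:

```python
from enum import Enum


class ErrorType(Enum):
    """Hypothetical error discriminator returned by the API backend."""
    NOT_FOUND = "not_found"
    ACCESS_DENIED = "access_denied"


class AccessDeniedError(RuntimeError):
    """Raised instead of falling through to a backend with no authz check."""


def get_connection(conn_id, api_backend, env_backend):
    """Fall through to the environment backend only on NOT_FOUND.

    An explicit deny must raise, never be retried against
    EnvironmentVariablesBackend, which performs no authz check.
    """
    result = api_backend(conn_id)
    if isinstance(result, ErrorType):
        if result is ErrorType.ACCESS_DENIED:
            raise AccessDeniedError(f"access to connection {conn_id!r} denied")
        return env_backend(conn_id)   # NOT_FOUND: legitimate fallback
    return result
```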
   
   I'll open separate PRs for each and link them back to this comment.
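For reviewers of those PRs, a rough shape the F-005 and F-006/F-007 fixes could take. All names here (`handle_requests`, `ErrorResponse`, `report_terminal_state`) are illustrative stand-ins, not the actual supervisor or task-runner internals:

```python
import sys


class ErrorResponse:
    """Minimal error envelope returned over IPC (illustrative)."""

    def __init__(self, detail: str):
        self.detail = detail


def handle_requests(requests, api_call):
    """F-005: any exception from the API client (httpx.ConnectError,
    TimeoutException, ...) becomes an ErrorResponse instead of escaping
    and permanently killing the IPC request loop."""
    out = []
    for req in requests:
        try:
            out.append(api_call(req))
        except Exception as exc:
            out.append(ErrorResponse(f"{type(exc).__name__}: {exc}"))
    return out


def report_terminal_state(send, state: str) -> str:
    """F-006/F-007: only treat the state as recorded after the send
    succeeded, and exit non-zero on failure so a 0 exit code can never
    be misread as SUCCESS by the supervisor."""
    try:
        send(state)
    except Exception:
        sys.exit(1)   # a failed send must not look like a clean exit
    return state      # record _terminal_state only after the call succeeds
```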
   
   ## Acknowledged but not actionable
   
   | # | Finding | Why |
   |---|---|---|
   | F-001 | `NativeEnvironment` bypasses Jinja sandbox when 
`render_template_as_native_obj=True` | DAG authors execute arbitrary Python by 
design — they don't need a sandbox bypass to run dangerous code. The "false 
confidence" claim about `is_safe_attribute()` becoming dead code in the native 
branch is fair, but the sandbox itself is a defense against rogue *templates*, 
not rogue *DAG authors*. The audit's recommended `SandboxedNativeEnvironment` 
mixin would change rendering semantics in user-visible ways and is worth an 
AIP-style design discussion, not a quiet code change. |
   | F-002 | No crypto agility — Fernet (AES-128-CBC + HMAC-SHA256) hardcoded | 
AES-128 is not deprecated — it remains a NIST-approved algorithm. Fernet 
provides authenticated encryption + a versioning byte. Adding a configurable 
encryption-backend abstraction is a substantial migration with data-format 
compatibility implications across providers and existing encrypted secrets; 
would be tracked as an AIP, not a defect fix. |
   | F-008, F-027, F-028 | Unbounded IPC sizes / no rate limit between 
supervisor and task subprocess | Both endpoints live in the same worker 
process; a misbehaving task already has full code execution. The IPC channel is 
not a privilege boundary in our model. (Same rationale as L1 F-006.) |
   | F-011, F-012, F-013 | Private-attribute access via `is_safe_attribute`, 
`jinja_environment_kwargs` overrides, `from_string()` no validation | Jinja 
sandbox hardening for DAG author code — same threat model as F-001. By design. |
   | F-026, F-029 | Unbounded polling / no quota on XCom/Variable writes | 
Tasks can already do anything that costs resources in the cluster. |
   
   ## By design — per Airflow's security model
   
   The remaining ~95 findings cluster into these categories:
   
   | Category | Findings | Why |
   |---|---|---|
   | Documentation deliverables required by ASVS L3 | F-020, F-021, F-023, 
F-030, F-031, F-032, F-033, F-034, F-037, F-041, F-052, F-054, F-058, F-064, 
F-066, F-068, F-074, F-077, F-082, F-085, F-093, F-095 | Inventory / 
classification / rotation-schedule docs that L3 asks for. Worth filing 
individually as doc-improvement issues if any are blocking your audit; not 
security defects. |
   | Multi-team task-level isolation | F-014, F-016, F-040, F-053 | Documented 
known limitation in the security model. |
   | Logging architecture (encoding, "who" metadata, separation of audit vs 
event logs) | F-015, F-035, F-036, F-038, F-070, F-071, F-073 | Substantive 
improvements but deployment-shape choices, not code-level defects. |
   | Rate limiting / DoS within worker trust domain | F-008, F-026, F-027, 
F-028, F-029 | Trust domain — same as L1 F-006 rationale. |
   | Crypto-policy / key-management documentation | F-018, F-019, F-020, F-021, 
F-042, F-043, F-079, F-094 | Same architectural rationale as F-002. |
   | Reverse-proxy / TLS / cipher / mTLS hardening | F-021, F-044, F-045, 
F-098, F-099, F-100, F-101, F-103 | Deployment-manager responsibility. |
   | Supply-chain / SBOM / dependency provenance | F-022, F-023, F-024, F-025, 
F-067 | Pip/SBOM tooling layer, not the runtime SDK. |
   | Premise factually wrong / requires Python ≥ 3.10 default | various 
TLS-version findings | Already covered in L1 triage. |
   
   ## Patterns worth feeding back to the scanner (same notes as L1)
   
   1. **Trust boundaries** — many findings re-assume task-subprocess ↔ 
supervisor is a privilege boundary. It isn't, in our model.
   2. **DAG authors execute arbitrary Python** — Jinja sandbox findings keep 
treating DAG authors as untrusted; they aren't.
   3. **Multi-team task-level isolation** — explicitly documented as "not 
enforced; tracked for future versions; should not be reported as new findings".
   4. **Distinguish "AIP-grade architecture change" from "code-level defect"** 
— F-001 (Jinja native) and F-002 (crypto agility) are the former, not the 
latter.
   
   Once AGENTS.md is in the scanner context for the next run, a lot of these should drop out. Happy to dig deeper on any specific one if my reasoning above looks off.
   
   ---
   This response was generated by AI and may contain mistakes. After you've had 
a chance to look at the PRs and respond, a real Airflow maintainer (human) will 
follow up.
   
   ---
   Drafted-by: Claude Opus 4.7 (1M context); reviewed by @potiuk before posting
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

