Thanks for bringing this to the devlist, Leslie. Even before proceeding to questions about technical or implementation-related details, I think we should spend some time analyzing this from a broader perspective.

As I see it, it may offer a few benefits, such as:
1. Finding security issues earlier in the cycle
2. Complementing existing tests to cover more breadth

But there are a few other things that should be considered here, such as:
1. Where do these tests run? How long would they take to run? Any special needs? What cadence?
2. I see an initial maintenance burden too: who will own and maintain it? Who will triage the reports (false positives, duplicates, low-priority bugs)?
3. Airflow assumes trusted users, so some findings from the fuzzer might not be exploitable at all, but would still cost time to triage.
4. At the end of the day, a DAG runs user code, so the fuzzer may find issues in user code instead. Can we control that?
5. Our ecosystem of tons of providers may require us to spend significant initial time to cover that surface area and later maintain it.

I would be more interested if these questions can be answered before starting off with a PR.

Thanks & Regards,
Amogh Desai

On Fri, Dec 19, 2025 at 10:58 AM Leslie P. Polzer <[email protected]> wrote:

> Hi all,
>
> I'd like to propose adding an upstream-owned OSS-Fuzz
> fuzzer suite to Apache Airflow to improve the project's
> security testing coverage.
>
> Background:
> Fuzzing is an effective technique for discovering bugs,
> crashes, and potential security vulnerabilities by
> automatically generating and testing malformed/unexpected
> inputs. OSS-Fuzz is Google's continuous fuzzing service
> for open source projects.
>
> Proposed fuzz targets (using Atheris):
> - DAG construction
> - DAG serialization/deserialization
> - Connection URI parsing
> - YAML parsing
> - Cron timetable parsing
> - Params JSON schema validation
> - API request body parsing/validation (connections,
>   variables, trigger DAG run)
>
> Each fuzzer would include tuned .options files (input
> size limits) and small seed corpora. Structured targets
> would also include .dict files.
>
> Technical notes:
> Since Airflow is a Python project, libFuzzer/Atheris is
> the supported engine in OSS-Fuzz. MSan and alternate
> engines are not applicable for Python targets.
>
> I've prepared an implementation at:
> https://github.com/apache/airflow/pull/59589
>
> I would appreciate feedback on the approach before
> proceeding further.
>
> Questions for the community:
> 1. Is there interest in integrating continuous fuzzing
>    into Airflow?
> 2. Are there other critical parsing/validation paths
>    that should be prioritized for fuzzing?
> 3. Any concerns about the proposed directory structure
>    (ossfuzz/)?
>
> Thanks for your time and feedback.
>
> Best,
>
> Leslie
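
For readers new to Atheris, the general shape of one of the proposed targets (Connection URI parsing) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in the PR: `parse_uri` is a hypothetical stand-in built on the standard library's `urlsplit` rather than Airflow's real parser, and `run_fuzzer` assumes the `atheris` package is installed. Treating `ValueError` as an expected rejection is one possible way to keep documented failures out of the triage queue.

```python
import sys
from urllib.parse import urlsplit


def parse_uri(text):
    """Hypothetical stand-in for the parser under test (not Airflow code)."""
    parts = urlsplit(text)
    # Accessing .port validates the port and may raise ValueError.
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: the engine calls this once per generated input."""
    # Drop undecodable bytes so the parser always receives a str.
    text = data.decode("utf-8", errors="ignore")
    try:
        parse_uri(text)
    except ValueError:
        # Assumption: ValueError is the documented way to reject bad input,
        # so it is not reported as a finding. Anything else propagates.
        return


def run_fuzzer() -> None:
    """Hand the entry point to Atheris (assumes `pip install atheris`)."""
    import atheris  # imported lazily so the harness is testable without it

    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

Calling `run_fuzzer()` from a script's entry point would start the fuzzing loop. `TestOneInput` can also be called directly with bytes from an ordinary unit test, which is a cheap way to replay a reported input.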
