Hi,

thanks from me as well for the proposal. Fuzzing was also brought to my attention in a security seminar. So far I have seen it to be useful, but it also takes a lot of effort. It is clearly not a drop-in, test-something, all-good solution. So I am fully on board with what Amogh said.

That said, I see real benefit if a good concept for the introduction is worked out, so I am looking forward to a proposal (e.g. start by fuzzing the serialization code).

Jens

On 12/19/25 08:15, Amogh Desai wrote:
Thanks for bringing this to the devlist, Leslie.

I guess even before proceeding to questions about technical or
implementation details, we should spend some time analyzing this from
a broader perspective.

As I see it, it may offer a few benefits, such as:
1. Finding security issues earlier in the cycle
2. Complementing existing tests to cover more breadth

But there are a few other things that should be considered here, such as:
1. Where do these tests run? How long would they take to run? Any special
needs? Cadence?
2. I see an initial maintenance burden too. Who will own and maintain it?
Who will triage the reports (false positives, duplicates, low-priority
bugs)?
3. Airflow assumes trusted users, so some findings from the fuzzer might
not be exploitable at all, but would still cost time to triage.
4. A DAG runs user code at the end of the day, so the fuzzer may find
issues in user code instead. Can we control that?
5. Our ecosystem of tons of providers may require us to spend significant
initial time to cover that surface area and later maintain it.

I would be more interested if these questions can be answered before
starting off with a PR.


Thanks & Regards,
Amogh Desai


On Fri, Dec 19, 2025 at 10:58 AM Leslie P. Polzer <[email protected]>
wrote:

Hi all,

I'd like to propose adding an upstream-owned OSS-Fuzz
fuzzer suite to Apache Airflow to improve the project's
security testing coverage.

Background:
Fuzzing is an effective technique for discovering bugs,
crashes, and potential security vulnerabilities by
automatically generating and testing malformed/unexpected
inputs. OSS-Fuzz is Google's continuous fuzzing service
for open source projects.

Proposed fuzz targets (using Atheris):
- DAG construction
- DAG serialization/deserialization
- Connection URI parsing
- YAML parsing
- Cron timetable parsing
- Params JSON schema validation
- API request body parsing/validation (connections,
   variables, trigger DAG run)
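
As a rough illustration of the shape such a target could take, here is a
minimal harness sketch. It is not taken from the PR. The function
parse_uri is a hypothetical stand-in built on the standard library's
urlsplit, not Airflow's actual Connection URI parsing, and the names
fuzz_one_input and run_fuzzer are likewise made up for this example.
Atheris is imported optionally so the harness logic can be exercised
without it.

```python
import sys
from urllib.parse import urlsplit

try:
    import atheris  # only needed to actually run the fuzzer
except ImportError:
    atheris = None


def parse_uri(text: str) -> dict:
    """Hypothetical stand-in for the parsing code under test."""
    parts = urlsplit(text)
    # Accessing .port raises ValueError for non-numeric or out-of-range ports.
    return {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}


def fuzz_one_input(data: bytes) -> None:
    """Entry point called once per generated input."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # not valid text, nothing interesting to parse
    try:
        parse_uri(text)
    except ValueError:
        # Assumed to be the documented rejection path for malformed input.
        # Any other exception propagates and is reported as a finding.
        return


def run_fuzzer() -> None:
    """Hand control to Atheris/libFuzzer (does not return normally)."""
    if atheris is None:
        raise SystemExit("atheris is not installed")
    atheris.Setup(sys.argv, fuzz_one_input)
    atheris.Fuzz()
```

Which exceptions count as expected is the main design decision per
target: everything caught in fuzz_one_input is treated as a normal
rejection, and anything else surfaces as a crash that needs triage.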

Each fuzzer would include tuned .options files (input
size limits) and small seed corpora. Structured targets
would also include .dict files.

Technical notes:
Since Airflow is a Python project, libFuzzer/Atheris is
the supported engine in OSS-Fuzz. MSan and alternate
engines are not applicable for Python targets.

I've prepared an implementation at:
https://github.com/apache/airflow/pull/59589

I would appreciate feedback on the approach before
proceeding further.

Questions for the community:
1. Is there interest in integrating continuous fuzzing
    into Airflow?
2. Are there other critical parsing/validation paths
    that should be prioritized for fuzzing?
3. Any concerns about the proposed directory structure
    (ossfuzz/)?

Thanks for your time and feedback.

Best,

Leslie
