Yep. The serializer is one place where fuzzing seems very appropriate - it
falls squarely into the 'pure' parts of Airflow, which have few side effects
(except some caching) and clearly defined inputs and outputs.
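
To make this concrete, here is a minimal sketch of what an Atheris harness
for a serializer round-trip could look like. The module path
(airflow.serialization.serde) and the round-trip property are assumptions on
my side and would need checking against the actual serde API:

    # fuzz_serde.py - minimal sketch, not a final harness
    import sys

    import atheris

    with atheris.instrument_imports():
        # Assumed entry points - adjust to the real serializer API.
        from airflow.serialization.serde import deserialize, serialize


    def TestOneInput(data: bytes) -> None:
        fdp = atheris.FuzzedDataProvider(data)
        value = fdp.ConsumeUnicodeNoSurrogates(1024)
        try:
            restored = deserialize(serialize(value))
        except ValueError:
            return  # rejecting bad input is fine; crashing is not
        assert restored == value  # round-trip property


    if __name__ == "__main__":
        atheris.Setup(sys.argv, TestOneInput)
        atheris.Fuzz()

Running it directly (python fuzz_serde.py) starts the libFuzzer loop locally;
OSS-Fuzz would run the same harness continuously.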

On Mon, Dec 22, 2025, 07:19 Amogh Desai <[email protected]> wrote:

> Thanks for addressing my questions, Leslie.
>
> I was sceptical about running it on Google infra, but this point from Jarek:
>
> > Also, I know other ASF projects already rely on OSS-Fuzz by Google, so
> > there are no objections to using the tool from the ASF point of view -
> > and it would definitely make it easier to start.
>
> clarifies that and I am not so sceptical anymore.
>
> I guess starting small here would be a good idea. Best to explore a *small
> area* of Airflow where the fuzzer's reports are easy to grok and which
> would give us a hint about whether it's going to be beneficial.
>
> Serialization is a good area, yes, but it's vast and would require serious
> prep time, so I'd like to propose something smaller: *Connection URI
> parsing*. It's a pretty stable interface, but I wouldn't be surprised if
> some cases are not handled well there.
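>
> To sketch it (hypothetically - assuming Atheris and the
> Connection(uri=...) constructor; the set of "expected" exceptions would
> need tuning):
>
>     # fuzz_connection_uri.py - hypothetical sketch
>     import sys
>
>     import atheris
>
>     with atheris.instrument_imports():
>         from airflow.models.connection import Connection
>
>
>     def TestOneInput(data: bytes) -> None:
>         fdp = atheris.FuzzedDataProvider(data)
>         uri = fdp.ConsumeUnicodeNoSurrogates(512)
>         try:
>             Connection(uri=uri)
>         except ValueError:
>             pass  # malformed URIs may be rejected; anything else is a finding
>
>
>     if __name__ == "__main__":
>         atheris.Setup(sys.argv, TestOneInput)
>         atheris.Fuzz()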
>
> Thanks & Regards,
> Amogh Desai
>
>
> On Mon, Dec 22, 2025 at 7:22 AM Leslie P. Polzer <[email protected]>
> wrote:
>
> > Hi everyone,
> >
> > On Sun, Dec 21, 2025, at 2:13 AM, Jarek Potiuk wrote:
> > > I think it's definitely worth trying. I saw a number of reports from
> > > fuzzing in other ASF projects - and they are sometimes useful and
> > > detect real issues.
> > >
> > > I think it would also be great to treat it as a learning exercise -
> > > getting smaller PRs that gradually add fuzzers, from the most obvious
> > > cases to the more complex ones. Currently it's hard for us to imagine
> > > what such fuzzing could look like for Airflow, and I think we would
> > > love to learn.
> >
> > Good point. Let me prepare a minimal PR for gradual introduction of
> > fuzzing into the codebase. How about starting with the serializer?
> >
> > Leslie
> >
> >
> > >
> > > I can easily imagine it for somewhat "lower-level" tools - libraries
> > > that operate on well-defined inputs and produce outputs by processing
> > > those inputs via a CLI or library call. Kind of "pure functions" -
> > > which do not start with state and do not produce state side-effects.
> > >
> > > Airflow is more of a "living organism" with a lot of state - both to
> > > begin with and as it gets updated by various inputs. So I have no good
> > > intuition on what such fuzzing could look like - but if an expert
> > > comes and proposes something, we can discuss it, give our opinion on
> > > whether it makes sense, and learn how to - possibly - add more fuzzing
> > > on our own.
> > >
> > > Also, I know other ASF projects already rely on OSS-Fuzz by Google, so
> > > there are no objections to using the tool from the ASF point of view -
> > > and it would definitely make it easier to start.
> > >
> > > One small thing that I see as a potential blocker: if we start seeing
> > > a lot of false positives, such fuzzing might become useless -
> > > especially if we have a hard time analysing and understanding the
> > > fuzzing reports. But if we start small and include a learning path for
> > > us, I am quite sure we can mitigate it.
> > >
> > > J.
> > >
> > >
> > > On Fri, Dec 19, 2025 at 9:59 AM Leslie P. Polzer <[email protected]>
> > > wrote:
> > >
> > > > Thanks for the thoughtful questions, Amogh. These are exactly the
> > > > right things to consider before committing resources. Let me address
> > > > each one:
> > > >
> > > > > 1. Where do these tests run? How long would it take to run? Any
> > > > > special needs? Cadence?
> > > >
> > > > The proposal is to integrate with **OSS-Fuzz**, Google's continuous
> > > > fuzzing infrastructure for open source projects.
> > > >
> > > > This means:
> > > >
> > > > - Tests run on Google's infrastructure at no cost to the project
> > > > - Fuzzing runs continuously 24/7, not blocking CI
> > > > - No special hardware or infrastructure needs from our side
> > > >
> > > > Optionally, fuzzers can run locally or in existing CI as quick sanity
> > > > checks (seconds to minutes), while deep fuzzing happens
> > > > asynchronously on OSS-Fuzz.
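> > > >
> > > > As a quick local sanity check, each harness can be run for a fixed
> > > > budget using the libFuzzer flags Atheris passes through, e.g. (the
> > > > harness name here is illustrative):
> > > >
> > > >     python fuzz_connection_uri.py -max_total_time=60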
> > > >
> > > > > 2. I see an initial maintenance burden too - who will own it /
> > > > > maintain it? Who will triage the reports? (false positives,
> > > > > duplicates, low priority bugs)
> > > >
> > > > Once integrated, OSS-Fuzz operates autonomously. We have full control
> > > > over how findings are handled:
> > > >
> > > > - Bugs are reported to the **OSS-Fuzz dashboard**, not directly to
> > > >   our issue tracker
> > > > - We can **enable or disable** automatic GitHub issue creation
> > > > - Findings are private for 90 days, then become public if unfixed
> > > >
> > > > That 90-day window does create some pressure to address findings -
> > > > but the alternative is worse. These bugs exist whether or not we're
> > > > fuzzing. External researchers or attackers finding them first gives
> > > > us zero lead time. OSS-Fuzz guarantees we hear about them first,
> > > > with 90 days to respond privately.
> > > >
> > > > I'll handle the **initial integration work** - writing the fuzzers,
> > > > setting up the OSS-Fuzz project config, verifying it runs. After
> > > > that, maintenance is minimal; fuzzers rarely need updates unless the
> > > > APIs they target change significantly.
> > > >
> > > > > 3. Airflow assumes trusted users, so some findings through the
> > > > > fuzzer might not be exploitable at all, but would lead to time
> > > > > spent triaging that.
> > > >
> > > > Fair point. We can handle this carefully by scoping fuzzers to target
> > > > code paths where the security boundaries are simple - input parsing,
> > > > serialization, external protocol handling - and exclude areas where
> > > > Airflow's trusted user model means findings wouldn't be actionable.
> > > >
> > > > > 4. A DAG runs user code at the end of the day; the fuzzer may
> > > > > find issues in user code instead? Can we control that?
> > > >
> > > > Fuzzers work like regression tests - they target Airflow's own code
> > > > paths, not user DAGs. Just as our test suite imports and exercises
> > > > specific modules directly, fuzzers do the same:
> > > >
> > > > - Input parsing and validation functions
> > > > - Serialization/deserialization (pickle, JSON, etc.)
> > > > - Command construction utilities
> > > > - Connection parameter handling
> > > >
> > > > No DAG is ever loaded or executed. The fuzzer imports a function,
> > > > feeds it crafted inputs, and checks for crashes - exactly like a
> > > > unit test, just with generated inputs instead of handwritten ones.
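> > > >
> > > > For a target taking several parameters, Atheris' FuzzedDataProvider
> > > > can split the raw bytes into structured values. Schematically (the
> > > > target below is a stand-in, not a real Airflow function):
> > > >
> > > >     import sys
> > > >
> > > >     import atheris
> > > >
> > > >
> > > >     def build_command(host, port):
> > > >         # Stand-in for a real command-construction utility.
> > > >         if "\x00" in host:
> > > >             raise ValueError("NUL byte in host")
> > > >         return ["ssh", "-p", str(port), host]
> > > >
> > > >
> > > >     def TestOneInput(data: bytes) -> None:
> > > >         fdp = atheris.FuzzedDataProvider(data)
> > > >         host = fdp.ConsumeUnicodeNoSurrogates(64)
> > > >         port = fdp.ConsumeIntInRange(0, 65535)
> > > >         try:
> > > >             build_command(host, port)
> > > >         except ValueError:
> > > >             pass  # expected rejection of malformed input
> > > >
> > > >
> > > >     if __name__ == "__main__":
> > > >         atheris.Setup(sys.argv, TestOneInput)
> > > >         atheris.Fuzz()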
> > > >
> > > > > 5. Our ecosystem of tons of providers may require us to spend
> > > > > significant initial time to cover that surface area and later
> > > > > maintain it
> > > >
> > > > Agreed, this is a large surface. The proposal is not to fuzz all providers
> > > > immediately. Instead:
> > > >
> > > > - **Phase 1:** Core Airflow only (serializers, API input handling,
> > > >   scheduler internals)
> > > > - **Phase 2:** High-risk providers with shell/exec patterns (SSH,
> > > >   Docker, Kubernetes, Teradata)
> > > > - **Phase 3:** Community-driven expansion as we see value
> > > >
> > > > This mirrors how other large projects (Kubernetes, Envoy) adopted
> > > > fuzzing: start narrow, prove value, expand organically.
> > > >
> > > > The bottom line: With OSS-Fuzz handling infrastructure, the upfront
> > > > cost is a small PR and minimal ongoing commitment. We get 90 days of
> > > > private lead time on any bugs found - far better than the zero days
> > > > we'd get if external researchers find them first. Happy to start with
> > > > a minimal proof-of-concept targeting just the serialization layer if
> > > > that helps demonstrate value.
> > > >
> > > > Best,
> > > >
> > > > Leslie
> > > >
> > >
> >
>
