potiuk commented on code in PR #60148:
URL: https://github.com/apache/airflow/pull/60148#discussion_r2669305470
##########
ossfuzz/README.md:
##########
@@ -0,0 +1,70 @@
+# Airflow OSS-Fuzz fuzzers
+
+This directory contains the upstream-owned fuzz targets used by OSS-Fuzz for
+Apache Airflow.
+
+## Security Model Alignment
+
+These fuzzers target code paths with **clear security boundaries** per
+Airflow's [security model](../airflow-core/docs/security/security_model.rst):
+
+- **DAG Serialization/Deserialization**: Used by Scheduler and API Server with
+ schema validation. Input comes from DAG parsing and caching.
+- **Connection URI Parsing**: Used when creating/updating connections via API.
+
+We explicitly **avoid** fuzzing code paths in the "DAG author trust zone"
+where Airflow's policy is that DAG authors can execute arbitrary code.
+
+## What's here
+
+- `*_fuzz.py`: Atheris fuzz targets (packaged by OSS-Fuzz via `pyinstaller`).
+- `*.dict`: Optional libFuzzer dictionaries for structured inputs.
+- `*.options`: libFuzzer options (e.g. `max_len`) tuned per target.
+- `seed_corpus/<fuzzer>/...`: Small seed corpora that get zipped and uploaded to
+ OSS-Fuzz for each target.
+
+## Fuzzers
+
+| Fuzzer | Target | Security Boundary |
+|--------|--------|-------------------|
+| `serialized_dag_fuzz.py` | `DagSerialization.from_dict()` | Schema validation |
+| `connection_uri_fuzz.py` | `Connection._parse_from_uri()` | API input validation |
+
+## Supported engines / sanitizers (Python constraints)
+
+Airflow is fuzzed as a **Python** OSS-Fuzz project. Practically, this means:
+
+- **Fuzzing engine**: `libfuzzer` (Atheris). Other engines (AFL/honggfuzz) are
+ not typically used/supported for Python targets in OSS-Fuzz.
+- **Sanitizers**: `address`, `undefined`, `coverage`, `introspector` are the
+ relevant modes. **MSan (`memory`) is not supported** for Python OSS-Fuzz
+ projects.
+
+## Running locally with OSS-Fuzz helper
+
+From a checkout of `google/oss-fuzz`:
+
+```bash
+# Build + basic validation:
+python3 infra/helper.py build_fuzzers --clean --sanitizer address airflow /path/to/airflow
+python3 infra/helper.py check_build --sanitizer address airflow
+
+# Coverage build + validation:
+python3 infra/helper.py build_fuzzers --clean --sanitizer coverage airflow /path/to/airflow
+python3 infra/helper.py check_build --sanitizer coverage airflow
+```
+
+## Running locally without OSS-Fuzz
Review Comment:
That's the error:
```
#47 30.13 × Failed to build `atheris==3.0.0`
#47 30.13 ├─▶ The build backend returned an error
#47 30.13 ╰─▶ Call to `setuptools.build_meta:__legacy__.build_wheel` failed (exit
#47 30.13 status: 1)
#47 30.13
#47 30.13 [stdout]
#47 30.13 running bdist_wheel
#47 30.13 running build
#47 30.13 running build_py
#47 30.13 creating build/lib.linux-x86_64-cpython-310
#47 30.13 copying atheris_no_libfuzzer.py -> build/lib.linux-x86_64-cpython-310
#47 30.13 creating build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/version_dependent.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/instrument_bytecode.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/custom_mutator_and_crossover_fuzz_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/function_hooks.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/regex_match_generation_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/str_hook_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/instrument_bytecode_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/custom_crossover_fuzz_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/regex_coverage_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/fuzzed_data_provider_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/custom_mutator_fuzz_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/pyinstaller_coverage_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/fuzz_test.py -> build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/utils.py -> build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/hook-atheris.py -> build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/fuzz_test_lib.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/import_hook.py -> build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/instrument_all_test.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 copying src/__init__.py -> build/lib.linux-x86_64-cpython-310/atheris
#47 30.13 creating build/lib.linux-x86_64-cpython-310/atheris/mock_libfuzzer
#47 30.13 copying src/mock_libfuzzer/mockutils.py ->
#47 30.13 build/lib.linux-x86_64-cpython-310/atheris/mock_libfuzzer
#47 30.13 running build_ext
#47 30.13
#47 30.13 [stderr]
#47 30.13 /root/.cache/uv/sdists-v9/pypi/atheris/3.0.0/pabmFBgV1TmQhP9YD0IsY/src/setup_utils/find_libfuzzer.sh:
#47 30.13 line 44: clang: command not found
#47 30.13 Failed to find libFuzzer; set either $CLANG_BIN to point to your Clang
#47 30.13 binary, or $LIBFUZZER_LIB to point directly to your libFuzzer .a file.
#47 30.13 If needed, download and build the latest version of Clang:
#47 30.13 git clone --depth=1 https://github.com/llvm/llvm-project.git
#47 30.13 cd llvm-project
#47 30.13 cmake -DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -G "Unix Makefiles"
#47 30.13 -S llvm -B build
#47 30.13 NPROC=$(sysctl -n hw.logicalcpu 2>/dev/null || nproc)
#47 30.13 cmake --build build --parallel $NPROC # This step is very slow.
#47 30.13 Then, set $CLANG_BIN="$(pwd)/bin/clang" and run pip again.
#47 30.13 You should use this same Clang for building any Python extensions you
#47 30.13 plan to fuzz.
```
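
The relevant part of the log is the `[stderr]` section: `clang` is not on `PATH` in the image, so the `atheris==3.0.0` source build cannot locate libFuzzer. A minimal pre-flight sketch in Python, assuming only what the error message itself says (that `$CLANG_BIN` or `$LIBFUZZER_LIB` is consulted, with `clang` on `PATH` as the fallback). The function name `libfuzzer_build_hint` is hypothetical and this is not a copy of the logic in Atheris' `find_libfuzzer.sh`:

```python
import os
import shutil


def libfuzzer_build_hint(env=None, which=shutil.which):
    """Return None if a Clang/libFuzzer location appears to be configured,
    otherwise a short hint string.

    Assumption: mirrors the wording of the Atheris error message only.
    Finding a clang binary does not guarantee that it ships libFuzzer.
    """
    env = os.environ if env is None else env

    # An explicit libFuzzer archive is one of the two options the log names.
    if env.get("LIBFUZZER_LIB"):
        return None

    # Otherwise look for the binary named by $CLANG_BIN, falling back to
    # plain `clang` on PATH.
    clang = env.get("CLANG_BIN") or "clang"
    if which(clang):
        return None

    return (
        f"'{clang}' was not found; install Clang or set CLANG_BIN / "
        "LIBFUZZER_LIB before building atheris from source"
    )
```

Calling `libfuzzer_build_hint()` before the dependency install step would surface the missing compiler up front instead of deep inside the wheel build.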