potiuk commented on code in PR #60148:
URL: https://github.com/apache/airflow/pull/60148#discussion_r2664860239


##########
ossfuzz/README.md:
##########
@@ -0,0 +1,70 @@
+# Airflow OSS-Fuzz fuzzers
+
+This directory contains the upstream-owned fuzz targets used by OSS-Fuzz for
+Apache Airflow.
+
+## Security Model Alignment
+
+These fuzzers target code paths with **clear security boundaries** per
+Airflow's [security model](../airflow-core/docs/security/security_model.rst):
+
+- **DAG Serialization/Deserialization**: Used by Scheduler and API Server with
+  schema validation. Input comes from DAG parsing and caching.
+- **Connection URI Parsing**: Used when creating/updating connections via API.
+
+We explicitly **avoid** fuzzing code paths in the "DAG author trust zone"
+where Airflow's policy is that DAG authors can execute arbitrary code.
+
+## What's here
+
+- `*_fuzz.py`: Atheris fuzz targets (packaged by OSS-Fuzz via `pyinstaller`).
+- `*.dict`: Optional libFuzzer dictionaries for structured inputs.
+- `*.options`: libFuzzer options (e.g. `max_len`) tuned per target.
+- `seed_corpus/<fuzzer>/...`: Small seed corpora that get zipped and uploaded to
+  OSS-Fuzz for each target.
+
+## Fuzzers
+
+| Fuzzer | Target | Security Boundary |
+|--------|--------|-------------------|
+| `serialized_dag_fuzz.py` | `DagSerialization.from_dict()` | Schema validation |
+| `connection_uri_fuzz.py` | `Connection._parse_from_uri()` | API input validation |
+
+## Supported engines / sanitizers (Python constraints)
+
+Airflow is fuzzed as a **Python** OSS-Fuzz project. Practically, this means:
+
+- **Fuzzing engine**: `libfuzzer` (Atheris). Other engines (AFL/honggfuzz) are
+  not typically used/supported for Python targets in OSS-Fuzz.
+- **Sanitizers**: `address`, `undefined`, `coverage`, `introspector` are the
+  relevant modes. **MSan (`memory`) is not supported** for Python OSS-Fuzz
+  projects.
+
+## Running locally with OSS-Fuzz helper
+
+From a checkout of `google/oss-fuzz`:
+
+```bash
+# Build + basic validation:
+python3 infra/helper.py build_fuzzers --clean --sanitizer address airflow /path/to/airflow
+python3 infra/helper.py check_build --sanitizer address airflow
+
+# Coverage build + validation:
+python3 infra/helper.py build_fuzzers --clean --sanitizer coverage airflow /path/to/airflow
+python3 infra/helper.py check_build --sanitizer coverage airflow
+```
+
+## Running locally without OSS-Fuzz

Review Comment:
   Could you please make it follow the standard `pyproject.toml` approach? For now this project somewhat depends on how local virtualenv packages are installed, whereas this fuzzer should be generally "auto-installable" and "auto-runnable" using modern Python tools.
   
   So:
   
   * it should be a separate Python distribution (similar to what we do in the `shared` dir, it should have "Do not upload" classifiers)
   * it should be added to our workspace (in the main `pyproject.toml`)
   * it should depend on both the `apache-airflow-core` and `apache-airflow-providers-standard` distributions
   * it should have `atheris>=NN` (some reasonable version) as a dependency
   * the end goal is that it should be runnable, including automated venv setup, with `uv run ./serialized_dag_fuzz -max_total_time=40`, and those should be the instructions for local execution
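   
   A rough sketch of what the points above could look like, to make the request concrete. The distribution name `apache-airflow-ossfuzz`, the version, and the `ossfuzz` member path are placeholders I am assuming for illustration; the classifier string is my assumption of the one used in `shared`, and `NN` is intentionally left unfilled:
   
   ```toml
   # ossfuzz/pyproject.toml (hypothetical; names and version are placeholders)
   [project]
   name = "apache-airflow-ossfuzz"
   version = "0.0"
   # Assumed to match the "Do not upload" classifier used by the `shared` distributions.
   classifiers = ["Private :: Do Not Upload"]
   dependencies = [
       "apache-airflow-core",
       "apache-airflow-providers-standard",
       "atheris>=NN",  # NN is a placeholder; pick some reasonable version
   ]
   
   # Resolve the Airflow distributions from the workspace rather than from PyPI.
   [tool.uv.sources]
   apache-airflow-core = { workspace = true }
   apache-airflow-providers-standard = { workspace = true }
   ```
   
   ```toml
   # Main pyproject.toml (fragment): register the new distribution in the workspace.
   [tool.uv.workspace]
   members = [
       # ... existing members ...
       "ossfuzz",
   ]
   ```
   
   With something along these lines, `uv run` should be able to create the venv and install the dependencies on its own before running the fuzzer.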



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
