naruto-lgtm commented on code in PR #68662:
URL: https://github.com/apache/airflow/pull/68662#discussion_r3437029022
##########
airflow-core/src/airflow/serialization/serialized_objects.py:
##########
@@ -662,6 +662,12 @@ def deserialize(cls, encoded_var: Any) -> Any:
exc_cls = import_string(exc_cls_name)
else:
exc_cls = import_string(f"builtins.{exc_cls_name}")
+ # ``exc_cls_name`` comes from the serialized payload. A tampered
blob can name any
+ # importable callable (``os.system``, ``builtins.exec``, ...)
which would then be
+ # invoked with attacker-supplied args. Only accept genuine
exception types, matching
+ # the import-path guards the timetable/window/wait-policy decoders
already apply.
+ if not (isinstance(exc_cls, type) and issubclass(exc_cls,
BaseException)):
Review Comment:
Good points, both fair.
On the legit case: the encode side is narrow. `AIRFLOW_EXC_SER` only fires
for `AirflowException` subclasses with a `serialize()` method (e.g.
`AirflowRescheduleException` round-tripping through a serialized sensor
reschedule), and the name is always the full `airflow.*` import path.
`BASE_EXC_SER` only fires for `KeyError`/`AttributeError`, encoded as a bare
builtin name. So nothing legitimate ever names a non-`airflow.` module or a
non-exception.
On the after-import placement: you're right, validating post-import only
really covers the builtins branch, and the `AIRFLOW_EXC_SER` branch still
imports the module (running its top-level code) before the check. I moved the
guard ahead of the import to match what `decode_window`/`decode_wait_policy`
do: reject anything not under `airflow.` up front, and resolve the builtins
branch via `getattr(builtins, name)` instead of `import_string`, so no
attacker-named module is imported at all. The `issubclass(..., BaseException)`
check stays as a backstop.
One tradeoff worth calling out: the `airflow.` prefix means a user-defined
`AirflowException` subclass living outside the `airflow` namespace would no
longer round-trip. That mirrors how the timetable/window/wait-policy decoders
already treat non-core paths (core-only unless registered), so it seemed like
the consistent line to draw, but happy to loosen it if you'd rather keep custom
exceptions working.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]