naruto-lgtm commented on code in PR #68662:
URL: https://github.com/apache/airflow/pull/68662#discussion_r3442013741
##########
airflow-core/src/airflow/serialization/serialized_objects.py:
##########
@@ -658,14 +659,18 @@ def deserialize(cls, encoded_var: Any) -> Any:
args = deser["args"]
kwargs = deser["kwargs"]
del deser
+ # ``exc_cls_name`` comes from the serialized payload, so resolve
it without handing an
+ # attacker-controlled string to ``import_string`` -- that would
execute the named
+ # module's top-level code before any validation. The encode side
only ever emits
+ # ``airflow.*`` AirflowException subclasses for AIRFLOW_EXC_SER
and builtin
+ # KeyError/AttributeError for BASE_EXC_SER, so reject anything
outside those up front,
+ # mirroring the import-path guards in the
timetable/window/wait-policy decoders.
if type_ == DAT.AIRFLOW_EXC_SER:
+ if not exc_cls_name.startswith("airflow."):
Review Comment:
You're right, the prefix is no defense -- airflow.evil is trivial, and it's
worse than that: because the base AirflowException.serialize() is inherited,
the encode side accepts any AirflowException subclass, so the import path is
open-ended (provider and user-defined exceptions included). A prefix check
would both miss airflow.evil and break legitimate custom exceptions. I've
dropped that commit and reverted to just the post-import issubclass backstop.
More on the broader question in the PR thread.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]