> I agree that we should only publish VEXes for the currently supported
> version (i.e. usually the most recent one).

Do you know if there is a way to "unpublish" VEXes (mark them as no longer
maintained)?
Because that is effectively what will happen: we will have to start
publishing new VEXes for the latest version and make sure that all the
consumers of the previous VEXes know they are not maintained any more.

> Since looking through somebody else's code is error-prone and
> time-consuming, I would restrict the generation of VEX statements to the
> following two situations:
>
> 1. The CVE is in a dependency that is only used by your project, not
> your project's dependencies. In this case you can directly read the CVE
> (or VDR document) and identify if you use the affected feature in your
> code.
>

That works as long as they are published in a way that lets us discover
them automatically - which is not the case currently, but hopefully the
transparency exchange API will make it possible. I guess there will also be
a long tail of projects that do not have VEXes, VDRs or SBOMs for years to
come, so the question is: what do we do with those dependencies? I am afraid
we are a few years away from even a small percentage of our deps supporting
this. And if our users, or regulators, ask us - what do we say? We need to
have a good answer prepared.


> 2. The CVE is in a transitive dependency. In this case you wait until
> all your **direct** dependencies potentially affected publish a VEX
> statement. Only then you publish a statement based on what your
> dependencies say in their VEX records.
>

Same here - we need some discoverability, and a way to say "no idea" when
automated discovery does not work or the dependency does not publish
those.


>
> Note that the utility of VEXes varies greatly between ecosystems. In the
> case of Python applications users download the application and its
> dependencies directly from PyPI. Unless I am mistaken, if there is a
> vulnerability in one of your dependencies, you don't need to make a new
> Airflow release, users can easily upgrade their system themselves.
> VEX-es might be only useful to those that create an Airflow Docker image
> to know if they need to regenerate it. They will probably do it anyway.
>

It's not that simple - it very much depends. All variations are possible,
and that's what makes it complex, because you sometimes need to perform a
deeper analysis. Sometimes your application or library introduces an
upper bound that effectively makes it impossible to upgrade. Sometimes
this is implied by upper bounds or other limitations coming from our
dependencies, or from dependencies of our dependencies.

A good example is the Werkzeug dependency (for which we did such an analysis):

* Airflow 2 uses Werkzeug - transitively through several of our dependencies
* For quite a while there has been a known and serious vulnerability in
Werkzeug (https://github.com/apache/airflow/discussions/44865 -
CVE-2024-49767) in the version 2.2.3 that we have - it is fixed in version
3.0.0. We believe it does not affect us, as we do not seem to use the
affected feature (uploading files) - but we are not sure, because this
dependency is used by about 5 or 6 other dependencies of ours, not directly
by us, and we have not done a complete analysis of those. That would
require a lot of digging into and analysing 3rd-party code that we have no
expertise in. So our answer is "we think we are not affected, but we are
not really sure - if you find an exploitable path, please let us know by
providing a reproducible report".
* Up until about last year, several of our dependencies did not allow us to
upgrade - they were upper-bound. This has improved over time and many of
them have released versions that have no such limitation.
* But one remains: Connexion. Support for the new Werkzeug 3 only came in
Connexion 3, and they heavily changed the architecture in Connexion 3, so
we cannot upgrade easily. Actually, we cannot upgrade at all. We
(particularly me) spent weeks of effort on that, with 2 interns from Major
League Hacking trying to solve the problems and make the upgrade happen.
We even had a green PR after 4 months - one that no-one in their right mind
would accept, because it was band-aid-over-band-aid-over-band-aid. We
worked together with the Connexion maintainers and they helped us, but they
would not release Connexion 2 with Werkzeug 3 support, as they have stopped
maintaining it for good (actually they are new maintainers who took over
from the old ones; they have worked exclusively on rewriting Connexion 3 to
use ASGI instead of WSGI, and they know and care almost nothing about
Connexion 2). The old maintainers are gone with the wind (of change).
* We are dropping Connexion in Airflow 3 in favour of FastAPI - part of
the reason is the security issue. That requires several months of effort
from about 3 people working on it pretty much full time.

Our users DEMAND (this has happened multiple times) that we remove that
dependency, because it shows up in their scans as critical. And it's not
about the image. The Werkzeug dependency is installed every time Airflow is
installed - version 2.3.0 that is vulnerable - and we cannot do anything
about it until Airflow 3.
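
To make the upper-bound problem concrete, here is a minimal sketch (Python,
stdlib only) of the check we effectively had to do by hand. Only "fixed in
3.0.0" comes from the case above. The bound values, the `other-dep` name and
the `blockers` helper are made up for illustration and are not read from any
real package metadata.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn '2.2.3' into (2, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def blockers(fixed_in: str, upper_bounds: Dict[str, str]) -> List[str]:
    """Return the dependents whose exclusive upper bound excludes the fix."""
    fixed = parse(fixed_in)
    return sorted(
        name for name, bound in upper_bounds.items() if fixed >= parse(bound)
    )


# Illustrative bounds only (assumption): each value is read as
# "requires werkzeug < value".
upper_bounds = {
    "connexion-2": "3.0.0",  # assumed upper bound that excludes Werkzeug 3
    "other-dep": "4.0.0",    # hypothetical dependent that already allows 3.x
}

print(blockers("3.0.0", upper_bounds))
```

As long as that list is not empty, users cannot simply upgrade the
vulnerable package themselves, whatever our own code does.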

So, in short: even being able to answer and triage this issue and see what
we can tell the users took literally months of effort and a POC that
failed, and the end result is still "won't fix, we are likely not
affected". So essentially it's none of these: "resolved,
resolved_with_pedigree, exploitable, in_triage, false_positive,
not_affected". It's more like "we really don't know, but what we do know is
that we will not fix it in Airflow 2". If we knew of an exploitable path,
we could likely monkey-patch it, so if someone reports one to us, that
would be the remediation - but we have no knowledge of any exploit.
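
For illustration, this is roughly the closest record I can imagine for that
situation. It is a sketch only: it assumes a CycloneDX-style `analysis`
object with `state`, `response` and `detail` fields, and that `will_not_fix`
is an accepted response value. Please check both against the spec version
you actually target. The `analysis_record` helper is hypothetical.

```python
import json

# The six analysis states quoted above.
ANALYSIS_STATES = {
    "resolved", "resolved_with_pedigree", "exploitable",
    "in_triage", "false_positive", "not_affected",
}


def analysis_record(state: str, detail: str, response=None) -> dict:
    """Build a CycloneDX-style analysis object; reject states not in the list."""
    if state not in ANALYSIS_STATES:
        raise ValueError(f"unknown analysis state: {state}")
    record = {"state": state, "detail": detail}
    if response:
        record["response"] = list(response)
    return record


# None of the six states says "we don't know", so "in_triage" is the fallback.
record = analysis_record(
    state="in_triage",
    response=["will_not_fix"],  # assumed response value, see note above
    detail=(
        "Believed not affected (no file uploads), but usage through "
        "transitive dependencies is not fully analysed. "
        "Will not be fixed in Airflow 2."
    ),
)
print(json.dumps(record, indent=2))
```

The free-text `detail` ends up carrying the real message, which is exactly
the part that scanners cannot interpret automatically.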

This is but one example. And of course it's not a common case - but you
can see that we have a chain of dependencies with pretty complex
interactions between them, and that deep analysis and literally weeks of
effort might be needed just to be able to say "anything" about such an issue.

This is only ONE example. We have 180 direct and 700 indirect
dependencies. There are other variations, too. Some of our dependencies
limit the versions of Python and the platforms they support (this is
pretty common, actually), which means that in some cases our users can
upgrade a dependency on Python 3.10, but not on Python 3.8. And this
might only become known after deep analysis of the chain of dependencies,
because it might come from the way transitive dependencies are resolved.
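
A toy sketch of that last variation, in Python. The release numbers and the
minimum Python versions are invented for illustration, and `REQUIRES_PYTHON`,
`can_upgrade` and `upgrade_matrix` are hypothetical names. In reality this
information is spread across the metadata of many packages.

```python
from typing import Dict, Iterable, Tuple

PythonVersion = Tuple[int, int]

# Hypothetical data: the minimum Python version each release of one
# dependency supports. All numbers are made up.
REQUIRES_PYTHON: Dict[str, PythonVersion] = {
    "1.4.0": (3, 8),  # vulnerable release
    "1.5.0": (3, 9),  # release containing the fix
}


def can_upgrade(fixed_release: str, python: PythonVersion) -> bool:
    """True if the fixed release can be installed on this Python version."""
    return python >= REQUIRES_PYTHON[fixed_release]


def upgrade_matrix(
    fixed_release: str, pythons: Iterable[PythonVersion]
) -> Dict[str, bool]:
    """Map each Python version to whether the fix is installable there."""
    return {
        f"{major}.{minor}": can_upgrade(fixed_release, (major, minor))
        for major, minor in pythons
    }


print(upgrade_matrix("1.5.0", [(3, 8), (3, 10)]))
```

So the honest answer to "can I just upgrade?" can differ per Python version,
and a single VEX statement per Airflow version may not capture that.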

What I am trying to say is that saying "not affected", or making any
other "firm" statement, often requires a lot of effort just to find the
right answer.


> * If something like CVE-2022-1471[2] happens, the Jackson team will
> create a "not_affected" VEX record and Log4j Core will pass the same
> result to Kafka. Note that this will not generate a lot of additional
> work for the Jackson team, since they provided a human-readable version
> anyway[3]. It could even generate less work, if users consume the VEX file.
>


> * If something like CVE-2022-38752[4] happens, the Jackson team will
> create an upgrade recommendation for those that parse untrusted YAML.
> Log4j Core will generate a "not_affected" VEX record (we only use YAML
> to parse a config file) and yet again Kafka does not need a new release.
> Again, Jackson had to deal with a user question about it[5], so a VEX
> statement will not generate a lot of additional work.
>
>
Yes, if we can automatically discover the VEXes of all our deps that use
the transitive dep, and all of them say "not affected", fine. But this will
not happen in such cases. Even if Connexion 3 says "affected" and we do not
use that particular functionality, we are not automatically affected as
well - because we might use only a subset of the functionality of the
transitive dependency ourselves. In the example above, Connexion 2 would be
"affected" and, following your logic, we would also have to say "affected",
even if we believe we are not (but we have no proof of that - it is next to
impossible to prove that we are not affected in this case).
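
The rule I would be comfortable with looks more like the sketch below
(Python). It is only an illustration under assumptions: `UNKNOWN` is a
hypothetical extra status that does not exist in any VEX vocabulary I know
of, `derive_status` is a made-up helper, and the dependency names and
statuses in the example are invented.

```python
from typing import Dict, Optional

UNKNOWN = "unknown"  # hypothetical "no idea" status, not a real VEX state


def derive_status(direct_dep_statuses: Dict[str, Optional[str]]) -> str:
    """Derive our own status from what our direct dependencies publish.

    A value of None means the dependency publishes no VEX statement,
    or we could not discover it automatically.
    """
    statuses = list(direct_dep_statuses.values())
    if any(status is None for status in statuses):
        return UNKNOWN  # at least one dependency tells us nothing
    if all(status == "not_affected" for status in statuses):
        return "not_affected"  # safe to pass the result downstream
    # A dependency being affected does not prove that we are:
    # this case still needs our own analysis.
    return "in_triage"


print(derive_status({"dep-a": "not_affected", "dep-b": "not_affected"}))
print(derive_status({"dep-a": "exploitable", "dep-b": "not_affected"}))
print(derive_status({"dep-a": "not_affected", "dep-b": None}))
```

Only the first case can be automated end to end. The other two are where
the weeks of manual effort go.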

That means that Airflow 2 would have an "affected" critical Werkzeug
vulnerability - even though we currently believe we are not affected - and
nobody would be able to install Airflow if they want to be "CRA compliant".
Airflow 3 will be out in 3 months, so everyone would have to wait for it.
There is literally no other action we can take currently. We could of
course take the risk and say "not affected". But we really do not know. So
what it basically means is that it will be perpetually "in triage" for us.
I see no other way.

And yes, that's only one example, but it's all but guaranteed that this
will continue to happen as more CVEs are discovered - especially for old
versions of Airflow. I really hope we can limit ourselves to publishing
only "last version" VEXes - that would help us immensely to at least
attempt to assess those realistically.



> Summarizing: let us restrict VEX-es to the current version of our
> projects and publish them only when the CVE has been analyzed by all our
> "suppliers". We should be open to external companies helping us with the
> analysis through PRs, but let us not do any additional work ourselves.
> Of course there are exceptions: when Log4Shell happens, we can do the
> entire analysis ourselves.
>

Yep. I would love to see users who have Airflow 2.7 ask for a VEX for it,
only to get the 2.10.5 one (being voted on now) and the answer "upgrade to
the latest version if you want to have a current VEX". And I really love
the idea - because that's what we are telling them now. So it would be
great if we really can use VEXes to stimulate that, and have a way to say
"Those VEXes we published for all past Airflow versions are not maintained
any more and we don't care about them any more - upgrade". That's my wet
dream as a maintainer. But I am afraid we have to work with the regulators
and our users to make sure that they have the same expectations - and to be
honest I seriously doubt they do. I hope I am wrong :)


>
>
