In short (a summary for those who do not want to read my long mail), here is
how I see this working for us long term:

* we need to have a way to say that VEXes are not maintained for old versions
* we will publish SBOMs for "main" development so that our users can
automatically see if problems are fixed in the upcoming release
* for all CVEs we will have an "in triage" state - we can automate that
* we will only publish a "known" state for those dependencies that we
actually analysed and know the state of (there will be just a few of those)
* we should be able to say "no idea, help us to find out if this is in the
`affected` state, but you will never hear "not affected" from us"
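
A minimal sketch of how the policy in the list above could be automated,
assuming a scanner gives us (CVE, component) pairs and maintainers keep a
small table of the cases they have actually analysed. The names `State`,
`Statement` and `build_statements` are hypothetical, the state values are
borrowed from the list quoted further down in this thread, and
"CVE-0000-0001" / "example-lib" are placeholders, not real identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Illustrative subset of the states named later in this thread.
    IN_TRIAGE = "in_triage"
    EXPLOITABLE = "exploitable"
    NOT_AFFECTED = "not_affected"


@dataclass(frozen=True)
class Statement:
    cve_id: str
    component: str
    state: State
    detail: str


def build_statements(findings, analysed):
    """Turn scanner findings into VEX-like statements.

    findings: iterable of (cve_id, component) pairs from a scanner.
    analysed: dict mapping (cve_id, component) to (State, detail) for the
              few cases a maintainer has actually looked at.
    """
    statements = []
    for cve_id, component in findings:
        # Everything that nobody has analysed defaults to "in triage".
        state, detail = analysed.get(
            (cve_id, component),
            (State.IN_TRIAGE, "not analysed; reproducible reports welcome"),
        )
        if state is State.NOT_AFFECTED:
            # Policy from the list above: we never assert "not affected".
            state = State.IN_TRIAGE
            detail = "believed not affected, but unproven: " + detail
        statements.append(Statement(cve_id, component, state, detail))
    return statements


# Placeholder input: one analysed finding and one that nobody has looked at.
findings = [("CVE-2024-49767", "werkzeug"), ("CVE-0000-0001", "example-lib")]
analysed = {
    ("CVE-2024-49767", "werkzeug"): (State.NOT_AFFECTED, "upload feature not used by us"),
}

for s in build_statements(findings, analysed):
    print(s.cve_id, s.component, s.state.value, "-", s.detail)
```

The point of the sketch is only that the default is cheap to generate, while
anything firmer than "in triage" needs a human entry in the `analysed` table.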

If we can do that, then yes - that's workable at scale, but I am afraid
it's not the expectation of those who want those VEXes :)

J.


On Thu, Feb 6, 2025 at 1:15 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> > I agree that we should only publish VEXes for the currently supported
> > version (i.e. usually the most recent one).
>
> Do you know if there is a way to "unpublish" VEXes (mark them as not
> maintained)? Because that is what will effectively happen: we will have to
> start publishing new VEXes for the latest version and make sure that all the
> consumers of the previous VEXes know they are not maintained any more.
>
>> Since looking through somebody else's code is error-prone and time
>> consuming, I would restrict the generation of VEX statements to the
>> following two situations:
>>
>> 1. The CVE is in a dependency that is only used by your project, not
>> your project's dependencies. In this case you can directly read the CVE
>> (or VDR document) and identify if you use the affected feature in your
>> code.
>>
>
> As long as they are published in a way that we can discover them
> automatically - which is not the case currently, but hopefully the
> transparency exchange API will make it possible. I guess there will also be
> a long tail of projects that do not have VEXes, VDRs or SBOMs for years to
> come, so the question is what we do with those dependencies. I am afraid we
> are a few years from even a small percentage of our deps supporting it. And
> if our users or regulators ask us - what do we say? We need to have a good
> answer prepared.
>
>
>> 2. The CVE is in a transitive dependency. In this case you wait until
>> all your **direct** dependencies potentially affected publish a VEX
>> statement. Only then you publish a statement based on what your
>> dependencies say in their VEX records.
>>
>
> Same - we need some discoverability and a way to say "no idea" if an
> automated way of discovering those does not work or the dependency does not
> publish those.
>
>
>>
>> Note that the utility of VEXes varies greatly between ecosystems. In the
>> case of Python applications users download the application and its
>> dependencies directly from PyPI. Unless I am mistaken, if there is a
>> vulnerability in one of your dependencies, you don't need to make a new
>> Airflow release, users can easily upgrade their system themselves.
>> VEX-es might be only useful to those that create an Airflow Docker image
>> to know if they need to regenerate it. They will probably do it anyway.
>>
>
> It's not that simple - it very much depends. All variations are possible
> and that's what makes it complex, because you sometimes need to perform
> deeper analysis. Sometimes your application or library introduces
> upper-binding that effectively makes it impossible to upgrade. Sometimes
> this is implied by upper-binding or other limitations coming from our
> dependencies or dependencies of our dependencies.
>
> A good example is the Werkzeug dependency (on which we did such an analysis):
>
> * Airflow 2 uses Werkzeug - transitively through several of our
> dependencies
> * For quite a while there has been a known and serious vulnerability in
> Werkzeug (https://github.com/apache/airflow/discussions/44865 -
> CVE-2024-49767) in the version 2.2.3 that we have - fixed in version 3.0.0.
> We believe it does not affect us, as we do not seem to use the affected
> feature (uploading files) - but we are not sure, because this dependency is
> used by about 5 or 6 other dependencies of ours - not directly by us - and
> we have not done a complete analysis of those - that would require a lot of
> digging and analysing 3rd-party code that we have no expertise in. So our
> answer is "we think we are not affected, but we are not really sure - if you
> find an exploitable path, let us know, please, by providing a reproducible
> report".
> * Up until ~ last year several of our dependencies did not allow us to
> upgrade - they were upper-bound. This has improved over time and many of
> them released versions that have no such limitation.
> * But the one remaining - Connexion 2 - only updated to the new Werkzeug 3
> in Connexion 3 - and they heavily changed the architecture of Connexion 3 -
> so that we cannot upgrade easily. Actually - we cannot upgrade at all. We
> spent (particularly me) weeks of effort to do that, with 2 interns from Major
> League Hacking trying to solve the problems and make the upgrade happen - and
> we even had a green PR after 4 months - that no-one in their right mind would
> accept, because it was band-aid-over-band-aid-over-band-aid. We worked
> together with the Connexion maintainers and they helped us, but they would
> not release Connexion 2 with Werkzeug 3 support as they stopped maintaining
> it for good (actually they are new maintainers who took over from the old
> ones, and they worked exclusively on rewriting Connexion 3 to use ASGI
> instead of WSGI, and they know and care almost nothing about Connexion 2).
> The old maintainers are gone with the wind (of change).
> * We are dropping Connexion in Airflow 3 - in favour of FastAPI - part of
> the reason is the security issue. That requires several months of effort
> from about 3 people working on it pretty much full time.
>
> Our users DEMAND (it has happened multiple times) that we remove that
> dependency, because it is shown in their scans as critical. And it's not
> the image. The Werkzeug dependency is installed every time Airflow is
> installed - version 2.3.0 that is vulnerable and we cannot do anything
> about it - until Airflow 3.
>
> So - in short, even being able to answer and triage this issue and see
> what we can tell the users took literally months of effort and a POC that
> failed, and the end result is still "won't fix, we are likely not affected".
> So essentially it's none of these: "resolved, resolved_with_pedigree,
> exploitable, in_triage, false_positive, not_affected". It's something like
> "we really don't know, but what we know is that we will not fix it in
> Airflow 2". If we knew of an exploitable way, we could likely monkey-patch
> it, so if someone reports it to us that would be the remediation, but we
> have no knowledge of any exploit.
>
> This is but one example. And of course it's not a frequent case - but you
> can see that we have a chain of dependencies and pretty complex
> interactions between them, and deep analysis and literally weeks of effort
> might be needed to just be able to say "anything" about such an issue.
>
> This is ONE example only. We have 180 direct and 700 indirect
> dependencies. Also there are other variations. Some of our dependencies
> have limits on the versions of Python and platforms they are supported on
> (this is pretty common actually) - which means that in some cases our users
> can upgrade a dependency on Python 3.10, but not on Python 3.8. And this
> might be only known after deep analysis of the chain of dependencies,
> because this might come from the transitive ways dependencies are resolved.
>
> What I am trying to say is that expecting us to say "not affected" or any
> other "firm" statement often requires a lot of effort to just find the
> right answer.
>
>
>> * If something like CVE-2022-1471[2] happens, the Jackson team will
>> create a "not_affected" VEX record and Log4j Core will pass the same
>> result to Kafka. Note that this will not generate a lot of additional
>> work for the Jackson team, since they provided a human-readable version
>> anyway[3]. It could even generate less work, if users consume the VEX
>> file.
>>
>
>
>> * If something like CVE-2022-38752[4] happens, the Jackson team will
>> create an upgrade recommendation for those that parse untrusted YAML.
>> Log4j Core will generate a "not_affected" VEX record (we only use YAML
>> to parse a config file) and yet again Kafka does not need a new release.
>> Again, Jackson had to deal with a user question about it[5], so a VEX
>> statement will not generate a lot of additional work.
>>
>>
> Yes, if we can automatically discover the VEXes of all our deps that use
> the transitive dep, and all of them say "not affected", fine. But this will
> not happen in such cases. Even if Connexion 3 says "Affected" and we do not
> use that particular functionality, we are not automatically affected as well -
> because we might use only a subset of the functionality of the transitive
> dependency ourselves. In the example above, Connexion 2 would be "affected"
> and following your logic, we would have to also say "affected", even if we
> believe we are not (but have no proof of that -- it is next to impossible
> to have proof of not being affected in this case).
>
> That means that Airflow 2 would have an "affected" Critical Werkzeug
> vulnerability - even if we currently believe we are not - and nobody would
> be able to install Airflow if they want to be "CRA compliant". Airflow 3
> will be out in 3 months. So everyone would have to wait for it. We have
> literally no other actions we can take currently. We could take the risk of
> course and say "not affected". But we really do not know. So what it
> basically means is that it will be perpetually "in triage" for us. I see no
> other way.
>
> And yes, that's one example only, but it's all but guaranteed it will
> continue to happen as more CVEs are discovered - especially for old
> versions of Airflow. I really hope we can limit ourselves to publishing
> only "last version" VEXes - that would help us immensely to at least
> attempt to realistically assess those.
>
>
>
>> Summarizing: let us restrict VEX-es to the current version of our
>> projects and publish them only when the CVE has been analyzed by all our
>> "suppliers". We should be open to external companies helping us with the
>> analysis through PRs, but let us not do any additional work ourselves.
>> Of course there are exceptions: when Log4Shell happens, we can do the
>> entire analysis ourselves.
>>
>
> Yep. I would love to see the users who have Airflow 2.7 ask for a VEX for
> it, only to get the 2.10.5 one (being voted on now) and the answer
> "upgrade to the latest version if you want to have a current VEX". And I
> really love the idea - because that's what we are telling them now. So if we
> really can use VEXes to stimulate that and have a way to say "Those VEXes
> we published for all past Airflow versions are not maintained any more and
> we don't care any more for that - upgrade" - that's my wet dream as a
> maintainer. But I am afraid we have to work with the regulators and our
> users to make sure that they have the same expectations - and I seriously
> doubt they do, to be honest. I hope I am wrong :)
>
>
>>
>>
