Hello everyone,

TL;DR: I would like to propose switching to the PyPI Trusted Publishing
workflow for Airflow releases, with GitHub Actions being the trusted
publisher.

A bit of background:

In the security team and also in the security committee of the ASF we have
been discussing how we can plug into the Trusted Publishing [1] feature of
PyPI to publish our artifacts in PyPI more securely. After some initial
back-and-forth discussions (I also discussed it with the PyPI Security
Engineer, Mike Fiedler, at PyCon US) we settled on a proposal to simply use
GitHub as the Trusted Publisher. Another option was to make the ASF a
trusted publisher - it still might happen, but not in the short term, and
we can always change it in the future.

What is the model of authentication used today?

Currently, when release managers upload releases to PyPI, they have to
authenticate with their API key. Those API keys are generated by the
individuals using their personal credentials. All release managers are
added to the Apache Airflow organisation and this is what determines if
they are able to upload releases to PyPI. The keys are long-lived and,
even if you use 2FA (it's mandated in our organisation), the key alone is
enough to upload. So theoretically, someone stealing an RM's API key could
upload a malicious release to PyPI.

What does it mean to use trusted publishing for PyPI for uploads?

Trusted Publishing changes the authentication model and delegates the
authentication of such uploads (and only that) to a "Trusted Publisher"
entity. In this proposal the Trusted Publisher is GitHub, and we will be
able to use GitHub Actions to upload the packages. For security, this can
be done from a separate "GitHub Actions Environment" [2]: only selected
maintainers (i.e. release managers) of Airflow would be given access to
that environment, up to 6 people can be designated as reviewers, and
workflows run there will need to be approved by one of those 6 people.
Only workflows from the Airflow repository and the designated environment
will be able to upload packages. This has the added benefit that we can
disable "self-review", so any upload to PyPI will require at least 2
people - including one reviewer. We can also audit and see the history of
all such uploads in the GitHub Actions logs. We will disable the
possibility of "manual" uploads once we get it working and tested.
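
To make the change of authentication model a bit more concrete, here is a
minimal sketch of the token exchange that happens behind the scenes. It is
an illustration only and not part of the proposal: in practice an existing
action such as pypa/gh-action-pypi-publish performs this exchange. The
environment variable names and the mint-token endpoint are taken from the
public GitHub and PyPI documentation as I understand it, and the function
names are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: PyPI endpoint that exchanges an OIDC token for an API token.
PYPI_MINT_TOKEN_URL = "https://pypi.org/_/oidc/mint-token"


def build_oidc_request(request_url: str, request_token: str) -> urllib.request.Request:
    """Request asking GitHub for an OIDC identity token with audience 'pypi'."""
    return urllib.request.Request(
        f"{request_url}&audience=pypi",
        headers={"Authorization": f"bearer {request_token}"},
    )


def build_mint_request(oidc_token: str) -> urllib.request.Request:
    """Request asking PyPI to exchange the OIDC token for an upload token."""
    body = json.dumps({"token": oidc_token}).encode("utf-8")
    return urllib.request.Request(
        PYPI_MINT_TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def mint_api_token() -> str:
    """Only works inside a GitHub Actions job that has 'id-token: write'."""
    oidc_request = build_oidc_request(
        os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"],
        os.environ["ACTIONS_ID_TOKEN_REQUEST_TOKEN"],
    )
    with urllib.request.urlopen(oidc_request) as response:
        oidc_token = json.load(response)["value"]
    with urllib.request.urlopen(build_mint_request(oidc_token)) as response:
        return json.load(response)["token"]
```

The point of the sketch is that no long-lived secret is stored anywhere:
the upload token only exists for the duration of the approved workflow run.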

Are we going to change how packages are built?

No. We will continue to build them using the current process. Release
managers will build the packages locally on their machines and upload them
to the ASF SVN in the same way as today (a small difference is that we will
also have to upload RC/BETA packages to SVN - we are currently not doing
that). The publishing workflow will just pull the artifacts from the ASF
SVN and publish them via GitHub Actions. They will not be built in GitHub
Actions. Thanks to the binary reproducibility we have, we will be able to
track and verify the provenance of those - all the artifacts in SVN and
PyPI will be binary identical.
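
The "binary identical" check can be sketched roughly as below. This is a
minimal illustration, assuming the packages from SVN and the ones
downloaded from PyPI have already been placed in two local directories;
the function names and the directory layout are hypothetical and this is
not our actual release tooling.

```python
import hashlib
from pathlib import Path

# Assumption: only wheels and sdists are published to PyPI; signature and
# checksum files that live next to them in SVN are skipped.
PACKAGE_SUFFIXES = (".whl", ".tar.gz")


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(svn_dir: Path, pypi_dir: Path) -> list[str]:
    """Names of packages in svn_dir that are missing or differ in pypi_dir."""
    mismatches = []
    for svn_file in sorted(svn_dir.iterdir()):
        if not svn_file.is_file() or not svn_file.name.endswith(PACKAGE_SUFFIXES):
            continue
        pypi_file = pypi_dir / svn_file.name
        if not pypi_file.is_file() or sha256_of(svn_file) != sha256_of(pypi_file):
            mismatches.append(svn_file.name)
    return mismatches
```

An empty list from find_mismatches(Path("svn"), Path("pypi")) would mean
every package in the SVN directory has a byte-for-byte identical copy in
the PyPI directory.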

I started a discussion about it some time ago on the ASF
security-discuss/builds lists [3] and no problems were found with that
approach. We might even be able to create a reusable action for other
projects in the ASF (but only if it makes sense). We might also need a few
workflows / ways to manage exceptional cases (yanking, removing releases,
creating new projects). We will work it out while implementing it. I don't
expect surprises there - this is a solved problem. Thousands of projects
already use the Trusted Publishing workflow in PyPI, and it has been
available for more than 6 months.

Future work:

The ASF is also working, separately, on a new (codename) "Artifact
Distribution Platform" where we will be able to build and distribute the
packages on ASF infrastructure. This does not change the approach for
Trusted Publishing - we will still be able to use GitHub Actions to pull
the artifacts from the "official" ASF URL and publish them in PyPI.

Even more future work:

In the even more distant future the ASF might become a trusted publisher on
its own, but it's a bit complicated and it also might not be needed. There
is the draft PEP-740 [4] being worked on by the packaging team which will
allow attaching digital signatures to PyPI artifacts. This will mean that
the signatures and checksums generated by the release manager (and in the
future by the Artifact Distribution Platform) can be attached to the
artifacts in PyPI, and it will allow our users to verify not only binary
reproducibility but also the provenance of the packages (i.e. signatures
indicating that the packages were generated by the official means of the
Apache Software Foundation). So regardless of which Trusted Publisher is
used to upload the packages, the attestations of provenance will provide
independent verification of who created the packages - effectively
decoupling package creation from publishing.

I am looking for feedback on this proposal. I've been following the topic
and discussing details for quite some time (including discussing it at
PyCon with THE Python security team), so I am happy to answer any questions
anyone might have.

If there are (as I suspect) no objections once any needed explanations
have been given, I will run a lazy consensus and get it implemented.

J.


[1] Trusted Publishers: https://docs.pypi.org/trusted-publishers/
[2] Github Actions Environments:
https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment
[3] Discussion about Trusted Publishing in security-discuss@a.o:
https://lists.apache.org/thread/byszw8vhq1cwtc9xhq0v9q5mkc5mz09n
[4] PEP-740 Index support for Digital Attestations:
https://discuss.python.org/t/pep-740-index-support-for-digital-attestations/44498
