[this post is available online at https://s.apache.org/1upl4 ]

by Jarek Potiuk

This post is about the Apache Software Foundation's security process and the 
security mindset of an Apache project’s PMC, both put to good use in 
practice. From this post you can learn why the security practices we apply in 
our projects are important, how they work when the PMCs apply them correctly 
and with the right security-driven mindset, and how important it is for users 
of Apache Software Foundation projects to keep their software updated - 
including the latest security fixes.

The idea for this article was triggered by a recent blog post by the security 
researcher Ian Carroll, who earned USD 13,000 in bug bounties by simply 
following up on the results of the Apache security process as applied by the 
Apache Airflow PMC. This saved quite a few businesses a lot of trouble, but it 
was only possible due to the foundations laid down by the ASF and the PMC of 
the project.

Here is what Ian Carroll has to say about it: "This issue was a great example of 
how ASF's transparent way of fixing and disclosing vulnerabilities worked to 
protect users of their software, and gave many organizations a wake-up call on 
ensuring they upgrade and protect their open-source software."

Apache Airflow is currently one of the most widely used orchestration tools in 
the industry, and due to its nature it is an attractive attack vector. If you 
run it internally in your company, it is likely to interact with pretty much 
all your systems, and if an attacker manages to break in through Airflow, the 
breach might cascade into every system it connects to. Therefore the Apache 
Airflow PMC takes security very seriously. So seriously that we have a whole 
discussion panel about Apache Airflow security at the Airflow Summit, which is 
coming soon - July 8-16.

This post's main point is to show how important it is to follow security best 
practices throughout the whole software lifecycle, and to think about security 
at every step of building and releasing the software (and beyond).

Let's start from the very beginning: making sure the code development process 
is secure. Like most ASF projects, the Apache Airflow project is developed on 
GitHub and, together with a growing number of projects, we use GitHub Actions 
to run continuous integration. GitHub publishes a number of best practices and 
security hardening practices that you should follow when you run your CI with 
GitHub Actions. We follow them rigorously, which includes monitoring the 
"Security blog of GitHub" and following its advisories.

And we have not stopped there. We actively think about and discuss potential 
security threats and the ways in which, for example, supply chain attacks could 
be performed on our project. We share our findings on the ASF discussion 
mailing lists and propose recommendations so that all ASF projects can make use 
of the best practices. One of the results is that the practices are documented 
and shared at bui...@apache.org. We also raised a few security issues with 
GitHub, and as a result (at least that's the feedback we got from GitHub) they 
implemented some improvements that we now apply in practice. A recent example 
is a change implemented by GitHub to allow control of the permissions of the 
GitHub Token used during the CI build, which resulted in this PR. A few months 
ago, we raised the concern that having a blanket "write" permission is quite 
dangerous. GitHub responded and implemented the change, which allowed us to 
limit the scope of the tokens used for our builds and increase protection 
against a wide range of attacks - with supply chain attacks recently being the 
most prominent ones, leading to ransomware threats and millions of dollars paid 
to hackers. 

This is where the security mindset of the Apache Airflow PMC starts, and it 
lays the foundation for the next steps, in which the Apache Software Foundation 
plays a crucial role: releasing the software and monitoring for security 
vulnerabilities. The ASF has a well-established process for disclosing and 
following up on security vulnerabilities in ASF projects. The process is 
straightforward and simple to follow for everyone involved. It starts with the 
security researchers who raise the issues. It goes through the volunteer (!) 
security team of the ASF, which (according to the upcoming annual report) had 
to handle 387 reports of possible vulnerabilities spanning 95 of the top-level 
ASF projects, leading to 155 CVEs (Common Vulnerabilities and Exposures) being 
assigned. It ends with the PMC, which has to fix the issues and follow up with 
reporting. Heck, the ASF even introduced an internal portal to report and keep 
track of all the CVEs, as well as a yearly security summary report and video.

This process is very clear about responsible disclosure and the publishing of 
vulnerabilities, and about the way security researchers, the ASF security team 
and the PMC collaborate when a security issue is discovered. A recent 
experience of this was discovering and announcing CVE-2021-29621: User 
enumeration in database authentication in Flask-AppBuilder. This issue was 
reported to the ASF - following the process - by Dolev Farhi. He responsibly 
disclosed it together with a reproducible proof-of-concept scenario that 
allowed us to quickly verify that the issue existed and (more importantly) to 
verify that the issue was gone once we fixed it. 

At the end of the process this is the message we got from Dolev: "Truly enjoyed 
working with you. Thanks so much for your help in bringing this to closure and 
making Airflow what it is."

The CVE was an interesting one because it was not an issue in the Airflow code 
itself; it was introduced by a dependency of Airflow - Flask-AppBuilder. 
Fortunately, the process is built in such a way that we can involve and 
collaborate with other projects in solving such issues, and we got excellent 
support from Daniel Gaspar. We tried and tested the fix locally and provided it 
to Daniel, which let him quickly implement it and release a new version of 
Flask-AppBuilder containing the fix. This was also important for the Apache 
Superset project (Daniel is a PMC member there as well), which also uses 
Flask-AppBuilder and suffered from the same vulnerability. This shows that 
security is a distributed issue, how important cooperation is, and how much a 
good security process should embrace it. I truly enjoyed the cooperation with 
Daniel and Dolev as we helped to test the release candidate of 
Flask-AppBuilder. Later on, when the CVE was published, we announced it 
following the regular announcement process.

Here is what Daniel has to say about it: "A great example of multiple open 
source projects working together, elevating each other to higher quality. The 
whole is greater than the sum of the parts. Got a clear report with a proposed 
fix, reproducible steps all backed by the ASF security process, it was a breeze 
to fix and release."

This leads to the most important point. There is only so much we can do when it 
comes to developing and releasing our software. After that, it's up to our 
users to upgrade to the latest versions. If they don't, they remain vulnerable. 
This was the actual reason for the blog post I mentioned initially - despite 
our announcing CVE-2020-17526 and releasing a fixed version a long time ago, 
many of our users did not follow the announcements and did not upgrade to the 
latest version of Airflow. I must stress the importance of this step - as long 
as our users do not upgrade to fixed versions, there is not much we can do to 
help them. It's all in our users' hands! This time it ended with just USD 
13,000 paid to Ian in the form of bounties, because Ian is a responsible 
security researcher (a so-called "white hat"). But imagine some bad actors 
doing the same thing Ian did.

Of course, we understand that it can sometimes be difficult to migrate to newer 
versions of software, but here we also have a solution that we applied last 
year - one that might seem surprising at first, but makes perfect sense when 
you look at the consequences: consistent versioning and predictable release 
support. When we announced Airflow 2.0 last year, we introduced a small but 
important change - full support for Semantic Versioning, which we have followed 
rigorously since. We also published a predictable version lifecycle. Why is 
this important? Because users can be pretty sure that they can safely upgrade 
to a new "patchlevel" version of Airflow when it gets released, without even 
thinking about potential migration problems. Also, when we release a "feature" 
(minor) version of Airflow, we promise that it is backwards-compatible, so even 
if the migration process might take a bit longer, users can apply it without 
worrying about spending a lot of time on the migration of their DAGs (DAGs are 
the users' workflow definitions; some of our users have many thousands of them, 
as their entire data processing is orchestrated by Airflow). 
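
As a rough illustration of how the Semantic Versioning promise above can be 
used when planning upgrades, here is a minimal Python sketch. The helper names 
(parse_version, upgrade_kind) and the version numbers are made up for this 
example and are not part of Airflow. The sketch also assumes plain 
MAJOR.MINOR.PATCH version strings without pre-release suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def upgrade_kind(current: str, target: str) -> str:
    """Classify a move from `current` to `target`.

    Returns one of 'none', 'downgrade', 'patch', 'minor' or 'major'.
    """
    cur, new = parse_version(current), parse_version(target)
    if new == cur:
        return "none"
    if new < cur:
        return "downgrade"
    if new[0] != cur[0]:
        return "major"  # may contain backwards-incompatible changes
    if new[1] != cur[1]:
        return "minor"  # new features, promised to be backwards-compatible
    return "patch"      # fixes only, expected to be safe to apply


if __name__ == "__main__":
    # Illustrative version numbers only.
    for current, target in [("2.0.0", "2.0.2"), ("2.0.2", "2.1.0"), ("2.1.0", "3.0.0")]:
        print(current, "->", target, ":", upgrade_kind(current, target))
```

Under the promise described above, a "patch" or "minor" result suggests an 
upgrade that should not require reworking DAGs, while "major" signals that a 
migration should be planned.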

We also publish (and will continue to publish) the support schedule for our 
major releases, so that users can prepare and plan their migration to new major 
releases in advance. As with all software, we will sometimes implement 
backwards-incompatible changes, which will cause our users to spend more time 
on migrations. Old releases will stop receiving security fixes at some date, 
and the best you can do as a user is to migrate to a supported version before 
that date!

Which leads to the last and most important point of this article. If you are a 
diligent reader and look at the announcement I mentioned above for 
CVE-2021-29621, you will see that the fix was released only for the Airflow 2 
series. Why? Because Airflow 1.10 reached its end-of-life on June 17th 2021. 
When we released Airflow 2, half a year ago, we agreed in the community that we 
would support Airflow 1.10 with critical/security fixes for only 6 months. And 
we did - for example, CVE-2020-17526 was addressed in Airflow 1.10.14. 
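
The end-of-life rule above can be expressed as a small check. This is only a 
sketch: the table and the function names (SECURITY_EOL, series_of, 
receives_security_fixes) are hypothetical, the only date in it is the 1.10 
end-of-life date given in this post, and it assumes that the end-of-life day 
itself no longer counts as supported.

```python
from datetime import date
from typing import Dict, Optional

# End-of-life dates per release series. The 1.10 date comes from this post.
# No end-of-life date for the 2 series is given here, so it is left as None
# (meaning "still supported as far as this table knows").
SECURITY_EOL: Dict[str, Optional[date]] = {
    "1.10": date(2021, 6, 17),
    "2": None,
}


def series_of(version: str) -> str:
    """Map a version such as '1.10.14' or '2.1.0' to its release series."""
    major, minor, *_ = version.split(".")
    # Assumption of this sketch: 1.x is tracked per minor series, 2.x as a whole.
    return f"{major}.{minor}" if major == "1" else major


def receives_security_fixes(version: str, today: date) -> bool:
    """Return True if the series of `version` still gets security fixes."""
    series = series_of(version)
    if series not in SECURITY_EOL:
        return False  # unknown series: treat as unsupported to be safe
    eol = SECURITY_EOL[series]
    return eol is None or today < eol


if __name__ == "__main__":
    print(receives_security_fixes("1.10.14", date(2021, 6, 1)))
    print(receives_security_fixes("1.10.14", date(2021, 7, 1)))
```

Treating unknown series as unsupported is a deliberately cautious choice for 
this example; a real check would take its dates from the published support 
schedule.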

But this time is over now. This is the first security vulnerability that we 
addressed only for Airflow 2. If you are still using Airflow 1.10, you are on 
your own now. You are no longer protected by the security process of the ASF, 
the security team of the ASF and the Airflow PMC. What's more, security 
researchers who find issues might not be eager to disclose them responsibly, 
knowing that they will not be fixed anyway. When you read about the next 
ransomware attack and the millions of dollars paid, think about whether you 
would like your company to face this kind of dilemma one day. Even if it costs 
time and money to keep your software updated, preventing this kind of problem 
is far cheaper than dealing with the consequences of such an attack.

Upgrade NOW to the latest release of Airflow 2, and keep upgrading with future 
releases!

Be sure to join us at Airflow Summit online, 8-16 July: 
https://airflowsummit.org/ - registration is free and open to all.

# # #

Jarek Potiuk started to work on the Apache Airflow project in September 2018. 
He became an Apache Airflow committer in April 2019 and a member of the Apache 
Airflow Project Management Committee (PMC) in October 2019. He was elected an 
ASF Member in April 2021. He is an Apache project mentor in Outreachy and 
Google Summer of Code and was a mentor in Google Season of Docs. Jarek is an 
independent Open Source Contributor and Advisor and always keen on making it 
easier for people with different backgrounds to join OSS projects.

= = =

"Success at Apache" is a monthly blog series that focuses on the processes 
behind why the ASF "just works" 
https://blogs.apache.org/foundation/category/SuccessAtApache 

= = =

NOTE: you are receiving this message because you are subscribed to the 
announce@apache.org distribution list. To unsubscribe, send email from the 
recipient account to announce-unsubscr...@apache.org with the word 
"Unsubscribe" in the subject line.
