[DISCUSS] What should we cherry-pick ?

Jarek Potiuk Mon, 23 Jun 2025 03:42:44 -0700

I wanted to start a discussion on "things that we cherry-pick" (to vX_Z
branch).


I think there are different opinions on what kind of changes should be
cherry-picked and it might be a good idea to agree on a common approach.

I think (following the comment of Ash here:
https://github.com/apache/airflow/pull/51992#issuecomment-2995632849) that
we could use a very simplistic and (I'd say) dogmatic approach: "only
cherry-pick bug fixes. Full stop". But I believe, based on past experience
from a lot of cherry-picking that I've been doing (multiple times helping to
bring past branches back to green and spending countless hours on it), that
it should be a bit more nuanced.

I would love to see what others think, but from my experience these are the
things that we **should** cherry-pick:


1) bug-fixes (of course)
2) doc changes (when they are improvements or filling gaps)
3) dev tool changes (every time we did not, it resulted in hours of my time
when things were breaking and I tried to reconcile it)
4) results of automated refactorings that have very low risks (in the areas
that are likely to have cherry-picks)

1) - is non-controversial I think

2) - is also relatively non-controversial and very low risk, and it gives our
users a chance to get better docs earlier. Even today, for example, I
cherry-picked this one: https://github.com/apache/airflow/pull/52068 because
one of my friends who is trying to learn Airflow 3 pinged me that the
"ConfiguringRuff" link that we have in 3.0.2 leads to a 404 Not Found.

3) - it has always bitten us when we stopped cherry-picking dev tool changes.
External dependencies change all the time and we are continuously catching
up with them. We also improve, speed up and simplify the tooling, and often
things that worked when the branch was cut do not work today. Countless
hours were lost in one or two branches when we stopped doing it. I think
once or twice I even had to copy over most (but not all) of the code from
main to the branch and commit a single big "catch-up dev tooling with main"
change.

4) - is likely the most controversial. Example here:
https://github.com/apache/airflow/pull/51992/ - those are the kind of
(really small) changes that are done in an "active" area (i.e. an area that
had and will have a lot of cherry-picks anyway), but they are done with
automated refactoring, like renaming variables and such. This introduces
clarity and readability, so it is good that we are doing them. But if we do
not cherry-pick them and then we cherry-pick any change that touches the
same code, this leads to a conflict. Conflicts are frustrating, especially
those kinds, because you never know what you should do: should you "merge"
this naming change with your change? Or should you leave the original
naming? Or should you try to find the past commit that changed it and
cherry-pick it as well? This, paired with the fact that we are using
cherry-picker, which allows us to cherry-pick stuff very quickly,
automatically and painlessly when there are no conflicts, makes me think
that yes, we should cherry-pick those changes proactively as a service to
those contributors who will follow up with their own cherry-picking. It's
just "good service" and helping others who will come after you.

That's my definition of "what we should cherry-pick".

I wonder what others think about it?

J.
