TL;DR: I think we can close the pinning story (for now at least):

Yesterday I merged the change that introduces per-version requirements.
Rather than a single requirements.txt in the main directory, we now have
a requirements folder containing requirements-python3.7.txt and
requirements-python3.6.txt (they are slightly different). I am
backporting it to 1.10 (there the differences are much bigger: 2.7/3.5/3.6),
but it will work the same way.
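
To make the naming concrete, here is a minimal sketch that maps a Python
version to the matching file in the requirements folder. It is only an
illustration: the helper name is hypothetical and is not part of Airflow
or Breeze.

```python
import sys
from pathlib import PurePosixPath


def requirements_file(major: int, minor: int, folder: str = "requirements") -> str:
    """Return the relative path of the per-version requirements file.

    Hypothetical helper: it only mirrors the naming scheme described in
    this mail (requirements/requirements-python<major>.<minor>.txt).
    """
    name = f"requirements-python{major}.{minor}.txt"
    return str(PurePosixPath(folder) / name)


if __name__ == "__main__":
    # Print the file that would apply to the interpreter running this script.
    print(requirements_file(*sys.version_info[:2]))
```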

Just to summarize how it works (it is all described in CONTRIBUTING.rst
<https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#airflow-dependencies>
and Breeze commands).

   - We have a 'breeze generate-requirements' command (and a separate bash
   script doing the same) that generates the requirements from the
   latest images. Whenever someone updates setup.py, they will have to
   regenerate them (the CI build will fail if that's not done and will provide a
   helpful hint on how to fix it). You must specify the Python version when
   generating the requirements.
   - At generation time (locally), the requirements are bumped
   to the latest versions matching the setup.py constraints. After the
   requirements are generated, our CI checks that the updated requirements
   work with all the tests.
   - We have a CRON job that always bumps the requirements to the latest
   versions matching the setup.py constraints and tests whether they install.
   That will be an early warning in case of breaking changes.
   - We have an easy mechanism to install Airflow in a consistent way,
   described in the documentation, for example:
   pip install apache-airflow[gcp]==1.10.10 \
       --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.6.txt
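
The install line above can also be assembled programmatically. A minimal
sketch, assuming the constraint files keep the URL layout shown in the
command; the function names are made up for illustration and are not an
Airflow API.

```python
from typing import List, Sequence

# Base of the raw-file URL used in the pip command in this mail.
RAW_BASE = "https://raw.githubusercontent.com/apache/airflow"


def constraint_url(airflow_version: str, python_version: str) -> str:
    """Build the constraint-file URL for an Airflow tag and a Python version."""
    return (
        f"{RAW_BASE}/{airflow_version}"
        f"/requirements/requirements-python{python_version}.txt"
    )


def pip_install_args(
    airflow_version: str, python_version: str, extras: Sequence[str] = ()
) -> List[str]:
    """Return the pip argument list for a constrained Airflow install."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return [
        "pip",
        "install",
        f"apache-airflow{extras_part}=={airflow_version}",
        "--constraint",
        constraint_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Reproduces the command from the list above for Python 3.6.
    print(" ".join(pip_install_args("1.10.10", "3.6", extras=["gcp"])))
```

Passing the returned list to subprocess.run (instead of a shell string)
avoids having to quote the [gcp] brackets.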

For me, that's all we need for now from the pinning point of view. We
might also want to remove the currently conflicting installs (I will do it
while finishing the prod image).

J.


On Sat, Mar 28, 2020 at 8:30 AM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> Yep. Second half 2020. We need to clean up some inconsistencies before we
> move to it :).
>
> J.
>
>
> On Sat, Mar 28, 2020 at 8:28 AM Driesprong, Fokko <fo...@driesprong.frl>
> wrote:
>
>> The resolving will become more strict in the future:
>>
>> http://pyfound.blogspot.com/2020/03/new-pip-resolver-to-roll-out-this-year.html
>>
>> Cheers, Fokko
>>
>> On Wed, Mar 25, 2020 at 3:40 PM Jarek Potiuk <
>> jarek.pot...@polidea.com> wrote:
>>
>> > OK. I think that one does not need voting then. I will proceed with my
>> PR
>> > for that :).
>> >
>> > J.
>> >
>> > On Wed, Mar 25, 2020 at 3:38 PM Daniel Imberman <
>> daniel.imber...@gmail.com
>> > >
>> > wrote:
>> >
>> > > Agreed. Kind of a “best we can do” considering the current nature of
>> > > python.
>> > > On Mar 24, 2020, 2:45 PM -0700, Driesprong, Fokko
>> <fo...@driesprong.frl
>> > >,
>> > > wrote:
>> > > > Yes, I'd be in favor of not having two packages, and just pinning
>> the
>> > > > versions then. In this case, all the versions will be pinned, so if
>> a
>> > > user
>> > > > wants to install a newer version of elastic, they have to do it
>> > > explicitly.
>> > > > For Java, you have nice packages that will check if you break any
>> > public
>> > > > API, but for Python this is impossible :'(
>> > > >
>> > > > Cheers, Fokko
>> > > >
>> > > >
>> > > > On Tue, Mar 24, 2020 at 11:11 AM Jarek Potiuk <jarek.pot...@polidea.com> wrote:
>> > > >
>> > > > > And yet another update: after seeing how it works, I will remove
>> > > > > requirement generation from pre-commit. Now that it needs to be
>> > > > > generated separately for different versions of Python, it's a bit too
>> > > > > much overhead (you'd need to have more images downloaded for different
>> > > > > Python versions for pre-commit). Instead I will add breeze commands to
>> > > > > regenerate the requirements (and bash scripts if you do not use
>> > > > > breeze), and anyone changing setup.py will have to do it (otherwise CI
>> > > > > builds will fail). I think this workflow will be great to keep our
>> > > > > requirements up-to-date and have a stable installation method.
>> > > > >
>> > > > > J.
>> > > > >
>> > > > >
>> > > > > On Mon, Mar 23, 2020 at 5:54 PM Jarek Potiuk <
>> > jarek.pot...@polidea.com
>> > > >
>> > > > > wrote:
>> > > > > >
>> > > > > > Update - It seems that we won't need the -pinned version after all.
>> > > > > > I realized that we need to have slightly different requirements for
>> > > > > > different Python versions.
>> > > > > >
>> > > > > > I just added PR for that:
>> > > https://github.com/apache/airflow/pull/7841
>> > > > > >
>> > > > > > I also found out (during production image exercise) that we can
>> > > install
>> > > > > airflow predictably in a very simple way (once we release the
>> > > requirements
>> > > > > in 1.10.10):
>> > > > > >
>> > > > > > pip install apache-airflow[gcp]==1.10.10 --constraint https://raw.githubusercontent.com/apache/airflow/1.10.10/requirements/requirements-python3.7.txt
>> > > > > >
>> > > > > > I think this is simple enough to be used as installation
>> method. I
>> > > added
>> > > > > it to the documentation and I think I am ok with dropping -pinned
>> > > package
>> > > > > altogether.
>> > > > > >
>> > > > > > J.
>> > > > > >
>> > > > > >
>> > > > > > On Sun, Mar 22, 2020 at 10:15 AM Jarek Potiuk <
>> > > jarek.pot...@polidea.com>
>> > > > > wrote:
>> > > > > > >
>> > > > > > > Yesterday we had another master breakage, this time from
>> > > > > > > elasticsearch releasing MINOR version 7.6 that broke our builds
>> > > > > > > (note it was a MINOR version, so it should be compatible ... it was
>> > > > > > > not for us). I fixed it quickly yesterday by limiting it to < 7.6,
>> > > > > > > but for me it is quite clear that trying to rely on SemVer being
>> > > > > > > followed by others is a futile effort (at least in Python's world).
>> > > > > > >
>> > > > > > > The theory is nice, but it breaks in practice. And it's not
>> > really
>> > > a
>> > > > > fault of the library maintainers. It's simply sometimes not so
>> easy
>> > to
>> > > see
>> > > > > how your APIs are used - and in Python, you cannot prevent using
>> > stuff
>> > > that
>> > > > > you think is an internal detail. This is what happened in
>> > elasticsearch
>> > > > > case yesterday - apparently, our plugin was using an "internal"
>> API
>> > > > > unknowingly and some parameters from that API were dropped during
>> > > > > refactoring of elasticsearch library.
>> > > > > > >
>> > > > > > > My observation (it's anecdotal though) is that the COVID-19
>> > > > > > > situation has given people more time, fewer distractions, and fewer
>> > > > > > > things to do, so OSS packages are being released more frequently
>> > > > > > > lately, and we should protect ourselves a bit against more frequent
>> > > > > > > breakages.
>> > > > > > >
>> > > > > > > I think learning from yesterday is:
>> > > > > > >
>> > > > > > > * we should merge the requirements.txt solution quickly to
>> > prevent
>> > > > > further breakages (I am reading and testing it now) - I think
>> > everyone
>> > > > > agrees it's good to have it
>> > > > > > > * I think we can continue discussing whether
>> > apache-airflow-pinned
>> > > > > package should be released or not. I can leave the code building
>> the
>> > > > > package but we can decide about it after some more discussion
>> > > > > > >
>> > > > > > > Does it sound good?
>> > > > > > >
>> > > > > > > J
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Fri, Mar 20, 2020 at 2:47 PM Jarek Potiuk <
>> > > jarek.pot...@polidea.com>
>> > > > > wrote:
>> > > > > > > >
>> > > > > > > > And rebased it right now and fixed automated requirements
>> > update.
>> > > > > > > >
>> > > > > > > > On Fri, Mar 20, 2020 at 2:28 PM Jarek Potiuk <
>> > > jarek.pot...@polidea.com>
>> > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > Ah BTW. I just noticed that for some reason I pasted an
>> old
>> > PR
>> > > > > earlier in the thread :(.
>> > > > > > > > > This is the one with requirements.txt I am talking about:
>> > > > > https://github.com/apache/airflow/pull/7730
>> > > > > > > > >
>> > > > > > > > > On Fri, Mar 20, 2020 at 2:26 PM Jarek Potiuk <
>> > > > > jarek.pot...@polidea.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > Nope. Not blocking. I can work with my branch just
>> > > requirements.txt
>> > > > > is enough for that :)
>> > > > > > > > > >
>> > > > > > > > > > I think the problem with semver is that it is loosely
>> > > followed - we
>> > > > > had a number of breakages in the past with minor version upgrades
>> :(.
>> > > > > > > > > >
>> > > > > > > > > > J.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Fri, Mar 20, 2020 at 1:27 PM Kaxil Naik <
>> > > kaxiln...@gmail.com>
>> > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for the detailed explanation Jarek.
>> > > > > > > > > > >
>> > > > > > > > > > > How about we have an upper limit for all our
>> > dependencies,
>> > > example
>> > > > > instead
>> > > > > > > > > > > of "google-cloud-storage>=1.16", we have
>> > > > > "google-cloud-storage>=1.16,<2.0" ?
>> > > > > > > > > > >
>> > > > > > > > > > > If a dependency breaks compatibility in minor
>> versions,
>> > we
>> > > can't do
>> > > > > > > > > > > anything about it but if they follow SemVer, we
>> should be
>> > > safe and
>> > > > > the
>> > > > > > > > > > > first-time installers would have a non-breaking
>> package.
>> > > WDYT?
>> > > > > > > > > > >
>> > > > > > > > > > > Btw I hope this is not blocking you in building a
>> > > production image
>> > > > > as I
>> > > > > > > > > > > think requirements.txt is solving that? Please let me
>> > know
>> > > if it is
>> > > > > > > > > > > blocking.
>> > > > > > > > > > >
>> > > > > > > > > > > PS: I am also just dumping my ideas to solve this
>> issue.
>> > > Love to
>> > > > > hear what
>> > > > > > > > > > > others think too.
>> > > > > > > > > > >
>> > > > > > > > > > > Regards,
>> > > > > > > > > > > Kaxil
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Thu, Mar 19, 2020 at 2:43 PM Jarek Potiuk <
>> > > > > jarek.pot...@polidea.com>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > I think we have a similar understanding. But let me just
>> > > > > > > > > > > > clarify, because I think we are talking about solving two
>> > > > > > > > > > > > different problems.
>> > > > > > > > > > > > My proposal does not solve all problems with dependencies.
>> > > > > > > > > > > > Quite the contrary, I want to solve just one specific
>> > > > > > > > > > > > "repeatability" problem - read on :).
>> > > > > > > > > > > >
>> > > > > > > > > > > > 1. A potential source of confusion: using "-pinned"
>> for
>> > > > > installation but
>> > > > > > > > > > > > > using "non-pinned" for DAG development.
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > This could be confusing indeed - but they are the
>> same
>> > > in fact -
>> > > > > > > > > > > > just deps might be different over time.
>> > > > > > > > > > > >
>> > > > > > > > > > > > 2. Most of the users would still try to install
>> > > > > "apache-airflow" package
>> > > > > > > > > > > > > that might have been broken for example because
>> of a
>> > > > > dependency
>> > > > > > > > > > > > release,
>> > > > > > > > > > > > > either way, we would still have to suggest them to
>> > use
>> > > > > "pinned"
>> > > > > > > > > > > > version
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > True. I thought we might describe it in the README
>> and
>> > > make it
>> > > > > prominently
>> > > > > > > > > > > > explained. Usually people look at the readme in PyPI
>> > > when they are
>> > > > > > > > > > > > installing
>> > > > > > > > > > > > stuff and it does not work:
>> > > > > https://pypi.org/project/apache-airflow/.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Also - we could of course explain how to use
>> > > requirements.txt
>> > > > > from the
>> > > > > > > > > > > > released
>> > > > > > > > > > > > version when they are installing it. That would be
>> an
>> > > extra
>> > > > > friction point
>> > > > > > > > > > > > though
>> > > > > > > > > > > > and maybe having "always installable" version of
>> > airflow
>> > > is a
>> > > > > better
>> > > > > > > > > > > > choice.
>> > > > > > > > > > > >
>> > > > > > > > > > > > 3. If they install "pinned" version, it is no
>> longer a
>> > > library
>> > > > > again,
>> > > > > > > > > > > > > that is users won't be able to use new NumPy
>> release
>> > or
>> > > > > matplotlib for
>> > > > > > > > > > > > > example. In which case we are just circling back
>> to
>> > > the same
>> > > > > problem,
>> > > > > > > > > > > > > "either we risk broken package" while releasing
>> or we
>> > > risk
>> > > > > potentially
>> > > > > > > > > > > > > incompatible versions.
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Yep. But maybe it's just a question of naming. Maybe
>> > > even we
>> > > > > could name
>> > > > > > > > > > > > this package differently to indicate that this
>> version
>> > > is a way
>> > > > > to quickly
>> > > > > > > > > > > > install
>> > > > > > > > > > > > airflow but not to do any serious development with
>> it.
>> > > > > > > > > > > >
>> > > > > > > > > > > > So speaking about THE problem I want to solve with
>> the
>> > > > > > > > > > > > requirements.txt and apache-airflow-pinned package:
>> > > > > > > > > > > >
>> > > > > > > > > > > > I really only want to solve "first-time-user"
>> > experience
>> > > here -
>> > > > > nothing
>> > > > > > > > > > > > more. I
>> > > > > > > > > > > > definitely do not want to replace the current
>> > > installation method
>> > > > > for
>> > > > > > > > > > > > experienced
>> > > > > > > > > > > > users - for them using --constraint
>> requirements.txt is
>> > > exactly
>> > > > > what they
>> > > > > > > > > > > > need.
>> > > > > > > > > > > > The only problem I am trying to solve with that is
>> > > > > "repeatability" of
>> > > > > > > > > > > > installation.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Maybe "apache-airflow-quickinstall",
>> > > > > > > > > > > > "apache-airflow-repeatable-install" or something like that
>> > > > > > > > > > > > would be better than "apache-airflow-pinned". I think about
>> > > > > > > > > > > > it as a "flavour" of Airflow rather than anything else. I
>> > > > > > > > > > > > even originally implemented it as a [pinned] extra where I
>> > > > > > > > > > > > pinned all requirements. Unfortunately I found that if you
>> > > > > > > > > > > > have a main requirement without limits, adding the same
>> > > > > > > > > > > > requirement as an extra with == does not make it pinned :(.
>> > > > > > > > > > > > That was my original plan.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Btw I have been on "we should have pinned
>> dependency"
>> > > camp as
>> > > > > Airflow
>> > > > > > > > > > > > > should definitely install without breaking since
>> > day-1
>> > > but I
>> > > > > think a
>> > > > > > > > > > > > > separate "-pinned" package won't solve that issue.
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Ah yeah we went the same route. I do not think we
>> can
>> > > solve the
>> > > > > > > > > > > > "library vs. app" problem easily. This is a bit of
>> > > > > "eat-and-have-cake"
>> > > > > > > > > > > > at the same time. I know people have problems
>> > > > > > > > > > > > with conflicting dependencies when they are trying
>> to
>> > > install
>> > > > > libraries
>> > > > > > > > > > > > with different requirements. And I am not even
>> trying
>> > to
>> > > solve
>> > > > > that
>> > > > > > > > > > > > problem now. Not even close. This requires some
>> other
>> > > solution
>> > > > > > > > > > > > (for example separate virtualenvs with different
>> > > dependencies
>> > > > > > > > > > > > build from wheels on per-task basis). But that's
>> > > something much
>> > > > > further
>> > > > > > > > > > > > in the future (if at all).
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > WDYT? Also please do let me know if I have
>> > > misunderstood
>> > > > > something
>> > > > > > > > > > > > > (definitely possible :D).
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > > Kaxil
>> > > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > --
>> > > > > > > > > >
>> > > > > > > > > > Jarek Potiuk
>> > > > > > > > > > Polidea | Principal Software Engineer
>> > > > > > > > > >
>> > > > > > > > > > M: +48 660 796 129
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > >
>> > > > > > > > > Jarek Potiuk
>> > > > > > > > > Polidea | Principal Software Engineer
>> > > > > > > > >
>> > > > > > > > > M: +48 660 796 129
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > >
>> > > > > > > > Jarek Potiuk
>> > > > > > > > Polidea | Principal Software Engineer
>> > > > > > > >
>> > > > > > > > M: +48 660 796 129
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > >
>> > > > > > > Jarek Potiuk
>> > > > > > > Polidea | Principal Software Engineer
>> > > > > > >
>> > > > > > > M: +48 660 796 129
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > >
>> > > > > > Jarek Potiuk
>> > > > > > Polidea | Principal Software Engineer
>> > > > > >
>> > > > > > M: +48 660 796 129
>> > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > >
>> > > > > Jarek Potiuk
>> > > > > Polidea | Principal Software Engineer
>> > > > >
>> > > > > M: +48 660 796 129
>> > > > >
>> > >
>> >
>> >
>> > --
>> >
>> > Jarek Potiuk
>> > Polidea <https://www.polidea.com/> | Principal Software Engineer
>> >
>> > M: +48 660 796 129 <+48660796129>
>> > [image: Polidea] <https://www.polidea.com/>
>> >
>>
>
>
> --
>
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
> [image: Polidea] <https://www.polidea.com/>
>
>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
