Although I think I come down on the side against pinning, my reasons are different.
For the two (or more) people who have expressed concern about it would pip's "Constraint Files" help: https://pip.pypa.io/en/stable/user_guide/#constraints-files For example, you could add "flask-appbuilder==1.11.1" in to this file, specify it with `pip install -c constraints.txt apache-airflow` and then whenever pip attempted to install _any version of FAB it would use the exact version from the constraints file. I don't buy the argument about pinning being a requirement for graduation from Incubation fwiw - it's an unavoidable artefact of the open-source world we develop in. https://libraries.io/ offers a (free?) service that will monitor apps dependencies for being out of date, might be better than writing our own solution. Pip has for a while now supported a way of saying "this dep is for py2.7 only": > Since version 6.0, pip also supports specifiers containing environment > markers like so: > > SomeProject ==5.4 ; python_version < '2.7' > SomeProject; sys_platform == 'win32' Ash > On 8 Oct 2018, at 07:58, George Leslie-Waksman <waks...@gmail.com> wrote: > > As a member of a team that will also have really big problems if > Airflow pins all requirements (for reasons similar to those already > stated), I would like to add a very strong -1 to the idea of pinning > them for all installations. > > In a number of situation on our end, to avoid similar problems with > CI, we use `pip-compile` from pip-tools (also mentioned): > https://pypi.org/project/pip-tools/ > > I would like to suggest, a middle ground of: > > - Have the installation continue to use unpinned (`>=`) with minimum > necessary requirements set > - Include a pip-compiled requirements file (`requirements-ci.txt`?) > that is used by CI > - - If we need, there can be one file for each incompatible python version > - Append a watermark (hash of `setup.py` requirements?) to the > compiled requirements file > - Add a CI check that the watermark and original match to ensure no > drift since last compile > > I am happy to do much of the work for this, if it can help avoid > pinning all of the depends at the installation level. > > --George Leslie-Waksman > > On Sun, Oct 7, 2018 at 1:26 PM Maxime Beauchemin > <maximebeauche...@gmail.com> wrote: >> >> pip-tools can definitely help here to ship a reference [locked] >> `requirements.txt` that can be used in [all or part of] the CI. It's >> actually kind of important to get CI to fail when a new [backward >> incompatible] lib comes out and break things while allowing version ranges. >> >> I think there may be challenges around pip-tools and projects that run in >> both python2.7 and python3.6. You sometimes need to have 2 requirements.txt >> lock files. >> >> Max >> >> On Sun, Oct 7, 2018 at 5:06 AM Jarek Potiuk <jarek.pot...@polidea.com> >> wrote: >> >>> It's a nice one :). However I think when/if we go to pinned dependencies >>> the way poetry/pip-tools do it, this will be suddenly lot-less useful It >>> will be very easy to track dependency changes (they will be always >>> committed as a change in the .lock file or requirements.txt) and if someone >>> has a problem while upgrading a dependency (always consciously, never >>> accidentally) it will simply fail during CI build and the change won't get >>> merged/won't break the builds of others in the first place :). >>> >>> J. >>> >>> On Sun, Oct 7, 2018 at 6:26 AM Deng Xiaodong <xd.den...@gmail.com> wrote: >>> >>>> Hi folks, >>>> >>>> On top of this discussion, I was thinking we should have the ability to >>>> quickly monitor dependency release as well. Previously, it happened for a >>>> few times that CI kept failing for no reason and eventually turned out it >>>> was due to dependency release. But it took us some time, sometimes a few >>>> days, to realise the failure was because of dependency release. >>>> >>>> To partially address this, I tried to develop a mini tool to help us >>> check >>>> the latest release of Python packages & the release date-time on PyPi. >>> So, >>>> by comparing it with our CI failure history, we may be able to >>> troubleshoot >>>> faster. >>>> >>>> Output Sample (ordered by upload time in desc order): >>>> Latest Version Upload Time >>>> Package Name >>>> awscli 1.16.28 >>> 2018-10-05T23:12:45 >>>> botocore 1.12.18 2018-10-05T23:12:39 >>>> promise 2.2.1 >>> 2018-10-04T22:04:18 >>>> Keras 2.2.4 >>> 2018-10-03T20:59:39 >>>> bleach 3.0.0 >>> 2018-10-03T16:54:27 >>>> Flask-AppBuilder 1.12.0 2018-10-03T09:03:48 >>>> ... ... >>>> >>>> It's a minimal tool (not perfect yet but working). I have hosted this >>> tool >>>> at https://github.com/XD-DENG/pypi-release-query. >>>> >>>> >>>> XD >>>> >>>> On Sat, Oct 6, 2018 at 12:25 AM Jarek Potiuk <jarek.pot...@polidea.com> >>>> wrote: >>>> >>>>> Hello Erik, >>>>> >>>>> I understand your concern. It's a hard one to solve in general (i.e. >>>>> dependency-hell). It looks like in this case you treat Airflow as >>>>> 'library', where for some other people it might be more like 'end >>>> product'. >>>>> If you look at the "pinning" philosophy - the "pin everything" is good >>>> for >>>>> end products, but not good for libraries. In the case you have Airflow >>> is >>>>> treated as a bit of both. And it's perfectly valid case at that (with >>>>> custom python DAGs being central concept for Airflow). >>>>> However, I think it's not as bad as you think when it comes to exact >>>>> pinning. >>>>> >>>>> I believe - a bit counter-intuitively - that tools like >>> pip-tools/poetry >>>>> with exact pinning result in having your dependencies upgraded more >>>> often, >>>>> rather than less - especially in complex systems where dependency-hell >>>>> creeps-in. If you look at Airflow's setup.py now - It's a bit scary to >>>> make >>>>> any change to it. There is a chance it will blow at your face if you >>>> change >>>>> it. You never know why there is 0.3 < ver < 1.0 - and if you change it, >>>>> whether it will cause chain reaction of conflicts that will ruin your >>>> work >>>>> day. >>>>> >>>>> On the contrary - if you change it to exact pinning in >>>>> .lock/requirements.txt file (poetry/pip-tools) and have much simpler >>> (and >>>>> commented) exclusion/avoidance rules in your .in/.tml file, the whole >>>> setup >>>>> might be much easier to maintain and upgrade. Every time you prepare >>> for >>>>> release (or even once in a while for master) one person might >>> consciously >>>>> attempt to upgrade all dependencies to latest ones. It should be almost >>>> as >>>>> easy as letting poetry/pip-tools help with figuring out what are the >>>> latest >>>>> set of dependencies that will work without conflicts. It should be >>> rather >>>>> straightforward (I've done it in the past for fairly complex systems). >>>> What >>>>> those tools enable is - doing single-shot upgrade of all dependencies. >>>>> After doing it you can make sure that all tests work fine (and fix any >>>>> problems that result from it). And then you test it thoroughly before >>> you >>>>> make final release. You can do it in separate PR - with automated >>> testing >>>>> in Travis which means that you are not disturbing work of others >>>>> (compilation/building + unit tests are guaranteed to work before you >>>> merge >>>>> it) while doing it. It's all conscious rather than accidental. Nice >>> side >>>>> effect of that is that with every release you can actually "catch-up" >>>> with >>>>> latest stable versions of many libraries in one go. It's better than >>>>> waiting until someone deliberately upgrades to newer version (and the >>>> rest >>>>> remain terribly out-dated as is the case for Airflow now). >>>>> >>>>> So a bit counterintuitively I think tools like pip-tools/poetry help >>> you >>>> to >>>>> catch up faster in many cases. That is at least my experience so far. >>>>> >>>>> Additionally, Airflow is an open system - if you have very specific >>> needs >>>>> for requirements, you might actually - in the very same way with >>>>> pip-tools/poetry - upgrade all your dependencies in your local fork of >>>>> Airflow before someone else does it in master/release. Those tools kind >>>> of >>>>> democratise dependency management. It should be as easy as `pip-compile >>>>> --upgrade` or `poetry update` and you will get all the >>> "non-conflicting" >>>>> latest dependencies in your local fork (and poetry especially seems to >>> do >>>>> all the heavy lifting of figuring out which versions will work). You >>>> should >>>>> be able to test and publish it locally as your private package for >>> local >>>>> installations. You can even mark the specific dependency you want to >>> use >>>>> specific version and let pip-tools/poetry figure out exact versions of >>>>> other requirements. You can even make a PR with such upgrade eventually >>>> to >>>>> get it faster in master. You can even downgrade in case newer >>> dependency >>>>> causes problems for you in similar way. Guided by the tools, it's much >>>>> faster than figuring the versions out by yourself. >>>>> >>>>> As long as we have simple way of managing it and document how to >>>>> upgrade/downgrade dependencies in your own fork, and mention how to >>>> locally >>>>> release Airflow as a package, I think your case could be covered even >>>>> better than now. What do you think ? >>>>> >>>>> J. >>>>> >>>>> On Fri, Oct 5, 2018 at 2:34 PM EKC (Erik Cederstrand) >>>>> <e...@novozymes.com.invalid> wrote: >>>>> >>>>>> For us, exact pinning of versions would be problematic. We have DAG >>>> code >>>>>> that shares direct and indirect dependencies with Airflow, e.g. lxml, >>>>>> requests, pyhive, future, thrift, tzlocal, psycopg2 and ldap3. If our >>>> DAG >>>>>> code for some reason needs a newer point release due to a bug that's >>>>> fixed, >>>>>> then we can't cleanly build a virtual environment containing the >>> fixed >>>>>> version. For us, it's already a problem that Airflow has quite strict >>>>> (and >>>>>> sometimes old) requirements in setup.py. >>>>>> >>>>>> Erik >>>>>> ________________________________ >>>>>> From: Jarek Potiuk <jarek.pot...@polidea.com> >>>>>> Sent: Friday, October 5, 2018 2:01:15 PM >>>>>> To: dev@airflow.incubator.apache.org >>>>>> Subject: Re: Pinning dependencies for Apache Airflow >>>>>> >>>>>> I think one solution to release approach is to check as part of >>>> automated >>>>>> Travis build if all requirements are pinned with == (even the deep >>>> ones) >>>>>> and fail the build in case they are not for ALL versions (including >>>>>> dev). And of course we should document the approach of >>>> releases/upgrades >>>>>> etc. If we do it all the time for development versions (which seems >>>> quite >>>>>> doable), then transitively all the releases will also have pinned >>>>> versions >>>>>> and they will never try to upgrade any of the dependencies. In poetry >>>>>> (similarly in pip-tools with .in file) it is done by having a .lock >>>> file >>>>>> that specifies exact versions of each package so it can be rather >>> easy >>>> to >>>>>> manage (so it's worth trying it out I think :D - seems a bit more >>>>>> friendly than pip-tools). >>>>>> >>>>>> There is a drawback - of course - with manually updating the module >>>> that >>>>>> you want, but I really see that as an advantage rather than drawback >>>>>> especially for users. This way you maintain the property that it will >>>>>> always install and work the same way no matter if you installed it >>>> today >>>>> or >>>>>> two months ago. I think the biggest drawback for maintainers is that >>>> you >>>>>> need some kind of monitoring of security vulnerabilities and cannot >>>> rely >>>>> on >>>>>> automated security upgrades. With >= requirements those security >>>> updates >>>>>> might happen automatically without anyone noticing, but to be honest >>> I >>>>>> don't think such upgrades are guaranteed even in current setup for >>> all >>>>>> security issues for all libraries anyway. >>>>>> >>>>>> Finding the need to upgrade because of security issues can be quite >>>>>> automated. Even now I noticed Github started to inform owners about >>>>>> potential security vulnerabilities in used libraries for their >>> project. >>>>>> Those notifications can be sent to devlist and turned into JIRA >>> issues >>>>>> followed bvy minor security-related releases (with only few library >>>>>> dependencies upgraded). >>>>>> >>>>>> I think it's even easier to automate it if you have pinned >>>> dependencies - >>>>>> because it's generally easy to find applicable vulnerabilities for >>>>> specific >>>>>> versions of libraries by static analysers - when you have >=, you >>> never >>>>>> know which version will be used until you actually perform the >>>>>> installation. >>>>>> >>>>>> There is one big advantage for maintainers for "pinned" case. Your >>>> users >>>>>> always have the same dependencies - so when issue is raised, you can >>>>>> reproduce it more easily. It's hard to know which version user has >>> (as >>>>> the >>>>>> user could install it month ago or yesterday) and even if you find >>> out >>>> by >>>>>> asking the user, you might not be able to reproduce the set of >>>>> requirements >>>>>> easily (simply because there are already newer versions of the >>>> libraries >>>>>> released and they are used automatically). You can ask the user to >>> run >>>>> pip >>>>>> --upgrade but that's dangerous and pretty lame ("check the latest >>>>> version - >>>>>> maybe it fixes your problem ? ") and sometimes not possible (e.g. >>>> someone >>>>>> has pre-built docker image with dependencies from few months ago and >>>>> cannot >>>>>> rebuild the image easily). >>>>>> >>>>>> J. >>>>>> >>>>>> On Fri, Oct 5, 2018 at 12:35 PM Ash Berlin-Taylor <a...@apache.org> >>>>> wrote: >>>>>> >>>>>>> One thing to point out here. >>>>>>> >>>>>>> Right now if you `pip install apache-airflow=1.10.0` in a clean >>>>>>> environment it will fail. >>>>>>> >>>>>>> This is because we pin flask-login to 0.2.1 but flask-appbuilder is >>>>> = >>>>>>> 1.11.1, so that pulls in 1.12.0 which requires flask-login >= 0.3. >>>>>>> >>>>>>> So I do think there is maybe something to be said about pinning for >>>>>>> releases. The down side to that is that if there are updates to a >>>>> module >>>>>>> that we want then we have to make a point release to let people get >>>> it >>>>>>> >>>>>>> Both methods have draw-backs >>>>>>> >>>>>>> -ash >>>>>>> >>>>>>>> On 4 Oct 2018, at 17:13, Arthur Wiedmer < >>> arthur.wied...@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> Hi Jarek, >>>>>>>> >>>>>>>> I will +1 the discussion Dan is referring to and George's advice. >>>>>>>> >>>>>>>> I just want to double check we are talking about pinning in >>>>>>>> requirements.txt only. >>>>>>>> >>>>>>>> This offers the ability to >>>>>>>> pip install -r requirements.txt >>>>>>>> pip install --no-deps airflow >>>>>>>> For a guaranteed install which works. >>>>>>>> >>>>>>>> Several different requirement files can be provided for specific >>>> use >>>>>>> cases, >>>>>>>> like a stable dev one for instance for people wanting to work on >>>>>>> operators >>>>>>>> and non-core functions. >>>>>>>> >>>>>>>> However, I think we should proactively test in CI against >>> unpinned >>>>>>>> dependencies (though it might be a separate case in the matrix) , >>>> so >>>>>> that >>>>>>>> we get advance warning if possible that things will break. >>>>>>>> CI downtime is not a bad thing here, it actually caught a problem >>>> :) >>>>>>>> >>>>>>>> We should unpin as possible in setup.py to only maintain minimum >>>>>> required >>>>>>>> compatibility. The process of pinning in setup.py is extremely >>>>>>> detrimental >>>>>>>> when you have a large number of python libraries installed with >>>>>> different >>>>>>>> pinned versions. >>>>>>>> >>>>>>>> Best, >>>>>>>> Arthur >>>>>>>> >>>>>>>> On Thu, Oct 4, 2018 at 8:36 AM Dan Davydov >>>>>> <ddavy...@twitter.com.invalid >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Relevant discussion about this: >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fincubator-airflow%2Fpull%2F1809%23issuecomment-257502174&data=01%7C01%7CEKC%40novozymes.com%7Cd31403917b084e3615c208d62aba4c24%7C43d5f49ee03a4d22a2285684196bb001%7C0&sdata=MM%2FoNwkPYR8UtBUczXLfZD2lCp7Ig%2BI%2FL2rFszcoJi8%3D&reserved=0 >>>>>>>>> >>>>>>>>> On Thu, Oct 4, 2018 at 11:25 AM Jarek Potiuk < >>>>>> jarek.pot...@polidea.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> TL;DR; A change is coming in the way how >>>> dependencies/requirements >>>>>> are >>>>>>>>>> specified for Apache Airflow - they will be fixed rather than >>>>>> flexible >>>>>>>>> (== >>>>>>>>>> rather than >=). >>>>>>>>>> >>>>>>>>>> This is follow up after Slack discussion we had with Ash and >>>> Kaxil >>>>> - >>>>>>>>>> summarising what we propose we'll do. >>>>>>>>>> >>>>>>>>>> *Problem:* >>>>>>>>>> During last few weeks we experienced quite a few downtimes of >>>>>> TravisCI >>>>>>>>>> builds (for all PRs/branches including master) as some of the >>>>>>> transitive >>>>>>>>>> dependencies were automatically upgraded. This because in a >>>> number >>>>> of >>>>>>>>>> dependencies we have >= rather than == dependencies. >>>>>>>>>> >>>>>>>>>> Whenever there is a new release of such dependency, it might >>>> cause >>>>>>> chain >>>>>>>>>> reaction with upgrade of transitive dependencies which might >>> get >>>>> into >>>>>>>>>> conflict. >>>>>>>>>> >>>>>>>>>> An example was Flask-AppBuilder vs flask-login transitive >>>>> dependency >>>>>>> with >>>>>>>>>> click. They started to conflict once AppBuilder has released >>>>> version >>>>>>>>>> 1.12.0. >>>>>>>>>> >>>>>>>>>> *Diagnosis:* >>>>>>>>>> Transitive dependencies with "flexible" versions (where >= is >>>> used >>>>>>>>> instead >>>>>>>>>> of ==) is a reason for "dependency hell". We will sooner or >>> later >>>>> hit >>>>>>>>> other >>>>>>>>>> cases where not fixed dependencies cause similar problems with >>>>> other >>>>>>>>>> transitive dependencies. We need to fix-pin them. This causes >>>>>> problems >>>>>>>>> for >>>>>>>>>> both - released versions (cause they stop to work!) and for >>>>>> development >>>>>>>>>> (cause they break master builds in TravisCI and prevent people >>>> from >>>>>>>>>> installing development environment from the scratch. >>>>>>>>>> >>>>>>>>>> *Solution:* >>>>>>>>>> >>>>>>>>>> - Following the old-but-good post >>>>>>>>>> >>>>>> >>>>> >>>> >>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fnvie.com%2Fposts%2Fpin-your-packages%2F&data=01%7C01%7CEKC%40novozymes.com%7Cd31403917b084e3615c208d62aba4c24%7C43d5f49ee03a4d22a2285684196bb001%7C0&sdata=PVE3S4mgki7L%2BcAe104o2cf68wRXolvYXRFmAyiX8gA%3D&reserved=0 >>>>>> we are going to fix the >>>>>>>>>> pinned >>>>>>>>>> dependencies to specific versions (so basically all >>>> dependencies >>>>>> are >>>>>>>>>> "fixed"). >>>>>>>>>> - We will introduce mechanism to be able to upgrade >>>> dependencies >>>>>> with >>>>>>>>>> pip-tools ( >>>>>> >>>>> >>>> >>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjazzband%2Fpip-tools&data=01%7C01%7CEKC%40novozymes.com%7Cd31403917b084e3615c208d62aba4c24%7C43d5f49ee03a4d22a2285684196bb001%7C0&sdata=Kt9CjWrolvpjp7MwIR2nn8EIf9CW9HW02U7GVGyOXMo%3D&reserved=0 >>>>> ). >>>>>> We might also >>>>>>>>> take a >>>>>>>>>> look at pipenv: >>>>>> >>>>> >>>> >>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpipenv.readthedocs.io%2Fen%2Flatest%2F&data=01%7C01%7CEKC%40novozymes.com%7Cd31403917b084e3615c208d62aba4c24%7C43d5f49ee03a4d22a2285684196bb001%7C0&sdata=1tiY6pgX3IbRYC5W0HKr0ER2qMZ3GKYrwmWg%2BUo0tqs%3D&reserved=0 >>>>>>>>>> - People who would like to upgrade some dependencies for >>> their >>>>> PRs >>>>>>>>> will >>>>>>>>>> still be able to do it - but such upgrades will be in their >>> PR >>>>> thus >>>>>>>>> they >>>>>>>>>> will go through TravisCI tests and they will also have to be >>>>>>> specified >>>>>>>>>> with >>>>>>>>>> pinned fixed versions (==). This should be part of review >>>> process >>>>>> to >>>>>>>>>> make >>>>>>>>>> sure new/changed requirements are pinned. >>>>>>>>>> - In release process there will be a point where an upgrade >>>> will >>>>> be >>>>>>>>>> attempted for all requirements (using pip-tools) so that we >>> are >>>>> not >>>>>>>>>> stuck >>>>>>>>>> with older releases. This will be in controlled PR >>> environment >>>>>> where >>>>>>>>>> there >>>>>>>>>> will be time to fix all dependencies without impacting others >>>> and >>>>>>>>> likely >>>>>>>>>> enough time to "vet" such changes (this can be done for >>>>> alpha/beta >>>>>>>>>> releases >>>>>>>>>> for example). >>>>>>>>>> - As a side effect dependencies specification will become far >>>>>> simpler >>>>>>>>>> and straightforward. >>>>>>>>>> >>>>>>>>>> Happy to hear community comments to the proposal. I am happy to >>>>> take >>>>>> a >>>>>>>>> lead >>>>>>>>>> on that, open JIRA issue and implement if this is something >>>>> community >>>>>>> is >>>>>>>>>> happy with. >>>>>>>>>> >>>>>>>>>> J. >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> *Jarek Potiuk, Principal Software Engineer* >>>>>>>>>> Mobile: +48 660 796 129 >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> *Jarek Potiuk, Principal Software Engineer* >>>>>> Mobile: +48 660 796 129 >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> *Jarek Potiuk, Principal Software Engineer* >>>>> Mobile: +48 660 796 129 >>>>> >>>> >>> >>> >>> -- >>> >>> *Jarek Potiuk, Principal Software Engineer* >>> Mobile: +48 660 796 129 >>>