Thank you, Ismaël. I did not know that Avro was not using semantic versioning either.
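
To make the difference between the two upper bounds discussed below concrete, here is a minimal sketch. It is an illustration only, not Beam's setup.py: it uses a simplified tuple comparison instead of full PEP 440 matching, the helper names `parse` and `allowed` are hypothetical, and 1.10.0 stands in for a future Avro release that did not exist when this thread was written.

```python
# Simplified illustration only: pip and setuptools implement full PEP 440
# matching; this sketch handles plain dotted numeric versions and nothing else.

def parse(version):
    """Turn '1.9.2.1' into (1, 9, 2, 1) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def allowed(version, lower, upper, excluded=()):
    """Return True if lower <= version < upper and version is not excluded."""
    v = parse(version)
    if v in {parse(e) for e in excluded}:
        return False
    return parse(lower) <= v < parse(upper)


# 1.10.0 is a hypothetical future release, included for illustration.
candidates = ["1.8.1", "1.9.1", "1.9.2", "1.9.2.1", "1.10.0"]

# Range quoted in this thread as the current one: >=1.8.1,!=1.9.2,<2.0.0
current = [c for c in candidates if allowed(c, "1.8.1", "2.0.0", ["1.9.2"])]

# Same range with the upper bound proposed in this thread: <1.10.0
proposed = [c for c in candidates if allowed(c, "1.8.1", "1.10.0", ["1.9.2"])]

print("current: ", current)
print("proposed:", proposed)
```

With these inputs both ranges skip the broken 1.9.2 and accept 1.9.2.1; the only difference is that the hypothetical 1.10.0 is accepted by the first range and rejected by the second.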

On Thu, Feb 13, 2020 at 9:44 AM Valentyn Tymofieiev <[email protected]> wrote:
> Thank you, Ismaël. Good to know Avro doesn't follow semantic versioning.
> Replied on the PR.
>
> On Thu, Feb 13, 2020 at 5:24 AM Ismaël Mejía <[email protected]> wrote:
>
>> For info Avro has published a new version 1.9.2.1 that fixes the issue:
>> https://issues.apache.org/jira/browse/AVRO-2737
>>
>> I just submitted a PR to make the dependency consistent with Avro
>> versioning and verify that everything works as intended with the
>> upgraded dependency on the python SDK. Can you PTAL?
>> https://github.com/apache/beam/pull/10851
>>
>> On Thu, Feb 13, 2020 at 9:39 AM Ismaël Mejía <[email protected]> wrote:
>>
>>> > I can argue for not pinning and bounding with major version ranges.
>>> > This gives flexibility to users to mix other third party libraries
>>> > that share common dependencies with Beam. Our expectation is that
>>> > dependencies follow semantic versioning and do not introduce breaking
>>> > changes unless there is a major version change. A good example of this
>>> > is Beam's dependency on "pytz>=2018.3". It is a simple wrapper around a
>>> > time zone file. Latest version of the dependency is 2019.3, it is
>>> > updated a few times a year. Beam users do not have to update Beam just
>>> > to be able to use a later version of it since Beam does not pin it.
>>>
>>> Avro does not follow semantic versioning (the first number corresponds
>>> to the version of the Avro binary format the release is compatible
>>> with, the second correspond to the MAJOR and the third to the MINOR in
>>> semver), so we should then fix the upper bound to 1.10.0 instead of
>>> 2.0.0 considering that 1.10.x before the summer and it may contain
>>> breaking changes.
>>>
>>> > There is also a middle ground, where we can pin certain dependencies
>>> > if we are not confident about their releases. And allow ranges for
>>> > rest of the dependencies. In general, we are currently following this
>>> > practice.
>>>
>>> I see your point, like many things in software it is all about
>>> tradeoffs, and it is good to find a middle ground, do we have a robust
>>> reproducible release experience, or do we deal with the annoyance of
>>> doing manual minor version upgrades. Choices choices...
>>>
>>> On Thu, Feb 13, 2020 at 2:26 AM Ahmet Altay <[email protected]> wrote:
>>>
>>>> On Wed, Feb 12, 2020 at 12:54 PM Ismaël Mejía <[email protected]> wrote:
>>>>
>>>>> Independently of the bug in the dependency release the fact that the
>>>>> Beam Python SDK does not have pinned fixed dependency numbers is
>>>>> error-prone. We may continue to have this kind of problems until we
>>>>> fix this (with other dependencies too). In the Java SDK we do not
>>>>> accept such type of dynamic dependency numbers and python should
>>>>> probably follow this practice to avoid issues like the present one.
>>>>>
>>>>> Why don't we just do:
>>>>>
>>>>> 'avro-python3==1.9.1',
>>>>>
>>>>> instead of the current:
>>>>>
>>>>> 'avro-python3>=1.8.1,!=1.9.2,<2.0.0; python_version >= "3.0"',
>>>>>
>>>>
>>>> I agree this is error prone. Your argument for pinning makes sense and
>>>> I agree with it.
>>>>
>>>> I can argue for not pinning and bounding with major version ranges.
>>>> This gives flexibility to users to mix other third party libraries
>>>> that share common dependencies with Beam. Our expectation is that
>>>> dependencies follow semantic versioning and do not introduce breaking
>>>> changes unless there is a major version change. A good example of this
>>>> is Beam's dependency on "pytz>=2018.3". It is a simple wrapper around a
>>>> time zone file. Latest version of the dependency is 2019.3, it is
>>>> updated a few times a year. Beam users do not have to update Beam just
>>>> to be able to use a later version of it since Beam does not pin it.
>>>>
>>>> There is also a middle ground, where we can pin certain dependencies
>>>> if we are not confident about their releases. And allow ranges for
>>>> rest of the dependencies. In general, we are currently following this
>>>> practice.
>>>>
>>>>> On Wed, Feb 12, 2020 at 9:14 PM Ahmet Altay <[email protected]> wrote:
>>>>>
>>>>>> Related: we have dependencies on avro, avro-python3, and fastavro.
>>>>>> fastavro supports both python 2 and 3. Could we reduce this
>>>>>> dependency list and depend only on fastavro? If we need avro and
>>>>>> avro-python3 for the purposes of testing only, we can move them to
>>>>>> test only dependencies.
>>>>>>
>>>>>> +Chamikara Jayalath <[email protected]>, because I vaguely
>>>>>> remember him working on this.
>>>>>>
>>>>>> The reason I am calling for this is the impact of bad dependency
>>>>>> releases are high. All previously released Beam versions will be
>>>>>> impacted. Reducing the dependency list will reduce the risk.
>>>>>>
>>>>>> Ahmet
>>>>>>
>>>>>> On Wed, Feb 12, 2020 at 12:02 PM Ahmet Altay <[email protected]> wrote:
>>>>>>
>>>>>>> Thank you Valentyn!
>>>>>>>
>>>>>>> On Wed, Feb 12, 2020 at 11:32 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>
>>>>>>>> Yes, otherwise all Python tests will continue to fail until Avro
>>>>>>>> comes up with a new release. Sent:
>>>>>>>> https://github.com/apache/beam/pull/10844
>>>>>>>>
>>>>>>>> On Wed, Feb 12, 2020 at 11:08 AM Ahmet Altay <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Should we update Beam's setup.py to skip this avro-python3 version?
>>>>>>>>>
>>>>>>>>> On Wed, Feb 12, 2020 at 10:57 AM Alan Krumholz <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> makes sense. I'll add this workaround for now.
>>>>>>>>>> Thanks so much for your help!
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 12, 2020 at 10:33 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Alan, Dataflow workers preinstall Beam SDK dependencies,
>>>>>>>>>>> including (a working version) of avro-python3. So after reading
>>>>>>>>>>> your email once again, I think in your case you were not able to
>>>>>>>>>>> install Beam SDK locally. So a workaround for you would be to
>>>>>>>>>>> `pip install avro-python3==1.9.1` or `pip install pycodestyle`
>>>>>>>>>>> before installing Beam, until AVRO-2737 is resolved.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:21 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ah, there's already
>>>>>>>>>>>> https://issues.apache.org/jira/browse/AVRO-2737 and it
>>>>>>>>>>>> received attention.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:19 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Opened https://issues.apache.org/jira/browse/AVRO-2738
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's a short repro:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> :~$ docker run -it --entrypoint=/bin/bash python:3.7-stretch
>>>>>>>>>>>>>> root@04b45a100d16:/# pip install avro-python3
>>>>>>>>>>>>>> Collecting avro-python3
>>>>>>>>>>>>>>   Downloading avro-python3-1.9.2.tar.gz (37 kB)
>>>>>>>>>>>>>> ERROR: Command errored out with exit status 1:
>>>>>>>>>>>>>> command: /usr/local/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"'; __file__='"'"'/tmp/pip-install-mmy4vspt/avro-python3/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-mmy4vspt/avro-python3/pip-egg-info
>>>>>>>>>>>>>> cwd: /tmp/pip-install-mmy4vspt/avro-python3/
>>>>>>>>>>>>>> Complete output (5 lines):
>>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>>   File "<string>", line 1, in <module>
>>>>>>>>>>>>>>   File "/tmp/pip-install-mmy4vspt/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>     import pycodestyle
>>>>>>>>>>>>>> ModuleNotFoundError: No module named 'pycodestyle'
>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>> ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
>>>>>>>>>>>>>> root@04b45a100d16:/#
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:14 AM Valentyn Tymofieiev <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, it is a bug in the recent Avro release. We should
>>>>>>>>>>>>>>> report it to the Avro maintainers. The workaround is to
>>>>>>>>>>>>>>> downgrade avro-python3 to 1.9.1, for example via
>>>>>>>>>>>>>>> requirements.txt.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 10:06 AM Steve Niemitz <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> avro-python3 1.9.2 was released on pypi 4 hours ago, and
>>>>>>>>>>>>>>>> added pycodestyle as a dependency, probably related?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 1:03 PM Luke Cwik <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +dev <[email protected]>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> There was recently an update to add autoformatting to the
>>>>>>>>>>>>>>>>> Python SDK[1].
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm seeing this during testing of a PR as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1:
>>>>>>>>>>>>>>>>> https://lists.apache.org/thread.html/448bb5c2d73fbd74eec7aacb5f28fa2f9d791784c2e53a2e3325627a%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 9:57 AM Alan Krumholz <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Some more information for this as I still can't get to
>>>>>>>>>>>>>>>>>> fix it....
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This job is triggered using the beam[gcp] python sdk from
>>>>>>>>>>>>>>>>>> a KubeFlow Pipelines component which runs on top of
>>>>>>>>>>>>>>>>>> docker image:
>>>>>>>>>>>>>>>>>> tensorflow/tensorflow:1.13.1-py3
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I just checked and that image hasn't been updated
>>>>>>>>>>>>>>>>>> recently. I also redeployed my pipeline to another
>>>>>>>>>>>>>>>>>> (older) deployment of KFP and it gives me the same error
>>>>>>>>>>>>>>>>>> (which tells me this isn't an internal KFP problem)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The exact same pipeline/code running on the exact same
>>>>>>>>>>>>>>>>>> image has been running fine for days. Did anything
>>>>>>>>>>>>>>>>>> changed on the beam/dataflow side since yesterday
>>>>>>>>>>>>>>>>>> morning?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for your help! this is a production pipeline that
>>>>>>>>>>>>>>>>>> is not running for us :(
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Feb 12, 2020 at 7:21 AM Alan Krumholz <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi, I have a scheduled daily job that I have been
>>>>>>>>>>>>>>>>>>> running fine in dataflow for days now.
>>>>>>>>>>>>>>>>>>> We haven't changed anything on this code but this
>>>>>>>>>>>>>>>>>>> morning run failed (it couldn't even spin up the job)
>>>>>>>>>>>>>>>>>>> The job submits a setup.py file (that also hasn't
>>>>>>>>>>>>>>>>>>> changed) but maybe is causing the problem? (based on the
>>>>>>>>>>>>>>>>>>> error I'm getting)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Anyone else having the same issue? or know how to fix it?
>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>>>   File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>>>   File "/tmp/pip-install-42zyi89t/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>>>     import pycodestyle
>>>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-42zyi89t/avro-python3/
>>>>>>>>>>>>>>>>>>> ERROR: Complete output from command python setup.py egg_info:
>>>>>>>>>>>>>>>>>>> ERROR: Traceback (most recent call last):
>>>>>>>>>>>>>>>>>>>   File "<string>", line 1, in <module>
>>>>>>>>>>>>>>>>>>>   File "/tmp/pip-install-wrqytf9a/avro-python3/setup.py", line 41, in <module>
>>>>>>>>>>>>>>>>>>>     import pycodestyle
>>>>>>>>>>>>>>>>>>> ImportError: No module named 'pycodestyle'
>>>>>>>>>>>>>>>>>>> ----------------------------------------
>>>>>>>>>>>>>>>>>>> ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-wrqytf9a/avro-python3/
>>>>>>>>>>>>>>>>>>>
