potiuk commented on PR #38753:
URL: https://github.com/apache/airflow/pull/38753#issuecomment-2039233516
It's been very specific - and I am not sure if it is consistently reproducible
- the nature of dependency resolution is that it constantly changes,
depending on what is currently available on PyPI, what state your cache is in,
and some heuristics that might change the resolution following even the
slightest changes. What I know about this case is:
It was (consistently) happening:
1) In the v2-9-test branch during the last 2 days or so
2) ONLY with Python 3.11 (!). 3.8 - 3.10 and 3.12 were fine
(:exploding_head:)
3) In the PROD building step, where we built the airflow package from sources with
`breeze release-management prepare-airflow-package` and used that wheel package
to install airflow with all PROD extras (so all other providers and deps were
supposed to be installed from PyPI).
4) UV cache should be clean and disabled (export UV_NO_CACHE="true") - UV
cache increases the size of the image almost 2x so we disable it.
5) The only packages installed in the venv were `pip==24.0` and `uv==1.28.0`
6) This was the command that failed:
> uv pip install 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
And even for our builds this is a very unusual step - usually we install
airflow with constraints generated by the CI build. But this one does not use
constraints, because this is a `CACHE` build - one that produces a base PROD
image that we use to build subsequent PROD images - and in this case it
almost does not matter what resolution we arrive at, because that particular
step is going to be invalidated anyway, as we will build a different
airflow package next time. So in this case it only matters that this step is
fast and succeeds, so that all the previous layers can be used to build the next
PROD image from a subsequent v2-9-test build faster.
And the error was:
```
#64 6.287 error: Failed to download: google-cloud-bigquery==1.28.2
#64 6.287 Caused by: Couldn't parse metadata of
google_cloud_bigquery-1.28.2-py2.py3-none-any.whl from
https://files.pythonhosted.org/packages/ce/af/89ccb3dd70a86516cb408dd7b7484d2fdd073bdce6405f722f75e6058e66/google_cloud_bigquery-1.28.2-py2.py3-none-any.whl.metadata
#64 6.287 Caused by: after parsing 2.0, found "de" after it, which is
not part of a valid version
#64 6.287 pyarrow (<2.0de,>=1.0.0) ; (python_version >= "3.5") and extra
== 'all'
```
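To illustrate why a clause like `<2.0de` is rejected, here is a minimal Python sketch that checks each clause of a specifier set against a version pattern. It is not how `uv` parses metadata: `invalid_clauses`, `CANONICAL_VERSION` and `OPERATORS` are names made up for this illustration, and the pattern is adapted from the canonical-form regex in PEP 440 (Appendix B), which is stricter than what real installers accept (it does not handle wildcards such as `==1.*`, for example).

```python
import re

# Canonical-form version pattern adapted from PEP 440, Appendix B.
# Assumption: used here only as a strict approximation of a real parser.
CANONICAL_VERSION = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)

# Longer operators come first so that "<=" is not mistaken for "<".
OPERATORS = ("===", "~=", "==", "!=", "<=", ">=", "<", ">")


def invalid_clauses(specifier_set):
    """Return the clauses whose version part does not match the pattern."""
    bad = []
    for clause in specifier_set.split(","):
        clause = clause.strip()
        if not clause:
            continue
        for op in OPERATORS:
            if clause.startswith(op):
                version = clause[len(op):].strip()
                break
        else:
            # No known comparison operator at the start of the clause.
            bad.append(clause)
            continue
        if not CANONICAL_VERSION.match(version):
            bad.append(clause)
    return bad
```

Calling `invalid_clauses("<2.0de,>=1.0.0")` flags only the `<2.0de` clause, which is the one the error message above complains about.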
You can see one of the failing builds here:
https://github.com/apache/airflow/actions/runs/8555464849/job/23446007205#step:10:3677
Corresponding builds for other Python versions resulted in:
```
#60 1.344 + uv pip install
'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]
@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
#60 9.585 Resolved 339 packages in 8.23s
#60 16.38 Downloaded 336 packages in 6.78s
#60 17.14 Installed 336 packages in 761ms
#60 17.14 + adal==1.2.7
#60 17.14 + adlfs==2024.2.0
...
```
You can see more failed runs here:
https://github.com/apache/airflow/actions?query=branch%3Av2-9-test - it WAS
consistently happening until I switched the builds to use `pip`. For example:
https://github.com/apache/airflow/actions/runs/8559085301/job/23458003710#step:10:3718
This one takes a bit longer as expected (145 s) - but works.
```
#64 2.532 + pip install --root-user-action ignore
'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]
@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
#64 3.259 Processing
/docker-context-files/apache_airflow-2.9.0-py3-none-any.whl (from
apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@
file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
#64 3.627 Collecting alembic<2.0,>=1.13.1 (from apache-airflow@
file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl->apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@
file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
...
#64 145.0 Successfully installed Babel-2.14.0 Flask-Babel-2.0.0
Flask-JWT-Extended-4.6.0 Flask-Limiter-3.5.1 Flask-SQLAlchemy-2.5.1 Mako-1.3.2
PyAthena-3.6.0 PyOpenSSL-24.1.0 PyYAML-6.0.1 WTForms-3.1.2 adal-1.2.7
adlfs-2024.2.0 aiobotocore-2.12.2 aiofiles-23.2.1 aiohttp-3.9.3
aioitertools-0.11.0 aiosignal-1.3.1 alembic-1.13.1 amqp-5.2.0
annotated-types-0.6.0 anyio-4.3.0 apache-airflow-2.9.0
apache-airflow-providers-amazon-8.19.0 apache-airflow-providers-celery-3.6.1
apache-airflow-providers-cncf-kubernetes-8.0.1
apache-airflow-providers-common-io-1.3.0
apache-airflow-providers-common-sql-1.11.1
apache-airflow-providers-docker-3.9.2
apache-airflow-providers-elasticsearch-5.3.3 apache-airflow-providers-fab-1.0.2
apache-airflow-providers-ftp-3.7.0 apache-airflow-providers-google-10.16.0
apache-airflow-providers-grpc-3.4.1 apache-airflow-providers-hashicorp-3.6.4
apache-airflow-providers-http-4.10.0 apache-airflow-providers-imap-3.5.0
apache-airflow-providers-microsoft-azure-9.0.1
apache-airflow-providers-mysql-5.5.4 apache-airflow-providers-odbc-4.4.1
apache-airflow-providers-openlineage-1.6.0
apache-airflow-providers-postgres-5.10.2 apache-airflow-providers-redis-3.6.0
apache-airflow-providers-sendgrid-3.4.0 apache-airflow-providers-sftp-4.9.0
apache-airflow-providers-slack-8.6.1 apache-airflow-providers-smtp-1.6.1
apache-airflow-providers-snowflake-5.3.1 apache-airflow-providers-sqlite-3.7.1
apache-airflow-providers-ssh-3.10.1 apispec-6.6.0 argcomplete-3.2.3
asgiref-3.8.1 asn1crypto-1.5.1 asyncssh-2.14.2 attrs-23.2.0 authlib-1.3.0
azure-batch-14.2.0 azure-common-1.1.28 azure-core-1.30.1 azure-cosmos-4.6.0
azure-datalake-store-0.0.53 azure-identity-1.15.0 azure-keyvault-secrets-4.8.0
azure-kusto-data-4.3.1 azure-mgmt-containerinstance-10.1.0
azure-mgmt-containerregistry-10.3.0 azure-mgmt-core-1.4.0
azure-mgmt-cosmosdb-9.4.0 azure-mgmt-datafactory-6.1.0
azure-mgmt-datalake-nspkg-3.0.1 azure-mgmt-datalake-store-0.5.0
azure-mgmt-nspkg-3.0.2 azure-mgmt-resource-23.0.1
azure-mgmt-storage-21.1.0 azure-nspkg-3.0.2 azure-servicebus-7.12.1
azure-storage-blob-12.19.1 azure-storage-file-datalake-12.14.0
azure-storage-file-share-12.15.0 azure-synapse-artifacts-0.18.0
azure-synapse-spark-0.7.0 backoff-2.2.1 bcrypt-4.1.2 beautifulsoup4-4.12.3
billiard-4.2.0 blinker-1.7.0 boto3-1.34.51 botocore-1.34.51 cachelib-0.9.0
cachetools-5.3.3 cattrs-23.2.3 celery-5.3.6 certifi-2024.2.2 cffi-1.16.0
chardet-5.2.0 charset-normalizer-3.3.2 click-8.1.7 click-didyoumean-0.3.1
click-plugins-1.1.1 click-repl-0.3.0 clickclick-20.10.2 colorama-0.4.6
colorlog-4.8.0 configupdater-3.2 connexion-2.14.2 cron-descriptor-1.4.3
croniter-2.0.3 cryptography-41.0.7 db-dtypes-1.2.0 decorator-5.1.1
deprecated-1.2.14 dill-0.3.8 distlib-0.3.8 dnspython-2.6.1 docker-7.0.0
docstring-parser-0.16 docutils-0.20.1 elastic-transport-8.13.0
elasticsearch-8.13.0 email-validator-2.1.1 eventlet-0.36.1 filelock-3.13.3
flask-2.2.5 flask-appbuilder-4.4.1 flask-caching-2.1.0 flask-login-0.6.3
flask-session-0.5.0 flask-wtf-1.2.1 flower-2.0.1 frozenlist-1.4.1 fsspec-2024.3.1
gcloud-aio-auth-4.2.3 gcloud-aio-bigquery-7.1.0 gcloud-aio-storage-9.2.0
gcsfs-2024.3.1 gevent-24.2.1 google-ads-23.1.0 google-analytics-admin-0.22.7
google-api-core-2.18.0 google-api-python-client-2.125.0 google-auth-2.29.0
google-auth-httplib2-0.2.0 google-auth-oauthlib-1.2.0
google-cloud-aiplatform-1.46.0 google-cloud-appengine-logging-1.4.3
google-cloud-audit-log-0.2.5 google-cloud-automl-2.13.3
google-cloud-batch-0.17.17 google-cloud-bigquery-3.20.1
google-cloud-bigquery-datatransfer-3.15.1 google-cloud-bigtable-2.23.0
google-cloud-build-3.24.0 google-cloud-compute-1.18.0
google-cloud-container-2.45.0 google-cloud-core-2.4.1
google-cloud-datacatalog-3.19.0 google-cloud-dataflow-client-0.8.10
google-cloud-dataform-0.5.9 google-cloud-dataplex-1.13.0
google-cloud-dataproc-5.9.3 google-cloud-dataproc-metastore-1.15.3
google-cloud-dlp-3.16.0 google-cloud-kms-2.21.3 google-cloud-language-2.13.3
google-cloud-logging-3.10.0 google-cloud-memcache-1.9.3 google-cloud-monitoring-2.19.3
google-cloud-orchestration-airflow-1.12.1 google-cloud-os-login-2.14.3
google-cloud-pubsub-2.21.1 google-cloud-redis-2.15.3
google-cloud-resource-manager-1.12.3 google-cloud-run-0.10.5
google-cloud-secret-manager-2.19.0 google-cloud-spanner-3.44.0
google-cloud-speech-2.26.0 google-cloud-storage-2.16.0
google-cloud-storage-transfer-1.11.3 google-cloud-tasks-2.16.3
google-cloud-texttospeech-2.16.3 google-cloud-translate-3.15.3
google-cloud-videointelligence-2.13.3 google-cloud-vision-3.7.2
google-cloud-workflows-1.14.3 google-crc32c-1.5.0 google-re2-1.1
google-resumable-media-2.7.0 googleapis-common-protos-1.63.0 graphviz-0.20.3
greenlet-3.0.3 grpc-google-iam-v1-0.13.0 grpc-interceptor-0.15.4 grpcio-1.62.1
grpcio-gcp-0.2.2 grpcio-status-1.62.1 gunicorn-21.2.0 h11-0.14.0 httpcore-1.0.5
httplib2-0.22.0 httpx-0.27.0 humanize-4.9.0 hvac-2.1.0 idna-3.6 ijson-3.2.3
importlib-resources-6.4.0 importlib_metadata-7.0.0
inflection-0.5.1 isodate-0.6.1 itsdangerous-2.1.2 jinja2-3.1.3 jmespath-1.0.1
json-merge-patch-0.2 jsonpath_ng-1.6.1 jsonschema-4.21.1
jsonschema-specifications-2023.12.1 kombu-5.3.6 kubernetes-29.0.0
kubernetes_asyncio-29.0.0 lazy-object-proxy-1.10.0 ldap3-2.9.1 limits-3.10.1
linkify-it-py-2.0.3 lockfile-0.12.2 looker-sdk-24.2.1 lxml-5.2.1
markdown-it-py-3.0.0 markupsafe-2.1.5 marshmallow-3.21.1
marshmallow-oneofschema-3.1.1 marshmallow-sqlalchemy-0.28.2
mdit-py-plugins-0.4.0 mdurl-0.1.2 more-itertools-10.2.0 msal-1.28.0
msal-extensions-1.1.0 msrest-0.7.1 msrestazure-0.6.4 multidict-6.0.5
mysql-connector-python-8.3.0 mysqlclient-2.2.4 numpy-1.26.4 oauthlib-3.2.2
openlineage-integration-common-1.11.1 openlineage-python-1.11.1
openlineage-sql-1.11.1 opentelemetry-api-1.24.0
opentelemetry-exporter-otlp-1.24.0
opentelemetry-exporter-otlp-proto-common-1.24.0
opentelemetry-exporter-otlp-proto-grpc-1.24.0
opentelemetry-exporter-otlp-proto-http-1.24.0 opentelemetry-proto-1.24.0
opentelemetry-sdk-1.24.0 opentelemetry-semantic-conventions-0.45b0 ordered-set-4.1.0
pandas-2.1.4 pandas-gbq-0.22.0 paramiko-3.4.0 pathspec-0.12.1 pendulum-3.0.0
platformdirs-3.11.0 pluggy-1.4.0 ply-3.11 portalocker-2.8.2 prison-0.2.1
prometheus-client-0.20.0 prompt-toolkit-3.0.43 proto-plus-1.23.0
protobuf-4.25.3 psutil-5.9.8 psycopg2-binary-2.9.9 pyarrow-15.0.2 pyasn1-0.5.1
pyasn1-modules-0.3.0 pycparser-2.22 pydantic-2.6.4 pydantic-core-2.16.3
pydata-google-auth-1.8.2 pygments-2.17.2 pyjwt-2.8.0 pynacl-1.5.0 pyodbc-5.1.0
pyparsing-3.1.2 python-daemon-3.0.1 python-dateutil-2.9.0.post0
python-dotenv-1.0.1 python-http-client-3.3.7 python-ldap-3.4.4
python-nvd3-0.15.0 python-slugify-8.0.4 pytz-2024.1 redis-4.6.0
redshift_connector-2.1.0 referencing-0.34.0 requests-2.31.0
requests-oauthlib-2.0.0 requests_toolbelt-1.0.0 rfc3339-validator-0.1.4
rich-13.7.1 rich-argparse-1.4.0 rpds-py-0.18.0 rsa-4.9 s3transfer-0.10.1
scramp-1.4.4 sendgrid-6.11.0 setproctitle-1.3.3 setuptools-66.1.1 shapely-2.0.3
six-1.16.0 slack_sdk-3.27.1 sniffio-1.3.1 snowflake-connector-python-3.7.1
snowflake-sqlalchemy-1.5.1 sortedcontainers-2.4.0 soupsieve-2.5
sqlalchemy-1.4.52 sqlalchemy-bigquery-1.10.0 sqlalchemy-jsonfield-1.0.2
sqlalchemy-spanner-1.6.2 sqlalchemy-utils-0.41.2 sqlalchemy_redshift-0.8.14
sqlparse-0.4.4 sshtunnel-0.4.0 starkbank-ecdsa-2.2.0 statsd-4.0.1
tabulate-0.9.0 tenacity-8.2.3 termcolor-2.4.0 text-unidecode-1.3
time-machine-2.14.1 tomlkit-0.12.4 tornado-6.4 typing-extensions-4.10.0
tzdata-2024.1 uc-micro-py-1.0.3 unicodecsv-0.14.1 universal-pathlib-0.2.2
uritemplate-4.1.1 urllib3-2.0.7 vine-5.1.0 virtualenv-20.25.1 watchtower-3.1.0
wcwidth-0.2.13 websocket-client-1.7.0 werkzeug-2.2.3 wrapt-1.16.0 yarl-1.9.4
zipp-3.18.1 zope.event-5.0 zope.interface-6.2
```
There is an issue I created about it:
https://github.com/astral-sh/uv/issues/2821 that `uv` maintainers seem to be
eager to fix quite soon, and there is a somewhat similar issue (at least with very
strange backtracking and failing on some old versions of transitive packages)
created by @notatallshaw, who actively works on testing and verifying `pip`
and `uv` resolution algorithms and uses airflow as quite a testing ground:
https://github.com/astral-sh/uv/issues/1560 - where the heuristics of `uv` give
different results than `pip` (which is pretty much expected, as in many cases -
especially `apache-airflow[all]` - there are multiple matching solutions).
What I found so far is that our CI builds (where we use `devel-all` and
`--resolution highest`) usually give very close results with `uv` and `pip`'s
`--eager-upgrade` - so I continue using constraint generation with `uv` as it
is way faster.
But installing just `airflow[some deps]` without `--resolution highest` or
`--eager-upgrade` often gives quite different results for `uv` and `pip` - in
this case, for example, UV installed airflow with not-the-latest-google-provider
- that's why switching PROD runs in release branches to `pip` is
likely to stay.
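One way to see such differences concretely is to diff the versions reported in the two build logs. The following is only a rough Python sketch under a few assumptions: the `uv` log lists installs as `+ name==version` lines and the `pip` log has a single `Successfully installed name-version ...` entry, as in the logs quoted above. The helper names (`normalize`, `parse_uv_log`, `parse_pip_log`, `diff_pins`) are made up for this illustration and are not part of either tool.

```python
import re


def normalize(name):
    """Normalize a project name so that '_', '.' and '-' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_uv_log(log):
    """Collect {name: version} from lines ending in '+ name==version'."""
    pins = {}
    for line in log.splitlines():
        match = re.search(r"\+ (\S+)==(\S+)\s*$", line)
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def parse_pip_log(log):
    """Collect {name: version} from pip's 'Successfully installed' entry."""
    # Collapse whitespace so an entry wrapped over several lines is one string.
    text = " ".join(log.split())
    _, _, tail = text.partition("Successfully installed ")
    pins = {}
    for token in tail.split():
        # The version follows the last '-' of each 'name-version' token.
        name, sep, version = token.rpartition("-")
        if sep and name:
            pins[normalize(name)] = version
    return pins


def diff_pins(uv_pins, pip_pins):
    """Return {name: (uv_version, pip_version)} where the two disagree."""
    common = sorted(uv_pins.keys() & pip_pins.keys())
    return {n: (uv_pins[n], pip_pins[n]) for n in common if uv_pins[n] != pip_pins[n]}
```

Feeding the saved logs of a `uv` run and a `pip` run into `diff_pins(parse_uv_log(...), parse_pip_log(...))` gives a quick list of packages where the two resolvers picked different versions.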