potiuk commented on PR #38753: URL: https://github.com/apache/airflow/pull/38753#issuecomment-2039233516
It's been very specific, and I am not sure it is consistently reproducible. This is the nature of dependency resolution: it constantly changes depending on what is currently available in PyPI, what state your cache is in, and some heuristics that might change the resolution following even the slightest changes.

What I know about this case is that it was (consistently) happening:

1) In the v2-9-test branch during the last 2 days or so
2) ONLY with Python 3.11 (!). 3.8 - 3.10 and 3.12 were fine (:exploding_head:)
3) In the PROD building step, where we build the airflow package from sources with `breeze release-management prepare-airflow-package` and use that wheel package to install airflow with all PROD extras (so all other providers and deps were supposed to be installed from PyPI).
4) UV cache should be clean and disabled (export UV_NO_CACHE="true") - UV cache increases the size of the image almost 2x so we disable it.
5) The only packages installed in the venv were `pip==24.0` and `uv==1.28.0`
6) This was the command that failed:

> uv pip install 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,
> docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,
> microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,
> slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'

And even for our builds this is a very unusual step - usually we install airflow with constraints generated with the CI build.
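
The requirement in the failing command is a direct reference: a package name, a bracketed comma-separated list of extras, and `@ file://...` pointing at the locally built wheel. A minimal sketch of how such a requirement and the two installer invocations could be assembled is below. The helpers `direct_reference` and `install_command` are hypothetical names used only for illustration (they are not part of breeze or the Airflow Dockerfile), and the list of extras is shortened.

```python
import shlex


def direct_reference(name, extras, wheel_path):
    """Build a direct-reference requirement for a local wheel (hypothetical helper)."""
    # Extras are comma-separated inside square brackets, with no spaces.
    extras_part = ",".join(extras)
    # An absolute POSIX path after "file://" yields the "file:///..." form.
    return f"{name}[{extras_part}] @ file://{wheel_path}"


def install_command(installer, requirement):
    """Return the argv list for the chosen installer (hypothetical helper)."""
    if installer == "uv":
        return ["uv", "pip", "install", requirement]
    # Mirrors the pip invocation visible in the successful build log.
    return ["pip", "install", "--root-user-action", "ignore", requirement]


requirement = direct_reference(
    "apache-airflow",
    ["amazon", "celery", "google"],  # shortened list, for illustration only
    "/docker-context-files/apache_airflow-2.9.0-py3-none-any.whl",
)

# Print shell-quoted commands; nothing is executed here.
print(shlex.join(install_command("uv", requirement)))
print(shlex.join(install_command("pip", requirement)))
```

Only the installer changes between the two invocations; the requirement string stays the same, so any difference in the outcome comes from how each tool resolves it.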

But this one does not use constraints, because this is a `CACHE` build - one that produces a base PROD image that we use to build subsequent PROD images. In this case it almost does not matter what resolution we arrive at, because that particular step is going to be invalidated anyway: we will build a different airflow package next time. So here it only matters that this step is fast and succeeds, so that all the previous layers can be used to build the next PROD image from a subsequent v2-9-test build faster.

And the error was:

```
#64 6.287 error: Failed to download: google-cloud-bigquery==1.28.2
#64 6.287 Caused by: Couldn't parse metadata of google_cloud_bigquery-1.28.2-py2.py3-none-any.whl from https://files.pythonhosted.org/packages/ce/af/89ccb3dd70a86516cb408dd7b7484d2fdd073bdce6405f722f75e6058e66/google_cloud_bigquery-1.28.2-py2.py3-none-any.whl.metadata
#64 6.287 Caused by: after parsing 2.0, found "de" after it, which is not part of a valid version
#64 6.287 pyarrow (<2.0de,>=1.0.0) ; (python_version >= "3.5") and extra == 'all'
```

You can see one of the failing builds here: https://github.com/apache/airflow/actions/runs/8555464849/job/23446007205#step:10:3677

Corresponding builds for other Python versions resulted in:

```
#60 1.344 + uv pip install 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
#60 9.585 Resolved 339 packages in 8.23s
#60 16.38 Downloaded 336 packages in 6.78s
#60 17.14 Installed 336 packages in 761ms
#60 17.14 + adal==1.2.7
#60 17.14 + adlfs==2024.2.0
...
```

You can see more failed runs here: https://github.com/apache/airflow/actions?query=branch%3Av2-9-test - it WAS consistently happening until I switched the builds to use `pip`.

For example: https://github.com/apache/airflow/actions/runs/8559085301/job/23458003710#step:10:3718

This one takes a bit longer, as expected (145 s) - but works.

```
#64 2.532 + pip install --root-user-action ignore 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
#64 3.259 Processing /docker-context-files/apache_airflow-2.9.0-py3-none-any.whl (from apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
#64 3.627 Collecting alembic<2.0,>=1.13.1 (from apache-airflow@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl->apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
...
#64 145.0 Successfully installed Babel-2.14.0 Flask-Babel-2.0.0 Flask-JWT-Extended-4.6.0 Flask-Limiter-3.5.1 Flask-SQLAlchemy-2.5.1 Mako-1.3.2 PyAthena-3.6.0 PyOpenSSL-24.1.0 PyYAML-6.0.1 WTForms-3.1.2 adal-1.2.7 adlfs-2024.2.0 aiobotocore-2.12.2 aiofiles-23.2.1 aiohttp-3.9.3 aioitertools-0.11.0 aiosignal-1.3.1 alembic-1.13.1 amqp-5.2.0 annotated-types-0.6.0 anyio-4.3.0 apache-airflow-2.9.0 apache-airflow-providers-amazon-8.19.0 apache-airflow-providers-celery-3.6.1 apache-airflow-providers-cncf-kubernetes-8.0.1 apache-airflow-providers-common-io-1.3.0 apache-airflow-providers-common-sql-1.11.1 apache-airflow-providers-docker-3.9.2 apache-airflow-providers-elasticsearch-5.3.3 apache-airflow-providers-fab-1.0.2 apache-airflow-providers-ftp-3.7.0 apache-airflow-providers-google-10.16.0 apache-airflow-providers-grpc-3.4.1 apache-airflow-providers-hashicorp-3.6.4 apache-airflow-providers-http-4.10.0 apache-airflow-providers-imap-3.5.0 apache-airflow-providers-microsoft-azure-9.0.1 apache-airflow-providers-mysql-5.5.4 apache-airflow-providers-odbc-4.4.1 apache-airflow-providers-openlineage-1.6.0 apache-airflow-providers-postgres-5.10.2 apache-airflow-providers-redis-3.6.0 apache-airflow-providers-sendgrid-3.4.0 apache-airflow-providers-sftp-4.9.0 apache-airflow-providers-slack-8.6.1 apache-airflow-providers-smtp-1.6.1 apache-airflow-providers-snowflake-5.3.1 apache-airflow-providers-sqlite-3.7.1 apache-airflow-providers-ssh-3.10.1 apispec-6.6.0 argcomplete-3.2.3 asgiref-3.8.1 asn1crypto-1.5.1 asyncssh-2.14.2 attrs-23.2.0 authlib-1.3.0 azure-batch-14.2.0 azure-common-1.1.28 azure-core-1.30.1 azure-cosmos-4.6.0 azure-datalake-store-0.0.53 azure-identity-1.15.0 azure-keyvault-secrets-4.8.0 azure-kusto-data-4.3.1 azure-mgmt-containerinstance-10.1.0 azure-mgmt-containerregistry-10.3.0 azure-mgmt-core-1.4.0 azure-mgmt-cosmosdb-9.4.0 azure-mgmt-datafactory-6.1.0 azure-mgmt-datalake-nspkg-3.0.1 azure-mgmt-datalake-store-0.5.0 azure-mgmt-nspkg-3.0.2 azure-mgmt-resource-23.0.1 azure-mgmt-storage-21.1.0 azure-nspkg-3.0.2 azure-servicebus-7.12.1 azure-storage-blob-12.19.1 azure-storage-file-datalake-12.14.0 azure-storage-file-share-12.15.0 azure-synapse-artifacts-0.18.0 azure-synapse-spark-0.7.0 backoff-2.2.1 bcrypt-4.1.2 beautifulsoup4-4.12.3 billiard-4.2.0 blinker-1.7.0 boto3-1.34.51 botocore-1.34.51 cachelib-0.9.0 cachetools-5.3.3 cattrs-23.2.3 celery-5.3.6 certifi-2024.2.2 cffi-1.16.0 chardet-5.2.0 charset-normalizer-3.3.2 click-8.1.7 click-didyoumean-0.3.1 click-plugins-1.1.1 click-repl-0.3.0 clickclick-20.10.2 colorama-0.4.6 colorlog-4.8.0 configupdater-3.2 connexion-2.14.2 cron-descriptor-1.4.3 croniter-2.0.3 cryptography-41.0.7 db-dtypes-1.2.0 decorator-5.1.1 deprecated-1.2.14 dill-0.3.8 distlib-0.3.8 dnspython-2.6.1 docker-7.0.0 docstring-parser-0.16 docutils-0.20.1 elastic-transport-8.13.0 elasticsearch-8.13.0 email-validator-2.1.1 eventlet-0.36.1 filelock-3.13.3 flask-2.2.5 flask-appbuilder-4.4.1 flask-caching-2.1.0 flask-login-0.6.3 flask-session-0.5.0 flask-wtf-1.2.1 flower-2.0.1 frozenlist-1.4.1 fsspec-2024.3.1 gcloud-aio-auth-4.2.3 gcloud-aio-bigquery-7.1.0 gcloud-aio-storage-9.2.0 gcsfs-2024.3.1 gevent-24.2.1 google-ads-23.1.0 google-analytics-admin-0.22.7 google-api-core-2.18.0 google-api-python-client-2.125.0 google-auth-2.29.0 google-auth-httplib2-0.2.0 google-auth-oauthlib-1.2.0 google-cloud-aiplatform-1.46.0 google-cloud-appengine-logging-1.4.3 google-cloud-audit-log-0.2.5 google-cloud-automl-2.13.3 google-cloud-batch-0.17.17 google-cloud-bigquery-3.20.1 google-cloud-bigquery-datatransfer-3.15.1 google-cloud-bigtable-2.23.0 google-cloud-build-3.24.0 google-cloud-compute-1.18.0 google-cloud-container-2.45.0 google-cloud-core-2.4.1 google-cloud-datacatalog-3.19.0 google-cloud-dataflow-client-0.8.10 google-cloud-dataform-0.5.9 google-cloud-dataplex-1.13.0 google-cloud-dataproc-5.9.3 google-cloud-dataproc-metastore-1.15.3 google-cloud-dlp-3.16.0 google-cloud-kms-2.21.3 google-cloud-language-2.13.3 google-cloud-logging-3.10.0 google-cloud-memcache-1.9.3 google-cloud-monitoring-2.19.3 google-cloud-orchestration-airflow-1.12.1 google-cloud-os-login-2.14.3 google-cloud-pubsub-2.21.1 google-cloud-redis-2.15.3 google-cloud-resource-manager-1.12.3 google-cloud-run-0.10.5 google-cloud-secret-manager-2.19.0 google-cloud-spanner-3.44.0 google-cloud-speech-2.26.0 google-cloud-storage-2.16.0 google-cloud-storage-transfer-1.11.3 google-cloud-tasks-2.16.3 google-cloud-texttospeech-2.16.3 google-cloud-translate-3.15.3 google-cloud-videointelligence-2.13.3 google-cloud-vision-3.7.2 google-cloud-workflows-1.14.3 google-crc32c-1.5.0 google-re2-1.1 google-resumable-media-2.7.0 googleapis-common-protos-1.63.0 graphviz-0.20.3 greenlet-3.0.3 grpc-google-iam-v1-0.13.0 grpc-interceptor-0.15.4 grpcio-1.62.1 grpcio-gcp-0.2.2 grpcio-status-1.62.1 gunicorn-21.2.0 h11-0.14.0 httpcore-1.0.5 httplib2-0.22.0 httpx-0.27.0 humanize-4.9.0 hvac-2.1.0 idna-3.6 ijson-3.2.3 importlib-resources-6.4.0 importlib_metadata-7.0.0 inflection-0.5.1 isodate-0.6.1 itsdangerous-2.1.2 jinja2-3.1.3 jmespath-1.0.1 json-merge-patch-0.2 jsonpath_ng-1.6.1 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 kombu-5.3.6 kubernetes-29.0.0 kubernetes_asyncio-29.0.0 lazy-object-proxy-1.10.0 ldap3-2.9.1 limits-3.10.1 linkify-it-py-2.0.3 lockfile-0.12.2 looker-sdk-24.2.1 lxml-5.2.1 markdown-it-py-3.0.0 markupsafe-2.1.5 marshmallow-3.21.1 marshmallow-oneofschema-3.1.1 marshmallow-sqlalchemy-0.28.2 mdit-py-plugins-0.4.0 mdurl-0.1.2 more-itertools-10.2.0 msal-1.28.0 msal-extensions-1.1.0 msrest-0.7.1 msrestazure-0.6.4 multidict-6.0.5 mysql-connector-python-8.3.0 mysqlclient-2.2.4 numpy-1.26.4 oauthlib-3.2.2 openlineage-integration-common-1.11.1 openlineage-python-1.11.1 openlineage-sql-1.11.1 opentelemetry-api-1.24.0 opentelemetry-exporter-otlp-1.24.0 opentelemetry-exporter-otlp-proto-common-1.24.0 opentelemetry-exporter-otlp-proto-grpc-1.24.0 opentelemetry-exporter-otlp-proto-http-1.24.0 opentelemetry-proto-1.24.0 opentelemetry-sdk-1.24.0 opentelemetry-semantic-conventions-0.45b0 ordered-set-4.1.0 pandas-2.1.4 pandas-gbq-0.22.0 paramiko-3.4.0 pathspec-0.12.1 pendulum-3.0.0 platformdirs-3.11.0 pluggy-1.4.0 ply-3.11 portalocker-2.8.2 prison-0.2.1 prometheus-client-0.20.0 prompt-toolkit-3.0.43 proto-plus-1.23.0 protobuf-4.25.3 psutil-5.9.8 psycopg2-binary-2.9.9 pyarrow-15.0.2 pyasn1-0.5.1 pyasn1-modules-0.3.0 pycparser-2.22 pydantic-2.6.4 pydantic-core-2.16.3 pydata-google-auth-1.8.2 pygments-2.17.2 pyjwt-2.8.0 pynacl-1.5.0 pyodbc-5.1.0 pyparsing-3.1.2 python-daemon-3.0.1 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 python-http-client-3.3.7 python-ldap-3.4.4 python-nvd3-0.15.0 python-slugify-8.0.4 pytz-2024.1 redis-4.6.0 redshift_connector-2.1.0 referencing-0.34.0 requests-2.31.0 requests-oauthlib-2.0.0 requests_toolbelt-1.0.0 rfc3339-validator-0.1.4 rich-13.7.1 rich-argparse-1.4.0 rpds-py-0.18.0 rsa-4.9 s3transfer-0.10.1 scramp-1.4.4 sendgrid-6.11.0 setproctitle-1.3.3 setuptools-66.1.1 shapely-2.0.3 six-1.16.0 slack_sdk-3.27.1 sniffio-1.3.1 snowflake-connector-python-3.7.1 snowflake-sqlalchemy-1.5.1 sortedcontainers-2.4.0 soupsieve-2.5 sqlalchemy-1.4.52 sqlalchemy-bigquery-1.10.0 sqlalchemy-jsonfield-1.0.2 sqlalchemy-spanner-1.6.2 sqlalchemy-utils-0.41.2 sqlalchemy_redshift-0.8.14 sqlparse-0.4.4 sshtunnel-0.4.0 starkbank-ecdsa-2.2.0 statsd-4.0.1 tabulate-0.9.0 tenacity-8.2.3 termcolor-2.4.0 text-unidecode-1.3 time-machine-2.14.1 tomlkit-0.12.4 tornado-6.4 typing-extensions-4.10.0 tzdata-2024.1 uc-micro-py-1.0.3 unicodecsv-0.14.1 universal-pathlib-0.2.2 uritemplate-4.1.1 urllib3-2.0.7 vine-5.1.0 virtualenv-20.25.1 watchtower-3.1.0 wcwidth-0.2.13 websocket-client-1.7.0 werkzeug-2.2.3 wrapt-1.16.0 yarl-1.9.4 zipp-3.18.1 zope.event-5.0 zope.interface-6.2
```

There is an issue I created about it: https://github.com/astral-sh/uv/issues/2821 that `uv` maintainers seem to be eager to fix quite soon, and there is a somewhat similar issue (at least with very strange backtracking and failing on some old versions of transitive packages) created by @notatallshaw, who actively works on testing and verifying `pip` and `uv` resolution algorithms and uses airflow as quite a testing ground: https://github.com/astral-sh/uv/issues/1560 - where the heuristic of `uv` gives different results than `pip` (which is pretty expected, as in many cases - especially `apache-airflow[all]` - there are multiple matching solutions).

What I found so far is that our CI builds (where we use `devel-all` and `--resolution highest`) usually give very close results with `uv` and `pip`'s `--eager-upgrade` - so I continue using constraint generation using `uv` as it is way faster. But installing just `airflow[some deps]` without `--resolution highest` or `--eager-upgrade` often gives quite different results for `uv` and `pip` - in this case, for example, UV installed airflow with not-the-latest-google-provider - that's why switching PROD runs in release branches to `pip` is likely to stay.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org