potiuk commented on PR #38753:
URL: https://github.com/apache/airflow/pull/38753#issuecomment-2039233516

   This was a very specific case, and I am not sure whether it is consistently 
reproducible. That is the nature of dependency resolution: it changes 
constantly, depending on what is currently available on PyPI, what state your 
cache is in, and heuristics that can change the resolution after even the 
slightest change. What I know about this case is:
   
   It was (consistently) happening:
   
   1) In the v2-9-test branch during the last 2 days or so
   2) ONLY with Python 3.11 (!); 3.8 - 3.10 and 3.12 were fine 
(:exploding_head:)
   3) In the PROD build step, where we build the airflow package from sources 
(`breeze release-management prepare-airflow-package`) and use that wheel to 
install airflow with all PROD extras (so all other providers and deps were 
supposed to be installed from PyPI)
   4) The UV cache should be clean and disabled (`export UV_NO_CACHE="true"`) - 
the UV cache increases the size of the image almost 2x, so we disable it
   5) The only packages installed in the venv were `pip==24.0` and `uv==1.28.0`
   6) This was the command that failed:
   
   > uv pip install 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
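
The requirement in that command is a PEP 508 direct reference: a project name, an optional list of extras in square brackets, then `@` and a URL pointing at the locally built wheel. A minimal Python sketch of how such a string is put together (the helper name `direct_reference` is hypothetical and not part of breeze, and the extras list is shortened for readability):

```python
def direct_reference(name: str, extras: list[str], url: str) -> str:
    """Build a PEP 508 direct-reference requirement string."""
    extras_part = f"[{','.join(extras)}]" if extras else ""
    return f"{name}{extras_part} @ {url}"


requirement = direct_reference(
    "apache-airflow",
    ["amazon", "google", "uv"],  # shortened; the real command lists all PROD extras
    "file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl",
)
print(requirement)
```

Because only the airflow wheel is given as a URL, everything else in the extras still has to be resolved from the package index.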
   
   Even for our builds this is a very unusual step - usually we install 
airflow with constraints generated by the CI build. But this one does not use 
constraints, because this is a `CACHE` build - one that produces a base PROD 
image that we use to build subsequent PROD images. In this case it almost does 
not matter what resolution we arrive at, because that particular step is going 
to be invalidated anyway, since we will build a different airflow package next 
time. It only matters that this step is fast and succeeds, so that all the 
previous layers can be reused to build the next PROD image from a subsequent 
v2-9-test build faster. 
   
   And the error was:
   
   ```
   #64 6.287 error: Failed to download: google-cloud-bigquery==1.28.2
   #64 6.287   Caused by: Couldn't parse metadata of google_cloud_bigquery-1.28.2-py2.py3-none-any.whl from https://files.pythonhosted.org/packages/ce/af/89ccb3dd70a86516cb408dd7b7484d2fdd073bdce6405f722f75e6058e66/google_cloud_bigquery-1.28.2-py2.py3-none-any.whl.metadata
   #64 6.287   Caused by: after parsing 2.0, found "de" after it, which is not part of a valid version
   #64 6.287 pyarrow (<2.0de,>=1.0.0) ; (python_version >= "3.5") and extra == 'all'
   ```
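
The failing piece is the `<2.0de` specifier in that old wheel's metadata: `2.0de` is not a version under PEP 440, so a strict parser rejects the whole metadata file. The sketch below illustrates this with a deliberately simplified regular expression (an assumption for illustration, not the full PEP 440 grammar and not the parser `uv` actually uses); `find_invalid_versions` is a hypothetical helper name.

```python
import re

# Simplified take on PEP 440: release digits plus optional pre/post/dev parts.
# Assumption: local versions, wildcards and some legacy spellings are ignored.
_VERSION = re.compile(
    r"""
    ^(?:\d+!)?                                   # optional epoch
    \d+(?:\.\d+)*                                # release segment, e.g. 2.0
    (?:[._-]?(?:alpha|beta|preview|pre|rc|a|b|c)[._-]?\d*)?  # pre-release
    (?:[._-]?(?:post|rev|r)[._-]?\d*)?           # post-release
    (?:[._-]?dev[._-]?\d*)?                      # dev release
    $
    """,
    re.VERBOSE | re.IGNORECASE,
)
_OPERATOR = re.compile(r"^(===|~=|==|!=|<=|>=|<|>)\s*")


def find_invalid_versions(specifiers: str) -> list[str]:
    """Return the version parts of a comma-separated specifier set that do not parse."""
    invalid = []
    for clause in specifiers.split(","):
        version = _OPERATOR.sub("", clause.strip())
        if not _VERSION.match(version):
            invalid.append(version)
    return invalid


print(find_invalid_versions("<2.0de,>=1.0.0"))
```

For the specifier from the log this reports `2.0de` as the only part that does not parse, which matches the error message above.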
   
   You can see one of the failing builds here: 
https://github.com/apache/airflow/actions/runs/8555464849/job/23446007205#step:10:3677
   
   Corresponding builds for other Python versions resulted in:
   
   ```
     #60 1.344 + uv pip install 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
     #60 9.585 Resolved 339 packages in 8.23s
     #60 16.38 Downloaded 336 packages in 6.78s
     #60 17.14 Installed 336 packages in 761ms
     #60 17.14  + adal==1.2.7
     #60 17.14  + adlfs==2024.2.0
   ...
   ```
   
   
   You can see more failed runs here: 
https://github.com/apache/airflow/actions?query=branch%3Av2-9-test - it WAS 
consistently happening until I switched the builds to use `pip`. For example:
   
   
https://github.com/apache/airflow/actions/runs/8559085301/job/23458003710#step:10:3718
   
   This one takes a bit longer, as expected (145 s), but it works.
   
   ```
    #64 2.532 + pip install --root-user-action ignore 'apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv] @ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl'
     #64 3.259 Processing /docker-context-files/apache_airflow-2.9.0-py3-none-any.whl (from apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
     #64 3.627 Collecting alembic<2.0,>=1.13.1 (from apache-airflow@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl->apache-airflow[aiobotocore,amazon,async,celery,cncf-kubernetes,common-io,docker,elasticsearch,ftp,google,google-auth,graphviz,grpc,hashicorp,http,ldap,microsoft-azure,mysql,odbc,openlineage,pandas,postgres,redis,sendgrid,sftp,slack,snowflake,ssh,statsd,uv,virtualenv]@ file:///docker-context-files/apache_airflow-2.9.0-py3-none-any.whl)
   ...
    #64 145.0 Successfully installed Babel-2.14.0 Flask-Babel-2.0.0 
Flask-JWT-Extended-4.6.0 Flask-Limiter-3.5.1 Flask-SQLAlchemy-2.5.1 Mako-1.3.2 
PyAthena-3.6.0 PyOpenSSL-24.1.0 PyYAML-6.0.1 WTForms-3.1.2 adal-1.2.7 
adlfs-2024.2.0 aiobotocore-2.12.2 aiofiles-23.2.1 aiohttp-3.9.3 
aioitertools-0.11.0 aiosignal-1.3.1 alembic-1.13.1 amqp-5.2.0 
annotated-types-0.6.0 anyio-4.3.0 apache-airflow-2.9.0 
apache-airflow-providers-amazon-8.19.0 apache-airflow-providers-celery-3.6.1 
apache-airflow-providers-cncf-kubernetes-8.0.1 
apache-airflow-providers-common-io-1.3.0 
apache-airflow-providers-common-sql-1.11.1 
apache-airflow-providers-docker-3.9.2 
apache-airflow-providers-elasticsearch-5.3.3 apache-airflow-providers-fab-1.0.2 
apache-airflow-providers-ftp-3.7.0 apache-airflow-providers-google-10.16.0 
apache-airflow-providers-grpc-3.4.1 apache-airflow-providers-hashicorp-3.6.4 
apache-airflow-providers-http-4.10.0 apache-airflow-providers-imap-3.5.0 
apache-airflow-providers-microsoft-azure-9.0.1 
apache-airflow-providers-mysql-5.5.4 apache-airflow-providers-odbc-4.4.1 
apache-airflow-providers-openlineage-1.6.0 
apache-airflow-providers-postgres-5.10.2 apache-airflow-providers-redis-3.6.0 
apache-airflow-providers-sendgrid-3.4.0 apache-airflow-providers-sftp-4.9.0 
apache-airflow-providers-slack-8.6.1 apache-airflow-providers-smtp-1.6.1 
apache-airflow-providers-snowflake-5.3.1 apache-airflow-providers-sqlite-3.7.1 
apache-airflow-providers-ssh-3.10.1 apispec-6.6.0 argcomplete-3.2.3 
asgiref-3.8.1 asn1crypto-1.5.1 asyncssh-2.14.2 attrs-23.2.0 authlib-1.3.0 
azure-batch-14.2.0 azure-common-1.1.28 azure-core-1.30.1 azure-cosmos-4.6.0 
azure-datalake-store-0.0.53 azure-identity-1.15.0 azure-keyvault-secrets-4.8.0 
azure-kusto-data-4.3.1 azure-mgmt-containerinstance-10.1.0 
azure-mgmt-containerregistry-10.3.0 azure-mgmt-core-1.4.0 
azure-mgmt-cosmosdb-9.4.0 azure-mgmt-datafactory-6.1.0 
azure-mgmt-datalake-nspkg-3.0.1 azure-mgmt-datalake-store-0.5.0 
azure-mgmt-nspkg-3.0.2 azure-mgmt-resource-23.0.1 azure-mgmt-storage-21.1.0 
azure-nspkg-3.0.2 azure-servicebus-7.12.1 
azure-storage-blob-12.19.1 azure-storage-file-datalake-12.14.0 
azure-storage-file-share-12.15.0 azure-synapse-artifacts-0.18.0 
azure-synapse-spark-0.7.0 backoff-2.2.1 bcrypt-4.1.2 beautifulsoup4-4.12.3 
billiard-4.2.0 blinker-1.7.0 boto3-1.34.51 botocore-1.34.51 cachelib-0.9.0 
cachetools-5.3.3 cattrs-23.2.3 celery-5.3.6 certifi-2024.2.2 cffi-1.16.0 
chardet-5.2.0 charset-normalizer-3.3.2 click-8.1.7 click-didyoumean-0.3.1 
click-plugins-1.1.1 click-repl-0.3.0 clickclick-20.10.2 colorama-0.4.6 
colorlog-4.8.0 configupdater-3.2 connexion-2.14.2 cron-descriptor-1.4.3 
croniter-2.0.3 cryptography-41.0.7 db-dtypes-1.2.0 decorator-5.1.1 
deprecated-1.2.14 dill-0.3.8 distlib-0.3.8 dnspython-2.6.1 docker-7.0.0 
docstring-parser-0.16 docutils-0.20.1 elastic-transport-8.13.0 
elasticsearch-8.13.0 email-validator-2.1.1 eventlet-0.36.1 filelock-3.13.3 
flask-2.2.5 flask-appbuilder-4.4.1 flask-caching-2.1.0 flask-login-0.6.3 
flask-session-0.5.0 flask-wtf-1.2.1 flower-2.0.1 frozenlist-1.4.1 fsspec-2024.3.1 
gcloud-aio-auth-4.2.3 gcloud-aio-bigquery-7.1.0 gcloud-aio-storage-9.2.0 
gcsfs-2024.3.1 gevent-24.2.1 google-ads-23.1.0 google-analytics-admin-0.22.7 
google-api-core-2.18.0 google-api-python-client-2.125.0 google-auth-2.29.0 
google-auth-httplib2-0.2.0 google-auth-oauthlib-1.2.0 
google-cloud-aiplatform-1.46.0 google-cloud-appengine-logging-1.4.3 
google-cloud-audit-log-0.2.5 google-cloud-automl-2.13.3 
google-cloud-batch-0.17.17 google-cloud-bigquery-3.20.1 
google-cloud-bigquery-datatransfer-3.15.1 google-cloud-bigtable-2.23.0 
google-cloud-build-3.24.0 google-cloud-compute-1.18.0 
google-cloud-container-2.45.0 google-cloud-core-2.4.1 
google-cloud-datacatalog-3.19.0 google-cloud-dataflow-client-0.8.10 
google-cloud-dataform-0.5.9 google-cloud-dataplex-1.13.0 
google-cloud-dataproc-5.9.3 google-cloud-dataproc-metastore-1.15.3 
google-cloud-dlp-3.16.0 google-cloud-kms-2.21.3 google-cloud-language-2.13.3 
google-cloud-logging-3.10.0 google-cloud-memcache-1.9.3 google-cloud-monitoring-2.19.3 
google-cloud-orchestration-airflow-1.12.1 google-cloud-os-login-2.14.3 
google-cloud-pubsub-2.21.1 google-cloud-redis-2.15.3 
google-cloud-resource-manager-1.12.3 google-cloud-run-0.10.5 
google-cloud-secret-manager-2.19.0 google-cloud-spanner-3.44.0 
google-cloud-speech-2.26.0 google-cloud-storage-2.16.0 
google-cloud-storage-transfer-1.11.3 google-cloud-tasks-2.16.3 
google-cloud-texttospeech-2.16.3 google-cloud-translate-3.15.3 
google-cloud-videointelligence-2.13.3 google-cloud-vision-3.7.2 
google-cloud-workflows-1.14.3 google-crc32c-1.5.0 google-re2-1.1 
google-resumable-media-2.7.0 googleapis-common-protos-1.63.0 graphviz-0.20.3 
greenlet-3.0.3 grpc-google-iam-v1-0.13.0 grpc-interceptor-0.15.4 grpcio-1.62.1 
grpcio-gcp-0.2.2 grpcio-status-1.62.1 gunicorn-21.2.0 h11-0.14.0 httpcore-1.0.5 
httplib2-0.22.0 httpx-0.27.0 humanize-4.9.0 hvac-2.1.0 idna-3.6 ijson-3.2.3 
importlib-resources-6.4.0 importlib_metadata-7.0.0 inflection-0.5.1 
isodate-0.6.1 itsdangerous-2.1.2 jinja2-3.1.3 jmespath-1.0.1 
json-merge-patch-0.2 jsonpath_ng-1.6.1 jsonschema-4.21.1 
jsonschema-specifications-2023.12.1 kombu-5.3.6 kubernetes-29.0.0 
kubernetes_asyncio-29.0.0 lazy-object-proxy-1.10.0 ldap3-2.9.1 limits-3.10.1 
linkify-it-py-2.0.3 lockfile-0.12.2 looker-sdk-24.2.1 lxml-5.2.1 
markdown-it-py-3.0.0 markupsafe-2.1.5 marshmallow-3.21.1 
marshmallow-oneofschema-3.1.1 marshmallow-sqlalchemy-0.28.2 
mdit-py-plugins-0.4.0 mdurl-0.1.2 more-itertools-10.2.0 msal-1.28.0 
msal-extensions-1.1.0 msrest-0.7.1 msrestazure-0.6.4 multidict-6.0.5 
mysql-connector-python-8.3.0 mysqlclient-2.2.4 numpy-1.26.4 oauthlib-3.2.2 
openlineage-integration-common-1.11.1 openlineage-python-1.11.1 
openlineage-sql-1.11.1 opentelemetry-api-1.24.0 
opentelemetry-exporter-otlp-1.24.0 
opentelemetry-exporter-otlp-proto-common-1.24.0 
opentelemetry-exporter-otlp-proto-grpc-1.24.0 
opentelemetry-exporter-otlp-proto-http-1.24.0 opentelemetry-proto-1.24.0 
opentelemetry-sdk-1.24.0 opentelemetry-semantic-conventions-0.45b0 ordered-set-4.1.0 
pandas-2.1.4 pandas-gbq-0.22.0 paramiko-3.4.0 pathspec-0.12.1 pendulum-3.0.0 
platformdirs-3.11.0 pluggy-1.4.0 ply-3.11 portalocker-2.8.2 prison-0.2.1 
prometheus-client-0.20.0 prompt-toolkit-3.0.43 proto-plus-1.23.0 
protobuf-4.25.3 psutil-5.9.8 psycopg2-binary-2.9.9 pyarrow-15.0.2 pyasn1-0.5.1 
pyasn1-modules-0.3.0 pycparser-2.22 pydantic-2.6.4 pydantic-core-2.16.3 
pydata-google-auth-1.8.2 pygments-2.17.2 pyjwt-2.8.0 pynacl-1.5.0 pyodbc-5.1.0 
pyparsing-3.1.2 python-daemon-3.0.1 python-dateutil-2.9.0.post0 
python-dotenv-1.0.1 python-http-client-3.3.7 python-ldap-3.4.4 
python-nvd3-0.15.0 python-slugify-8.0.4 pytz-2024.1 redis-4.6.0 
redshift_connector-2.1.0 referencing-0.34.0 requests-2.31.0 
requests-oauthlib-2.0.0 requests_toolbelt-1.0.0 rfc3339-validator-0.1.4 
rich-13.7.1 rich-argparse-1.4.0 rpds-py-0.18.0 rsa-4.9 s3transfer-0.10.1 
scramp-1.4.4 sendgrid-6.11.0 setproctitle-1.3.3 setuptools-66.1.1 shapely-2.0.3 
six-1.16.0 slack_sdk-3.27.1 sniffio-1.3.1 snowflake-connector-python-3.7.1 
snowflake-sqlalchemy-1.5.1 sortedcontainers-2.4.0 soupsieve-2.5 
sqlalchemy-1.4.52 sqlalchemy-bigquery-1.10.0 sqlalchemy-jsonfield-1.0.2 
sqlalchemy-spanner-1.6.2 sqlalchemy-utils-0.41.2 sqlalchemy_redshift-0.8.14 
sqlparse-0.4.4 sshtunnel-0.4.0 starkbank-ecdsa-2.2.0 statsd-4.0.1 
tabulate-0.9.0 tenacity-8.2.3 termcolor-2.4.0 text-unidecode-1.3 
time-machine-2.14.1 tomlkit-0.12.4 tornado-6.4 typing-extensions-4.10.0 
tzdata-2024.1 uc-micro-py-1.0.3 unicodecsv-0.14.1 universal-pathlib-0.2.2 
uritemplate-4.1.1 urllib3-2.0.7 vine-5.1.0 virtualenv-20.25.1 watchtower-3.1.0 
wcwidth-0.2.13 websocket-client-1.7.0 werkzeug-2.2.3 wrapt-1.16.0 yarl-1.9.4 
zipp-3.18.1 zope.event-5.0 zope.interface-6.2
   ```
   
   There is an issue I created about it: 
https://github.com/astral-sh/uv/issues/2821, which the `uv` maintainers seem 
eager to fix quite soon. There is also a somewhat similar issue (at least in 
that it involves very strange backtracking and failing on some old versions of 
transitive packages) created by @notatallshaw, who actively works on testing 
and verifying the `pip` and `uv` resolution algorithms and uses airflow as a 
testing ground: https://github.com/astral-sh/uv/issues/1560 - there the 
heuristics of `uv` give different results than `pip` (which is pretty much 
expected, as in many cases - especially with `apache-airflow[all]` - there are 
multiple matching solutions). 
   
   What I have found so far is that our CI builds (where we use `devel-all` and 
`--resolution highest`) usually give very close results with `uv` and with 
`pip`'s `--eager-upgrade` - so I continue generating constraints with `uv`, as 
it is way faster. 
   
   But installing just `airflow[some deps]` without `--resolution highest` or 
`--eager-upgrade` often gives quite different results for `uv` and `pip` - in 
this case, for example, `uv` installed airflow with a not-the-latest google 
provider - which is why switching PROD runs in release branches to `pip` is 
likely to stay.
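
One way to see how far the two resolutions diverge is to compare the pinned versions each tool reports in its install log. A minimal sketch, assuming the log formats shown above (`+ name==version` lines from `uv`, a single `Successfully installed name-version ...` line from `pip`); the function names and the `example-provider` package with its versions are made up for illustration:

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 does (case and -_. runs)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_uv_log(log: str) -> dict[str, str]:
    """Collect `+ name==version` entries printed by `uv pip install`."""
    return {
        normalize(name): version
        for name, version in re.findall(r"\+ ([A-Za-z0-9._-]+)==(\S+)", log)
    }


def parse_pip_log(log: str) -> dict[str, str]:
    """Collect `name-version` tokens after pip's `Successfully installed`."""
    _, _, tail = log.partition("Successfully installed")
    first_line = tail.split("\n", 1)[0]  # assumes the list is on one line
    pins = {}
    for token in first_line.split():
        name, _, version = token.rpartition("-")
        if name:
            pins[normalize(name)] = version
    return pins


def diff_pins(uv: dict[str, str], pip: dict[str, str]) -> dict[str, tuple]:
    """Map each package whose version differs to (uv_version, pip_version)."""
    return {
        name: (uv.get(name), pip.get(name))
        for name in sorted(uv.keys() | pip.keys())
        if uv.get(name) != pip.get(name)
    }


# Hypothetical excerpts; `example-provider` and its versions are invented.
uv_log = "#60 17.14  + adal==1.2.7\n#60 17.14  + example-provider==1.0.0\n"
pip_log = "#64 145.0 Successfully installed adal-1.2.7 example-provider-2.0.0"

differences = diff_pins(parse_uv_log(uv_log), parse_pip_log(pip_log))
print(differences)
```

A package present in only one of the logs shows up with `None` on the other side, so missing packages are reported as well as differing versions.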
   

