pedro-cf commented on issue #41328:
URL: https://github.com/apache/airflow/issues/41328#issuecomment-2283478774

   > You can't do it. You do not know what versions will be installed before 
you install it, at which point calculating hash is already too late - because 
you already installed the venv.
   > 
   > Technically speaking - if you do not specify `==` in all requirements 
(which you should in this case if you want reproducibility) using last 
installed venv snapshot is fully correct (it still follows the specification 
you gave it).
   > 
   > If you want full reproducibility - just pin all your requirements, that's 
really the only way.
   
   It is possible to run `pip download -r requirements.txt`, which will 
resolve the versions and download them **if** they are missing from 
the download location. We could then parse that to generate the hash?
   
   example:
   
   `requirements.txt`
   ```bash
   pandas
   colormap==1.0.4
   ```
   
   `pip download -r requirements.txt`
   ```bash
   Collecting pandas (from -r requirements.txt (line 1))
     Using cached pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)
   Collecting colormap==1.0.4 (from -r requirements.txt (line 2))
     Using cached colormap-1.0.4.tar.gz (17 kB)
     Preparing metadata (setup.py) ... done
   Collecting numpy>=1.22.4 (from pandas->-r requirements.txt (line 1))
     Using cached numpy-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
   Collecting python-dateutil>=2.8.2 (from pandas->-r requirements.txt (line 1))
     Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
   Collecting pytz>=2020.1 (from pandas->-r requirements.txt (line 1))
     Using cached pytz-2024.1-py2.py3-none-any.whl.metadata (22 kB)
   Collecting tzdata>=2022.7 (from pandas->-r requirements.txt (line 1))
     Using cached tzdata-2024.1-py2.py3-none-any.whl.metadata (1.4 kB)
   Collecting six>=1.5 (from python-dateutil>=2.8.2->pandas->-r requirements.txt (line 1))
     Using cached six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
   ```
   
   `tree .`
   ```bash
   .
   ├── colormap-1.0.4.tar.gz
   ├── numpy-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   ├── pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   ├── python_dateutil-2.9.0.post0-py2.py3-none-any.whl
   ├── pytz-2024.1-py2.py3-none-any.whl
   ├── requirements.txt
   ├── six-1.16.0-py2.py3-none-any.whl
   └── tzdata-2024.1-py2.py3-none-any.whl
   ```
   
   Download when the packages are already downloaded:
   `pip download -r requirements.txt` 
   ```
   Collecting pandas (from -r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   Collecting colormap==1.0.4 (from -r requirements.txt (line 2))
     File was already downloaded /mnt/c/git/tst/tst/colormap-1.0.4.tar.gz
     Preparing metadata (setup.py) ... done
   Collecting numpy>=1.22.4 (from pandas->-r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/numpy-2.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   Collecting python-dateutil>=2.8.2 (from pandas->-r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/python_dateutil-2.9.0.post0-py2.py3-none-any.whl
   Collecting pytz>=2020.1 (from pandas->-r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/pytz-2024.1-py2.py3-none-any.whl
   Collecting tzdata>=2022.7 (from pandas->-r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/tzdata-2024.1-py2.py3-none-any.whl
   Collecting six>=1.5 (from python-dateutil>=2.8.2->pandas->-r requirements.txt (line 1))
     File was already downloaded /mnt/c/git/tst/tst/six-1.16.0-py2.py3-none-any.whl
   Successfully downloaded colormap pandas numpy python-dateutil pytz tzdata six
   ```
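
A rough sketch of how the idea above might look, not a proposal for the actual Airflow implementation: run `pip download` into a destination directory, then hash the sorted file names of the resolved distributions. The function names `download_requirements` and `resolved_hash` and the `DIST_SUFFIXES` filter are hypothetical, chosen only for illustration. It assumes the resolved versions are fully captured by the file names, as in the `tree .` listing.

```python
import hashlib
import subprocess
import sys
from pathlib import Path

# Assumption: only these suffixes count as downloaded distributions.
DIST_SUFFIXES = (".whl", ".tar.gz", ".zip")


def download_requirements(requirements: Path, dest: Path) -> None:
    """Resolve and download everything in `requirements` into `dest`.

    Hypothetical helper: needs network (or a warm pip cache) to run.
    """
    subprocess.run(
        [sys.executable, "-m", "pip", "download",
         "-r", str(requirements), "-d", str(dest)],
        check=True,
    )


def resolved_hash(dest: Path) -> str:
    """Hash the sorted names of the distribution files found in `dest`.

    Files such as requirements.txt are ignored by the suffix filter.
    """
    names = sorted(
        p.name
        for p in dest.iterdir()
        if p.is_file() and p.name.endswith(DIST_SUFFIXES)
    )
    return hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()
```

One caveat with this sketch: `pip download` does not remove older files from the destination, so a reused directory can keep stale versions that would end up in the hash. A fresh directory per run avoids that, at the cost of relying on pip's cache to skip the actual downloads.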
   
   

