shahar1 commented on code in PR #42000:
URL: https://github.com/apache/airflow/pull/42000#discussion_r1744998136


##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually

Review Comment:
   ```suggestion
   Very rarely, the CI image in `canary` builds takes a very long time to build. This is usually
   ```



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.

Review Comment:
   ```suggestion
   to be found compatible with each other. In case new dependencies are released, `pip` might enter
   a long loop trying to figure out if these dependencies can be used. Unfortunately, this long loop could end up in an error due to conflicts.
   ```



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.

Review Comment:
   ```suggestion
   We use a "bisecting" methodology to test candidates for backtrack triggering 
among those released in PyPI since the last successful run
   of ``--upgrade-to-newer-dependencies`` and committed the constraints in the 
`canary` build.
   ```



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it

Review Comment:
   ```suggestion
   Instead of running the command manually, you could also rely on the failing CI builds. We run the
   `find-backtracking-candidates` command in the `canary` build when it times out, so the
   easiest way to find backtracking candidates is to find the first build that failed with timeout. It
   ```



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.

Review Comment:
   ```suggestion
   This command should be run quickly after noticing that the CI build is taking a long time and failing. It relies on the fact that an eager upgrade produced valid constraints at some point in time, so it tries to identify which dependencies have been added since then. By doing so, it limits them to the versions used in the constraints.
   ```



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it
+will likely have the smallest number of backtracking candidates. The command outputs the limitation
+for those backtracking candidates that are guaranteed to work (because they are taken from the latest
+constraints and they already succeeded in the past when the constraints were updated).
+
+Then we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
+to check which of the candidates causes the long builds. Initially you put there the whole list of
+candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,
+the next step is to narrow down the list of candidates to the one that is causing the backtracking.
+
+We narrow-down the list by "bisecting" the list. We remove half of the dependency limits and see if it
+still works or not. It works - we continue. If it does not work, we restore the removed half and remove
+the other half. Rinse and repeat until there is only one dependency left - hopefully
+(sometimes you will need to leave few of them).

Review Comment:
   You made me laugh real hard with "Rinse and repeat", but I don't think that it fits this kind of docs :)
   
   ```suggestion
   still works or not. If it works, we continue. Otherwise, we restore the removed half and remove
   the other half. Repeat until there is only one dependency left - hopefully
   (sometimes you will need to leave a few of them).
   ```
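
   For readers applying this procedure, a minimal sketch of the halving rounds might help. It assumes the candidate list contained the placeholder packages `alpha`, `bravo`, `charlie`, and `delta` (invented names, not real Airflow dependencies), and that the requirements are space-separated inside the quoted argument (also an assumption about the expected format):

   ```bash
   # Round 1: drop half of the limits (alpha and bravo here) and rebuild.
   # If the build stays fast, the trigger is still pinned by the remaining
   # limits, so keep halving; if it backtracks again, restore the dropped
   # half and drop the other half instead.
   breeze ci-image build --upgrade-to-newer-dependencies \
       --eager-upgrade-additional-requirements "charlie<=0.9.3 delta<=4.5.0"

   # Round 2: halve again until a single limit (or a small set) remains.
   breeze ci-image build --upgrade-to-newer-dependencies \
       --eager-upgrade-additional-requirements "charlie<=0.9.3"
   ```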



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.

Review Comment:
   ```suggestion
   This is why we need to help pip to skip newer versions of those dependencies, until the
   condition that caused the backtracking is solved.
   ```
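
   It might also be worth illustrating the mechanism here: the limit ends up as an upper-bound requirement in the `EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS` value used by `Dockerfile.ci`. A minimal sketch, with an invented package name and version (not a real Airflow pin):

   ```bash
   # Hypothetical example of limiting a backtracking trigger during eager
   # upgrade; in the real setup this value is supplied as a build argument
   # consumed by Dockerfile.ci rather than a plain shell variable.
   EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS="somepackage<=1.4.2"
   ```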



##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints --constraints-repo /home/user/airfl
     --commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
     --airflow-constraints-mode constraints
 ```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it
+will likely have the smallest number of backtracking candidates. The command outputs the limitation
+for those backtracking candidates that are guaranteed to work (because they are taken from the latest
+constraints and they already succeeded in the past when the constraints were updated).
+
+Then we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
+to check which of the candidates causes the long builds. Initially you put there the whole list of
+candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,

Review Comment:
   ```suggestion
   Then, we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
   to check which of the candidates causes the long builds. Initially, you put there the whole list of
   candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,
   ```
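
   A hedged example of that first full-list run (the package names and versions below are invented placeholders, and the space-separated format inside the quoted argument is an assumption about the expected input):

   ```bash
   # First pass: include every limit emitted by find-backtracking-candidates.
   # This build should succeed, confirming the limits are a valid baseline
   # before the list is narrowed down by bisecting.
   breeze ci-image build --upgrade-to-newer-dependencies \
       --eager-upgrade-additional-requirements "alpha<=1.2.0 bravo<=2.0.1 charlie<=0.9.3 delta<=4.5.0"
   ```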


