shahar1 commented on code in PR #42000:
URL: https://github.com/apache/airflow/pull/42000#discussion_r1744998136
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
Review Comment:
```suggestion
Very rarely, the CI image in `canary` builds takes a very long time to build. This is usually
```
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
Review Comment:
```suggestion
to be found compatible with each other. In case new dependencies are released, `pip` might enter
a long loop trying to figure out if these dependencies can be used.
Unfortunately, this long loop could end up in an error due to conflicts.
```
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
Review Comment:
```suggestion
We use a "bisecting" methodology to test candidates for backtrack triggering
among those released in PyPI since the last successful run
of ``--upgrade-to-newer-dependencies`` and committed the constraints in the
`canary` build.
```
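For illustration, a minimal sketch of what one such `dependency<=version` limit looks like when passed to `breeze ci-image build` for a test run (the flag combination is the one the reviewed text itself uses later); the package name, version, and space-separated list format are invented assumptions:

```bash
# Hypothetical single limit (invented package/version): cap one suspected
# dependency during an eager-upgrade build to check whether backtracking stops.
breeze ci-image build --upgrade-to-newer-dependencies \
    --eager-upgrade-additional-requirements "examplepkg<=1.2.3"
```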
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it
Review Comment:
```suggestion
Instead of running the command manually, you could also rely on the failing CI builds. We run the
`find-backtracking-candidates` command in the `canary` build when it times out, so the
easiest way to find backtracking candidates is to find the first build that failed with timeout. It
```
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
Review Comment:
```suggestion
This command should be run quickly after noticing that the CI build is taking a long time and failing.
It relies on the fact that an eager upgrade produced valid constraints at some point of time, so it tries
to identify which dependencies have been added since then. By doing so, it limits them to the versions
used in the constraints.
```
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it
+will likely have the smallest number of backtracking candidates. The command outputs the limitation
+for those backtracking candidates that are guaranteed to work (because they are taken from the latest
+constraints and they already succeeded in the past when the constraints were updated).
+
+Then we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
+to check which of the candidates causes the long builds. Initially you put there the whole list of
+candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,
+the next step is to narrow down the list of candidates to the one that is causing the backtracking.
+
+We narrow-down the list by "bisecting" the list. We remove half of the dependency limits and see if it
+still works or not. It works - we continue. If it does not work, we restore the removed half and remove
+the other half. Rinse and repeat until there is only one dependency left - hopefully
+(sometimes you will need to leave few of them).
Review Comment:
You made me laugh real hard with "Rinse and repeat", but I don't think that it fits this kind of docs :)
```suggestion
still works or not. If it works, we continue. Otherwise, we restore the removed half and remove
the other half. Repeat until there is only one dependency left - hopefully
(sometimes you will need to leave a few of them).
```
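To make the halving concrete, here is a rough sketch with a purely invented list of four candidate limits (the real list comes from `find-backtracking-candidates`):

```bash
# Round 0 (sanity check): all candidate limits applied - this should build quickly.
breeze ci-image build --upgrade-to-newer-dependencies \
    --eager-upgrade-additional-requirements "pkga<=1.0 pkgb<=2.0 pkgc<=3.0 pkgd<=4.0"

# Round 1: drop the first half of the limits.
# - If the build is still quick, the trigger is among the limits that are still
#   applied (pkgc/pkgd), so keep halving those.
# - If the build gets slow again, the trigger was in the dropped half; restore
#   it and drop the other half instead.
breeze ci-image build --upgrade-to-newer-dependencies \
    --eager-upgrade-additional-requirements "pkgc<=3.0 pkgd<=4.0"

# Repeat until (ideally) a single limit is left - that dependency is the one to
# keep in EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS until the conflict is resolved.
```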
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
Review Comment:
```suggestion
This is why we need to help pip to skip newer versions of those dependencies, until the
condition that caused the backtracking is solved.
```
##########
dev/MANUALLY_GENERATING_IMAGE_CACHE_AND_CONSTRAINTS.md:
##########
@@ -532,3 +359,196 @@ breeze release-management update-constraints
--constraints-repo /home/user/airfl
--commit-message "Update pymssql constraint to 2.2.8 and Authlib to 1.3.0" \
--airflow-constraints-mode constraints
```
+
+
+# Figuring out backtracking dependencies
+
+## Why we need to figure out backtracking dependencies
+
+Sometimes, very rarely the CI image in `canary` builds take a very long time to build. This is usually
+caused by `pip` trying to figure out the latest set of dependencies (`eager upgrade`) .
+The resolution of dependencies is a very complex problem and sometimes it takes a long time to figure out
+the best set of dependencies. This is especially true when we have a lot of dependencies and they all have
+to be found compatible with each other. In case new dependencies are released, sometimes `pip` enters
+a long loop trying to figure out if the newly released dependency can be used, but due to some other
+dependencies of ours it is impossible, but it will take `pip` a very long time to figure it out.
+
+This is visible in the "build output" as `pip` attempting to continuously backtrack and download many new
+versions of various dependencies, trying to find a good match.
+
+This is why we sometimes we need to help pip to skip newer versions of those dependencies, until the
+condition that caused the backtracking is solved.
+
+We do it by adding `dependency<=version` to the EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS variable in
+`Dockerfile.ci`. The trick is to find the dependency that is causing the backtracking.
+
+Here is how. We use `bisecting` methodology to try out candidates for backtrack triggering among the
+candidates that have been released in PyPI since the last time we successfully run
+``--upgrade-to-newer-dependencies`` and committed the constraints in the `canary` build.
+
+## How to figure out backtracking dependencies
+
+First - we have a breeze command that can help us with that:
+
+```bash
+breeze ci find-backtracking-candidates
+```
+
+This command should be run rather quickly after we notice that the CI build is taking a long time and fail,
+because it is based on the fact that eager upgrade produced valid constraints at some point of time and
+it tries to find out what dependencies have been added since then and limit them to the version that
+was used in the constraints.
+
+You can also - instead of running the command manually rely on the failing CI builds. We run the
+`find-backtracking-candidates` command in the `canary` build when it times out, so the
+easiest way to find backtracking candidates is to find the first build that failed with timeout - it
+will likely have the smallest number of backtracking candidates. The command outputs the limitation
+for those backtracking candidates that are guaranteed to work (because they are taken from the latest
+constraints and they already succeeded in the past when the constraints were updated).
+
+Then we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
+to check which of the candidates causes the long builds. Initially you put there the whole list of
+candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,
Review Comment:
```suggestion
Then, we run ``breeze ci-image build --upgrade-to-newer-dependencies --eager-upgrade-additional-requirements "REQUIREMENTS"``
to check which of the candidates causes the long builds. Initially, you put there the whole list of
candidates that you got from the `find-backtracking-candidates` command. This **should** succeed. Now,
```
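As a small convenience for the narrowing-down steps (not something the reviewed text prescribes), the candidate list can be kept in a shell variable so halves are easy to drop and restore; the limits below are invented placeholders:

```bash
# Invented placeholder limits standing in for the find-backtracking-candidates output.
CANDIDATES="pkga<=1.0 pkgb<=2.0 pkgc<=3.0 pkgd<=4.0"

# Initial run with the full list - expected to succeed before bisecting starts.
breeze ci-image build --upgrade-to-newer-dependencies \
    --eager-upgrade-additional-requirements "${CANDIDATES}"
```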
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]