potiuk commented on PR #34809:
URL: https://github.com/apache/airflow/pull/34809#issuecomment-1774156417

   Here is my guess at what could be happening (and these are only guesses 
from looking at the output).
   
   I think the problem is that you have changes in both "airflow" and "celery" 
and other packages that refer to each other in both directions. 
   
   The docs build works by building each package separately, using the 
"inventory" of the other packages it refers to (downloaded from S3) to verify 
that the referenced documents exist. Once a package successfully builds its 
docs, the downloaded inventory gets updated locally.
   
   For the common case where there are two packages and only one of them refers 
to the other's documents, the docs build has multi-pass implemented. If a 
package fails to build, it is added to the queue and retried (up to three 
times, to cover dependencies such as A -> B -> C). So if A links to B and B 
links to C, in the first pass C succeeds, in the second B succeeds because it 
uses the locally built inventory from C, and in the third A finally succeeds 
using the locally built inventory from B.
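
   The multi-pass behaviour described above can be sketched as a small 
simulation. This is only an illustrative model, not the code in build_docs.py: 
the function name `build_in_passes`, the `links` mapping and the rule that 
inventories become visible only at the end of a pass are all assumptions made 
for the example.

```python
from typing import Dict, List, Set, Tuple


def build_in_passes(
    links: Dict[str, Set[str]], max_passes: int = 3
) -> Tuple[List[List[str]], List[str]]:
    """Simulate multi-pass docs building (illustrative model only).

    `links` maps each package to the packages whose new documents it links to.
    Returns the packages built in each pass and the packages still failing.
    """
    # Packages whose freshly built inventory is available locally.
    local_inventory: Set[str] = set()
    pending = list(links)
    history: List[List[str]] = []
    for _ in range(max_passes):
        if not pending:
            break
        # A package builds only if everything it links to is already available.
        built = [pkg for pkg in pending if links[pkg] <= local_inventory]
        # Assumption: inventories become visible to others only after the pass.
        local_inventory.update(built)
        pending = [pkg for pkg in pending if pkg not in built]
        history.append(built)
    return history, pending


history, failed = build_in_passes({"A": {"B"}, "B": {"C"}, "C": set()})
print(history, failed)
```

   With the A -> B -> C chain, C is built in the first pass, B in the second 
and A in the third, and nothing is left failing.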
   
   The problem is that if those packages are built together and they refer to 
each other's new documents, none of them can succeed, because the documents 
they refer to do not yet exist in the remote inventory.
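
   To make the hypothesis concrete, here is a small sketch that finds pairs of 
packages linking to each other's new documents. The function name 
`mutually_blocked` and the `new_links` mapping are hypothetical and only 
illustrate the guess above; nothing like this exists in the build scripts as 
far as this comment goes.

```python
from typing import Dict, List, Set, Tuple


def mutually_blocked(new_links: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return sorted pairs of packages that link to each other's new documents.

    `new_links` maps a package to the packages whose not-yet-published
    documents it refers to (hypothetical input for this illustration).
    """
    pairs = set()
    for pkg, targets in new_links.items():
        for target in targets:
            # A pair is blocked when the link exists in both directions.
            if target != pkg and pkg in new_links.get(target, set()):
                pairs.add(tuple(sorted((pkg, target))))
    return sorted(pairs)


print(mutually_blocked({
    "airflow": {"celery"},
    "celery": {"airflow"},
    "dask": {"airflow"},
}))
```

   In this made-up input only airflow and celery are reported, because dask 
links to airflow but airflow does not link back to dask.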
   
   Looking at the output - it almost worked:
   
   
![image](https://github.com/apache/airflow/assets/595491/e8287de5-4894-4053-9d81-c019f738fc90)
   
   You can see that in the first pass airflow + cncf.kubernetes + celery + dask 
failed docs building.
   
   In the second pass the three others succeeded and only airflow was left.
   
   Unfortunately the third pass on airflow failed.
   
   The error you see:
   
   
![image](https://github.com/apache/airflow/assets/595491/545b3b40-5c28-4330-ab13-0db0ba7f1944)
   
   is that the API documents could not be found in Airflow - most likely 
because the "clean" step deleted them and for some reason they were not 
recreated by the API plugin. So maybe we can try removing the clean step 
between the retries.
   
   Suggestion for how to fix it (if my guess is right):
   
   Only clean the docs when you start and not when you retry. Might be a good 
exercise to learn how the build process works.
   
   As I understand it (again from looking at the code - I have modified it 
quite a few times, but mostly when I saw similar issues):
   
   In build_docs.py:
   
   `build_docs_for_packages` is executed several times to make the multi-pass 
work.
   
   * The first time, it should be called with the parameter to run `cleanup`
   * All the other times (from `retry_building_docs_if_needed`), it should be 
called without running cleanup
   
   That could probably solve the problem.
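
   A minimal sketch of the idea, assuming nothing about the real signatures in 
build_docs.py: `build_with_retries`, `build_package` and `clean` below are 
hypothetical stand-ins, used only to show cleanup running in the first pass and 
being skipped on retries.

```python
from typing import Callable, Iterable, List


def build_with_retries(
    packages: Iterable[str],
    build_package: Callable[[str], bool],
    clean: Callable[[str], None],
    max_passes: int = 3,
) -> List[str]:
    """Build packages in several passes, cleaning only before the first pass.

    `build_package` returns True on success. Returns packages still failing.
    All names here are hypothetical, not the real build_docs.py functions.
    """
    pending = list(packages)
    for pass_number in range(1, max_passes + 1):
        if not pending:
            break
        if pass_number == 1:
            # Clean once at the start so retries keep generated files.
            for pkg in pending:
                clean(pkg)
        # Keep only the packages that failed; they are retried in the next pass.
        pending = [pkg for pkg in pending if not build_package(pkg)]
    return pending


cleaned: List[str] = []
attempts = {"airflow": 0, "celery": 0}


def fake_build(pkg: str) -> bool:
    # Stand-in build: airflow fails on its first attempt, then succeeds.
    attempts[pkg] += 1
    return not (pkg == "airflow" and attempts[pkg] == 1)


remaining = build_with_retries(["airflow", "celery"], fake_build, cleaned.append)
print(remaining, cleaned, attempts)
```

   In this toy run each package is cleaned exactly once, even though airflow is 
built twice.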
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
