potiuk commented on PR #43408:
URL: https://github.com/apache/airflow/pull/43408#issuecomment-2439998942

   Hello @koustreak - could you please explain how you are going to provide 
synchronization between multiple repos?
   
   One of the functionalities of git-sync is that it performs an atomic swap 
when new commits are synced - it keeps the old checkout while it checks out the 
new one, and then atomically swaps the new folder in for the old one via a 
symbolic link.
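   
   To illustrate the idea, here is a minimal Python sketch of a symlink-based 
atomic swap. It is not git-sync's actual code: `atomic_swap`, the `rev-old` / 
`rev-new` directory names and the `current` link are all made up for this 
example, and it assumes a POSIX filesystem where renaming over an existing path 
is atomic.

```python
import os
import tempfile
from pathlib import Path


def atomic_swap(link: Path, new_target: Path) -> None:
    """Repoint `link` at `new_target` so readers never see a missing link."""
    # Build the new symlink under a temporary name in the same directory,
    # so the final rename stays on one filesystem.
    tmp_link = link.with_name(link.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()  # clean up a leftover from an interrupted run
    os.symlink(new_target, tmp_link)
    # Renaming over the existing link replaces it in a single step (POSIX).
    os.replace(tmp_link, link)


# Hypothetical layout: two checkouts of the same repo at different commits.
root = Path(tempfile.mkdtemp())
old_checkout = root / "rev-old"
new_checkout = root / "rev-new"
old_checkout.mkdir()
new_checkout.mkdir()
(old_checkout / "dag.py").write_text("v1")
(new_checkout / "dag.py").write_text("v2")

current = root / "current"  # the path that DAG parsing would read from

atomic_swap(current, old_checkout)
before = (current / "dag.py").read_text()

atomic_swap(current, new_checkout)
after = (current / "dag.py").read_text()

print(before, after)
```

   Anything reading through `current` sees either the whole old checkout or 
the whole new one, never a mix of the two.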
   
   This is an important feature of git-sync, because it guarantees that DAG 
parsing across multiple folders inside the repo is consistent - it always uses 
the same single, consistent version of the DAG files and any related imports.
   
   This is one of the reasons why both the git-sync maintainers and we are 
reluctant to manage and sync multiple repositories via "external means" - you 
can already do it via "submodules", and then the consistency and atomic swap 
are maintained, while you get a good way of keeping references to all the 
updated repos and even reverting or rolling back individual repos to previous 
versions in the single "umbrella repo" using submodules.
   
   With the approach you propose, not only do you have multiple init containers 
and running containers with git-sync, but they also run on a somewhat 
unpredictable schedule - and it's very easy to imagine that the checked-out 
repositories a and b will end up in an inconsistent state (say repository a 
still at the previous commit, and repository b at the new commit).
   
   Imagine a "common" repository and a "dag_team_a" repository, with 
`dag_team_a` importing `common.util.function`, and `common.util.function` 
gaining a new mandatory parameter. When you synchronize each repository 
independently, `common.util.function` might already be updated while the 
`dag_team_a` code importing it still calls the old signature without the 
mandatory parameter. This will fail parsing of the DAGs in `dag_team_a`.
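   
   The failure mode can be simulated in a few lines of Python. This is only a 
sketch: the two `common_function_*` versions, the two `dag_team_a_*` callers, 
the `owner` parameter and the `parse` helper are hypothetical stand-ins for the 
two repositories at their old and new commits, not real Airflow code.

```python
# Two versions of the shared helper in the hypothetical "common" repository.
def common_function_old(name):
    return f"task-{name}"


def common_function_new(name, owner):  # `owner` is the new mandatory parameter
    return f"task-{name}-{owner}"


# Two versions of the DAG code in "dag_team_a", each written for one signature.
def dag_team_a_old(function):
    return function("etl")


def dag_team_a_new(function):
    return function("etl", owner="team_a")


def parse(common_function, dag_code):
    """Mimic DAG parsing: a signature mismatch surfaces as a TypeError."""
    try:
        dag_code(common_function)
        return "parsed"
    except TypeError:
        return "broken"


# Consistent snapshots: both repositories at matching commits.
both_old = parse(common_function_old, dag_team_a_old)
both_new = parse(common_function_new, dag_team_a_new)

# Independent syncs: one repository updated before the other.
common_ahead = parse(common_function_new, dag_team_a_old)
dags_ahead = parse(common_function_old, dag_team_a_new)

print(both_old, both_new, common_ahead, dags_ahead)
# -> parsed parsed broken broken
```

   Both mixed states break, regardless of which repository happens to be 
synced first.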
   
   This is perfectly manageable with a single repository with submodules, 
where you update to the right versions of your common repos and commit those 
changes together.
   
   See https://www.youtube.com/watch?v=uA-8Lj1RNgw - the talk by Anum from 
Jagex explaining how they are doing it.
   
   How are you going to deal with this scenario? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
