The GitHub Actions job "Tests" on airflow.git/fix/git-bundle-clone-per-task has 
failed.
Run started by GitHub user Arunodoy18 (triggered by Arunodoy18).

Head commit for run:
9d1f4fe58b2030db80009641e5ae246fccf519b7 / Arunodoy18 <[email protected]>
Fix: Preserve .git in versioned GitDagBundles to enable safe reuse

Root Cause:
Versioned bundles (using versions/{SHA}/ directories) are designed to be
shared across concurrent tasks to avoid redundant cloning. However, when
prune_dotgit_folder=True (the default), the .git folder was removed after
initialization, breaking subsequent reuse attempts:

1. Task A initializes version ABC123, prunes .git
2. Task B tries to reuse ABC123, validation fails (not a git repo)
3. Task B deletes and re-clones entire version directory
4. Race conditions and performance degradation ensue
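
The sequence above can be sketched with a small, self-contained
simulation. This is only an illustration under stated assumptions, not
the bundle code itself: init_version_dir, looks_like_repo, and
reuse_or_reclone are hypothetical helpers, and the ".git directory
exists" check is a simplified stand-in for the Repo(path) validation
mentioned in the fix.

```python
import shutil
import tempfile
from pathlib import Path


def init_version_dir(version_dir: Path, prune_dotgit: bool) -> None:
    """Hypothetical stand-in for cloning a version into versions/{SHA}/."""
    (version_dir / ".git").mkdir(parents=True)
    (version_dir / "dag.py").write_text("# dag file\n")
    if prune_dotgit:
        # Mirrors the old behaviour: .git is removed after initialization.
        shutil.rmtree(version_dir / ".git")


def looks_like_repo(version_dir: Path) -> bool:
    """Simplified stand-in (assumption) for validating with Repo(path)."""
    return (version_dir / ".git").is_dir()


def reuse_or_reclone(version_dir: Path, prune_dotgit: bool) -> str:
    """Reuse the directory if it validates, otherwise delete and re-create it."""
    if version_dir.exists() and looks_like_repo(version_dir):
        return "reused"
    shutil.rmtree(version_dir, ignore_errors=True)
    init_version_dir(version_dir, prune_dotgit)
    return "recloned"


def simulate(prune_dotgit: bool) -> list[str]:
    """Run Task A, then Task B, against the same version directory."""
    with tempfile.TemporaryDirectory() as root:
        version_dir = Path(root) / "versions" / "ABC123"
        task_a = reuse_or_reclone(version_dir, prune_dotgit)
        task_b = reuse_or_reclone(version_dir, prune_dotgit)
        return [task_a, task_b]


if __name__ == "__main__":
    print(simulate(prune_dotgit=True))   # Task B cannot reuse the directory
    print(simulate(prune_dotgit=False))  # Task B reuses the directory
```

With pruning enabled, the second task in this sketch finds no .git
directory and falls back to a full re-clone; with pruning disabled it
reuses the directory.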

Evidence: ALL existing tests for versioned bundles explicitly set
prune_dotgit_folder=False, indicating maintainers understood this
constraint but didn't enforce it in the code.

Fix:
Only prune non-versioned (tracking) bundles. Versioned bundles must keep
.git to enable:
- Safe validation via Repo(path) before reuse
- Shared version directories across concurrent tasks
- Elimination of unnecessary re-clones

The --local clone strategy already uses hardlinks to the bare repo, so
disk usage impact is minimal despite keeping .git intact.

Changes:
- Modified prune condition: `if prune_dotgit_folder and not self.version`
- Added debug logging to explain prune decisions
- Added test verifying versioned bundles keep .git with default settings
- Updated docstring to document design rationale
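
As a rough illustration of the modified prune condition, the decision
can be expressed as a small predicate. The function name
should_prune_dotgit is hypothetical and the sketch assumes the version
is either None or a non-empty string; the actual change lives inside
the bundle class and is not reproduced here.

```python
from typing import Optional


def should_prune_dotgit(prune_dotgit_folder: bool, version: Optional[str]) -> bool:
    """Prune .git only for tracking (non-versioned) bundles.

    Hypothetical helper mirroring the condition
    `prune_dotgit_folder and not self.version` from the commit message.
    """
    return bool(prune_dotgit_folder and not version)


if __name__ == "__main__":
    # Tracking bundle with the default setting: .git is pruned as before.
    print(should_prune_dotgit(True, None))
    # Versioned bundle with the default setting: .git is kept for reuse.
    print(should_prune_dotgit(True, "ABC123"))
```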

Design Alignment:
This fix aligns with Airflow's KISS principle: a minimally invasive
change that addresses the root cause rather than working around
symptoms. The solution enforces what the existing test patterns already
demonstrated was correct.

Backward Compatibility:
Fully compatible - tracking repos (no version) still get pruned as before.
Versioned repos now work correctly with the default prune_dotgit_folder=True
setting that users expect.

Report URL: https://github.com/apache/airflow/actions/runs/21726933781

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
