Copilot commented on code in PR #67047:
URL: https://github.com/apache/airflow/pull/67047#discussion_r3255110462


##########
providers/git/tests/unit/git/bundles/test_git.py:
##########
@@ -839,6 +840,39 @@ def test_subdir(self, mock_githook, git_repo):
         assert str(bundle.path).endswith(subdir)
         assert {"some_new_file.py"} == files_in_repo
 
+    @mock.patch("airflow.providers.git.bundles.git.GitHook")
+    def test_sparse_checkout(self, mock_githook, git_repo):
+        repo_path, repo = git_repo
+        mock_githook.return_value.repo_url = repo_path
+
+        subdir = "some/subdir"
+        subdir_path = repo_path / subdir
+        subdir_path.mkdir(parents=True)
+        file_path = subdir_path / "some_relevant_file.py"
+        with open(file_path, "w") as f:
+            f.write("hello world")
+        otherdir = "other/dir"
+        otherdir_path = repo_path / otherdir
+        otherdir_path.mkdir(parents=True)
+        otherfile_path = otherdir_path / "some_other_file.py"
+        with open(otherfile_path, "w") as f:
+            f.write("hello world")
+
+        repo.index.add([file_path, otherfile_path])
+        repo.index.commit("Other commit")
+
+        bundle = GitDagBundle(
+            name="test-sparse",
+            git_conn_id=CONN_HTTPS,
+            tracking_ref=GIT_DEFAULT_BRANCH,
+            sparse_dirs=[subdir],
+        )
+        bundle.initialize()
+
+        files_in_repo = {f.name for f in bundle.path.glob("**/*.py") if f.is_file()}
+        assert "some_other_file.py" not in files_in_repo
+        assert "some_relevant_file.py" in files_in_repo
+

Review Comment:
   The new test only covers the fresh, unversioned initial clone path. The 
bundle has additional execution paths affected by sparse checkout that are not 
exercised: (1) `self.version` is set, in which case `_initialize` creates a 
separate worktree under `versions_dir/<version>` and does 
`head.reset(index=True, working_tree=True)`, and (2) `prune_dotgit_folder=True` 
(the default), which deletes `.git` after checkout. Consider adding at least 
one test that combines `sparse_dirs` with a pinned `version` and/or with 
`prune_dotgit_folder=True` to guard against regressions in those paths.
   



##########
providers/git/docs/bundles/index.rst:
##########
@@ -34,9 +34,10 @@ Example of using the GitDagBundle:
          "kwargs": {
              "subdir": "dags",
              "tracking_ref": "main",
-             "refresh_interval": 3600
+             "refresh_interval": 3600,
              "submodules": False,

Review Comment:
   The example is a JSON value assigned to a shell environment variable 
(`AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST='[ ... ]'`), so booleans must 
be lowercase JSON literals. This change correctly switches `True` → `true` for 
`prune_dotgit_folder`, but the adjacent `"submodules": False` is still a 
Python-style capitalized boolean, so the snippet remains invalid JSON. Please 
also lowercase `False` → `false` on line 38 so the documented example actually 
parses.
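
   A quick way to confirm the point above is to run the snippet through a 
strict JSON parser. The sketch below is illustrative only: the `name` and 
`classpath` values and the trimmed-down set of kwargs are assumptions made for 
the example, not a copy of the documented configuration, and `parses_as_json` 
is a hypothetical helper.

```python
import json

# Illustrative config only: "name" and "classpath" are assumed values,
# and the kwargs are a trimmed-down version of the documented example.
invalid_config = """[
  {
    "name": "my-git-repo",
    "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
    "kwargs": {
      "subdir": "dags",
      "tracking_ref": "main",
      "refresh_interval": 3600,
      "submodules": False,
      "prune_dotgit_folder": true
    }
  }
]"""

# Lowercasing the Python-style boolean is the only change needed.
valid_config = invalid_config.replace("False", "false")


def parses_as_json(text: str) -> bool:
    """Return True if `text` is valid JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print("with False:", parses_as_json(invalid_config))
    print("with false:", parses_as_json(valid_config))
```

   Python's `json` module rejects the capitalized `False`, so only the 
lowercased variant loads, and `submodules` then comes back as a real boolean.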
   


