Hi Paul,

> Subject: RE: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> SRC_URI entries
> 
> Hi Paul,
> 
> > Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from
> > multiple SRC_URI entries
> >
> > On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > > When a recipe uses multiple git SRC_URI entries with different
> > > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > > the kernel, modules, and application), each source is unpacked into
> > > a subdirectory of EXTERNALSRC that retains its own .git directory.
> > >
> > > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > > which fails with exit code 128 when git encounters these
> > > unregistered nested git repositories, halting the bitbake parse phase.
> >
> > Is this true? The documentation for `git add` [1] talks about issuing
> > a warning when this occurs, not an error, and in some quick local
> > testing I get a successful exit (exit code 0) when I try this.
> >
> > > Fix by scanning for nested git repos before the add. If any are
> > > found, exclude them from the top-level 'git add' using pathspec
> > > magic ':(exclude)<path>' and hash each nested repo independently
> > > using a temporary index. This ensures changes in any nested repo
> > > still trigger do_compile/do_configure to re-run.
> > >
> > > Signed-off-by: Jamin Lin <[email protected]>
> > > ---
> > >  meta/classes/externalsrc.bbclass | 37 +++++++++++++++++++++++++++++++-
> > >  1 file changed, 36 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
> > > index 902ff2604f..0dd57af668 100644
> > > --- a/meta/classes/externalsrc.bbclass
> > > +++ b/meta/classes/externalsrc.bbclass
> > > @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> > >              # Update our custom index
> > >              env = os.environ.copy()
> > >              env['GIT_INDEX_FILE'] = tmp_index.name
> > > -            subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> > > +            # Find nested git repos created by multiple SRC_URI git entries with
> > > +            # different destsuffix values. git add -A . exits 128 when it encounters
> > > +            # these unregistered nested repos.
> > > +            nested_git_dirs = []
> > > +            for root, dirs, files in os.walk(s_dir):
> > > +                if root == s_dir:
> > > +                    continue
> > > +                if '.git' in dirs or '.git' in files:
> > > +                    nested_git_dirs.append(root)
> > > +                    dirs[:] = []  # don't recurse into nested repos
> >
> > This os.walk() loop is expensive, is there an alternative way to handle this?
> >
> os.scandir() was considered but rejected: destsuffix allows arbitrarily deep
> paths (e.g. destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m), so a
> depth-1 scan would silently miss nested repos and stop hashing their content —
> source changes there would no longer trigger do_compile to re-run.
> 
> The os.walk() loop uses dirs[:] = [] to stop recursing as soon as a .git
> entry is found, so we never descend into the nested repos themselves (which
> may contain tens of thousands of files). The walk only traverses the shallow
> skeleton of intermediate directories between EXTERNALSRC and each nested repo.
> 
> Please see the use case here:
> https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc
> 
> ${SRC_URI_ZEPHYR_OPEN_AMP};name=open-amp;nobranch=1;destsuffix=${P}/modules/lib/open-amp \
> ${SRC_URI_ZEPHYR_TRUSTED_FIRMWARE_M};name=trusted-firmware-m;nobranch=1;destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m \
> 
> Thanks,
> Jamin
> 
> > The code has also become difficult to parse. My rule of thumb is that
> > if a group of lines needs a leading comment, it also needs an empty
> > line before the comment to visually separate things.
> >
> > > +            if nested_git_dirs:
> > > +                excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
> > > +                subprocess.check_output(['git', 'add', '-A', '.'] + excludes, cwd=s_dir, env=env)
> > > +            else:
> > > +                subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> >
> > To simplify the code, construct a cmd variable and call
> > subprocess.check_output(cmd, ...) once.
> >
> > >              git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
> > > +            # Hash each nested git repo separately so source changes there still
> > > +            # trigger do_compile/do_configure to re-run.
> > > +            for nested in nested_git_dirs:
> > > +                nested_git = os.path.join(nested, '.git')
> > > +                if not os.path.isdir(nested_git):
> > > +                    continue
> > > +                with tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as nested_tmp:
> > > +                    nested_index = os.path.join(nested_git, 'index')
> > > +                    if os.path.exists(nested_index):
> > > +                        shutil.copyfile(nested_index, nested_tmp.name)
> > > +                    nested_env = os.environ.copy()
> > > +                    nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> > > +                    proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=nested,
> > > +                                           env=nested_env, stdout=subprocess.DEVNULL,
> > > +                                           stderr=subprocess.DEVNULL)
> > > +                    proc.communicate()
> > > +                    proc = subprocess.Popen(['git', 'write-tree'], cwd=nested,
> > > +                                           env=nested_env, stdout=subprocess.PIPE,
> > > +                                           stderr=subprocess.DEVNULL)
> > > +                    stdout, _ = proc.communicate()
> > > +                    git_sha1 += stdout.decode("utf-8")
> >
> > We should re-use the code from the following block which handles
> > submodules instead of re-implementing the behaviour. Perhaps the
> > common code needs to be refactored out.
> >
> > Best regards,
> >
> > --
> > Paul Barker

Thanks for the review and suggestions.

I will send v2 with the changes below.

```
diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
index 902ff2604f..f2e0812eea 100644
--- a/meta/classes/externalsrc.bbclass
+++ b/meta/classes/externalsrc.bbclass
@@ -234,18 +234,47 @@ def srctree_hash_files(d, srcdir=None):
             # Update our custom index
             env = os.environ.copy()
             env['GIT_INDEX_FILE'] = tmp_index.name
-            subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
+
+            # Find nested git repos created by multiple SRC_URI git entries with
+            # different destsuffix values. When GIT_INDEX_FILE is set, git add -A .
+            # exits 128 instead of warning when it encounters these unregistered
+            # nested repos, halting the bitbake parse phase.
+            nested_git_dirs = []
+            for root, dirs, files in os.walk(s_dir):
+                if root == s_dir:
+                    continue
+                if '.git' in dirs or '.git' in files:
+                    nested_git_dirs.append(root)
+                    dirs[:] = []
+
+            cmd = ['git', 'add', '-A', '.']
+            if nested_git_dirs:
+                cmd += [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
+            subprocess.check_output(cmd, cwd=s_dir, env=env)
+
             git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
-            if os.path.exists(os.path.join(s_dir, ".gitmodules")) and os.path.getsize(os.path.join(s_dir, ".gitmodules")) > 0:
+
+            # Hash nested git repos and submodules together so changes in any of
+            # them still trigger do_compile/do_configure to re-run.
+            subdirs_to_hash = list(nested_git_dirs)
+            if os.path.exists(os.path.join(s_dir, ".gitmodules")) and \
+                    os.path.getsize(os.path.join(s_dir, ".gitmodules")) > 0:
                 submodule_helper = subprocess.check_output(["git", "config", "--file", ".gitmodules", "--get-regexp", "path"], cwd=s_dir, env=env).decode("utf-8")
                 for line in submodule_helper.splitlines():
                     module_dir = os.path.join(s_dir, line.rsplit(maxsplit=1)[1])
                     if os.path.isdir(module_dir):
-                        proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=module_dir, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-                        proc.communicate()
-                        proc = subprocess.Popen(['git', 'write-tree'], cwd=module_dir, env=env, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
-                        stdout, _ = proc.communicate()
-                        git_sha1 += stdout.decode("utf-8")
+                        subdirs_to_hash.append(module_dir)
+
+            for subdir in subdirs_to_hash:
+                proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=subdir,
+                                        env=env, stdout=subprocess.DEVNULL,
+                                        stderr=subprocess.DEVNULL)
+                proc.communicate()
+                proc = subprocess.Popen(['git', 'write-tree'], cwd=subdir,
+                                        env=env, stdout=subprocess.PIPE,
+                                        stderr=subprocess.DEVNULL)
+                stdout, _ = proc.communicate()
+                git_sha1 += stdout.decode("utf-8")
             sha1 = hashlib.sha1(git_sha1.encode("utf-8")).hexdigest()
         with open(oe_hash_file, 'w') as fobj:
             fobj.write(sha1)
```
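
For reference, the scan and the single-command construction from the diff
above can be tried in isolation with something like the sketch below. It is
only an illustration, not the patch itself: the helper names
find_nested_git_dirs(), build_add_cmd() and demo() are hypothetical, the
directory layout is made up to mimic the destsuffix case, and skipping the
top-level .git directory during the walk is an extra assumption that is not
in the diff. Note that directories without a .git entry are still visited;
only the nested repos themselves are pruned.

```python
import os
import tempfile


def find_nested_git_dirs(top):
    """Return directories below `top` that contain a .git entry (dir or file)."""
    nested = []
    for root, dirs, files in os.walk(top):
        if root == top:
            # Assumption (not in the patch): do not walk the top-level
            # repository's own .git directory.
            dirs[:] = [d for d in dirs if d != '.git']
            continue
        if '.git' in dirs or '.git' in files:
            nested.append(root)
            dirs[:] = []  # prune: never descend into the nested repo
    return nested


def build_add_cmd(top, nested):
    """Build one 'git add' argument list that excludes the nested repos."""
    cmd = ['git', 'add', '-A', '.']
    # Sorted only to make the illustration deterministic.
    cmd += [':(exclude)' + os.path.relpath(n, top) for n in sorted(nested)]
    return cmd


def demo():
    """Create a throwaway tree with two nested repos and return the command."""
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, '.git'))
        # Nested repo marked by a .git directory, with content below it.
        amp = os.path.join(top, 'modules', 'lib', 'open-amp')
        os.makedirs(os.path.join(amp, '.git'))
        os.makedirs(os.path.join(amp, 'src', 'deep'))
        # Nested repo marked by a .git file.
        tfm = os.path.join(top, 'modules', 'tee', 'tf-m')
        os.makedirs(tfm)
        with open(os.path.join(tfm, '.git'), 'w') as fobj:
            fobj.write('gitdir: elsewhere\n')
        # Ordinary directory that must not be excluded.
        os.makedirs(os.path.join(top, 'docs'))
        return build_add_cmd(top, find_nested_git_dirs(top))


if __name__ == '__main__':
    print(demo())
```

The command is only built here, not executed, so the sketch does not need
git to be installed.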

Thanks,
Jamin
