Hi Paul,

> Subject: Re: [PATCH v1 1/2] externalsrc: Handle nested git repos from multiple
> SRC_URI entries
> 
> On Fri, 2026-05-15 at 09:36 +0000, Jamin Lin wrote:
> > When a recipe uses multiple git SRC_URI entries with different
> > destsuffix values (e.g. Zephyr-based recipes with separate repos for
> > the kernel, modules, and application), each source is unpacked into a
> > subdirectory of EXTERNALSRC that retains its own .git directory.
> >
> > srctree_hash_files() calls 'git add -A .' at the EXTERNALSRC root,
> > which fails with exit code 128 when git encounters these unregistered
> > nested git repositories, halting the bitbake parse phase.
> 
> Is this true? The documentation for `git add` [1] talks about issuing a
> warning when this occurs, not an error, and in some quick local testing I
> get a successful exit (exit code 0) when I try this.
> 
> > Fix by scanning for nested git repos before the add. If any are found,
> > exclude them from the top-level 'git add' using pathspec magic
> > ':(exclude)<path>' and hash each nested repo independently using a
> > temporary index. This ensures changes in any nested repo still trigger
> > do_compile/do_configure to re-run.
> >
> > Signed-off-by: Jamin Lin <[email protected]>
> > ---
> >  meta/classes/externalsrc.bbclass | 37 +++++++++++++++++++++++++++++++-
> >  1 file changed, 36 insertions(+), 1 deletion(-)
> >
> > diff --git a/meta/classes/externalsrc.bbclass b/meta/classes/externalsrc.bbclass
> > index 902ff2604f..0dd57af668 100644
> > --- a/meta/classes/externalsrc.bbclass
> > +++ b/meta/classes/externalsrc.bbclass
> > @@ -234,8 +234,43 @@ def srctree_hash_files(d, srcdir=None):
> >              # Update our custom index
> >              env = os.environ.copy()
> >              env['GIT_INDEX_FILE'] = tmp_index.name
> > -            subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> > +            # Find nested git repos created by multiple SRC_URI git entries with
> > +            # different destsuffix values. git add -A . exits 128 when it encounters
> > +            # these unregistered nested repos.
> > +            nested_git_dirs = []
> > +            for root, dirs, files in os.walk(s_dir):
> > +                if root == s_dir:
> > +                    continue
> > +                if '.git' in dirs or '.git' in files:
> > +                    nested_git_dirs.append(root)
> > +                    dirs[:] = []  # don't recurse into nested repos
> 
> This os.walk() loop is expensive, is there an alternative way to handle this?
> 
os.scandir() was considered but rejected: destsuffix allows arbitrarily
deep paths (e.g. destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m), so
a depth-1 scan would silently miss nested repos and stop hashing their
content. Source changes there would then no longer trigger do_compile to
re-run.

The os.walk() loop uses dirs[:] = [] to stop recursing as soon as a
.git entry is found, so we never descend into the nested repos
themselves (which may contain tens of thousands of files). The walk only
traverses the shallow skeleton of intermediate directories between
EXTERNALSRC and each nested repo.
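
The pruning can be sketched in isolation as below. This is a minimal
illustration, not the patch itself: find_nested_git_dirs is a hypothetical
helper name, and unlike the patch it also drops the top-level .git from
dirs, on the assumption that EXTERNALSRC is itself a git checkout whose
.git directory would otherwise be walked.

```python
import os


def find_nested_git_dirs(s_dir):
    """Return directories below s_dir that contain a .git entry (hypothetical helper)."""
    nested = []
    for root, dirs, files in os.walk(s_dir):
        if root == s_dir:
            # Assumption, not in the patch: skip the top-level .git so the
            # walk does not descend into its object store.
            if '.git' in dirs:
                dirs.remove('.git')
            continue
        # .git is a directory in a normal clone and a file in a worktree
        # or submodule checkout, so check both.
        if '.git' in dirs or '.git' in files:
            nested.append(root)
            dirs[:] = []  # prune: never descend into the nested repo
    return nested
```

With this shape, a repo found at modules/lib/open-amp is reported once and
nothing beneath it is visited, however many files it contains.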

Please see the use case here:
https://git.yoctoproject.org/meta-zephyr/tree/meta-zephyr-core/recipes-kernel/zephyr-kernel/zephyr-kernel-src-4.3.0.inc

${SRC_URI_ZEPHYR_OPEN_AMP};name=open-amp;nobranch=1;destsuffix=${P}/modules/lib/open-amp \
${SRC_URI_ZEPHYR_TRUSTED_FIRMWARE_M};name=trusted-firmware-m;nobranch=1;destsuffix=${P}/modules/tee/tf-m/trusted-firmware-m \

Thanks,
Jamin

> The code has also become difficult to parse. My rule of thumb is that if a
> group of lines needs a leading comment, it also needs an empty line before
> the comment to visually separate things.
> 
> > +            if nested_git_dirs:
> > +                excludes = [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
> > +                subprocess.check_output(['git', 'add', '-A', '.'] + excludes, cwd=s_dir, env=env)
> > +            else:
> > +                subprocess.check_output(['git', 'add', '-A', '.'], cwd=s_dir, env=env)
> 
> To simplify the code, construct a cmd variable and call
> subprocess.check_output(cmd, ...) once.
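
A minimal sketch of that suggestion, assuming the exclude pathspecs stay
as in the patch. build_git_add_cmd is a hypothetical name used only for
illustration; in the class the list would more likely be built inline.

```python
import os


def build_git_add_cmd(s_dir, nested_git_dirs):
    """Build one 'git add' command line, with excludes only when needed (hypothetical helper)."""
    cmd = ['git', 'add', '-A', '.']
    # Each nested repo becomes an ':(exclude)<relative path>' pathspec;
    # with no nested repos the list is simply left unchanged.
    cmd += [':(exclude)' + os.path.relpath(n, s_dir) for n in nested_git_dirs]
    return cmd
```

The if/else then collapses to a single call such as
subprocess.check_output(cmd, cwd=s_dir, env=env).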
> 
> >              git_sha1 = subprocess.check_output(['git', 'write-tree'], cwd=s_dir, env=env).decode("utf-8")
> > +            # Hash each nested git repo separately so source changes there still
> > +            # trigger do_compile/do_configure to re-run.
> > +            for nested in nested_git_dirs:
> > +                nested_git = os.path.join(nested, '.git')
> > +                if not os.path.isdir(nested_git):
> > +                    continue
> > +                with tempfile.NamedTemporaryFile(prefix='oe-devtool-nested-index') as nested_tmp:
> > +                    nested_index = os.path.join(nested_git, 'index')
> > +                    if os.path.exists(nested_index):
> > +                        shutil.copyfile(nested_index, nested_tmp.name)
> > +                    nested_env = os.environ.copy()
> > +                    nested_env['GIT_INDEX_FILE'] = nested_tmp.name
> > +                    proc = subprocess.Popen(['git', 'add', '-A', '.'], cwd=nested,
> > +                                           env=nested_env, stdout=subprocess.DEVNULL,
> > +                                           stderr=subprocess.DEVNULL)
> > +                    proc.communicate()
> > +                    proc = subprocess.Popen(['git', 'write-tree'], cwd=nested,
> > +                                           env=nested_env, stdout=subprocess.PIPE,
> > +                                           stderr=subprocess.DEVNULL)
> > +                    stdout, _ = proc.communicate()
> > +                    git_sha1 += stdout.decode("utf-8")
> 
> We should re-use the code from the following block which handles submodules
> instead of re-implementing the behaviour. Perhaps the common code needs to
> be refactored out.
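
One possible shape for such a shared helper is sketched below. It is an
assumption about how the refactor could look, not code from
externalsrc.bbclass: hash_git_tree and its run parameter are hypothetical,
it assumes .git is a directory (the patch skips nested repos where it is
not), and it uses subprocess.check_output, so a failing git command raises
instead of being silently ignored as in the patch.

```python
import os
import shutil
import subprocess
import tempfile


def hash_git_tree(repo_dir, run=subprocess.check_output):
    """Hash repo_dir's working tree via a throwaway index (hypothetical helper)."""
    with tempfile.NamedTemporaryFile(prefix='oe-devtool-index') as tmp_index:
        # Start from the repo's own index when it exists, as the patch does.
        real_index = os.path.join(repo_dir, '.git', 'index')
        if os.path.exists(real_index):
            shutil.copyfile(real_index, tmp_index.name)
        env = os.environ.copy()
        env['GIT_INDEX_FILE'] = tmp_index.name
        # Stage everything into the temporary index, then write it as a tree.
        run(['git', 'add', '-A', '.'], cwd=repo_dir, env=env)
        return run(['git', 'write-tree'], cwd=repo_dir, env=env).decode('utf-8')
```

Each nested repo could then contribute with git_sha1 += hash_git_tree(path)
in a single loop. The run parameter exists only so the helper can be
exercised without invoking git.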
> 
> Best regards,
> 
> --
> Paul Barker
