On Wed, Mar 18, 2026 at 6:45 PM Richard Purdie via lists.openembedded.org
<[email protected]> wrote:

> Hi,
>
> On Mon, 2026-03-09 at 12:57 +0100, Benjamin Robin via
> lists.openembedded.org wrote:
> > This series is an RFC and a follow-up to patch 6/6 ("Add class for
> > post-build CVE analysis"), which was previously discussed [1].
> > I have prepared two RFC series, this one and another, each exploring
> > different approaches to handling the download of CVE databases.
> >
> > I explored using BitBake's internal fetcher instead of direct Git calls
> > for fetching CVE databases. However, I encountered two major issues:
> >
> > - No proper shallow clone support: I wanted to clone the repository
> >   without downloading the entire history (which is very large). While
> >   `BB_GIT_SHALLOW` exists, it creates multiple tarballs in the download
> >   directory, which is inefficient for updates.
> >
> >   In this series, we are going to do a full clone of the git repository,
> >   so this point is not going to be fixed.
> >
> > - Performance overhead for CVE databases deployment: The recipes
> >   downloading CVE databases must copy them to the sysroot or to the
> >   deploy directory. This requires copying the extracted databases
> >   multiple times, even with hard links, which is slow due to the
> >   combined size (~6 GB, ~672,000 small files).
> >
> >   In this series, we are using a custom deploy task that copies the
> >   git repository with rsync directly into the final deploy
> >   directory, bypassing all the BitBake logic.
> >
> > Additionally, there's no built-in way to control the interval between
> > CVE database fetches. In this series, we are going to use AUTOREV,
> > which implies querying the git repositories on every build to check
> > whether there is a new git revision.
> >
> > Moreover, this series ensures that the CVE analysis runs only when
> > the original SBOM changes or when the CVE databases are updated.
> >
> > Upon revisiting the class and its associated recipes, I identified
> > several areas for improvement, which were fixed in the first commit.
> > This series also includes a second commit making the VEX class optional
> > rather than mandatory.
> >
> > [1] https://lore.kernel.org/all/[email protected]/
>
> I've just been trying to work out where we're at with this coming up to
> release and we need to get this resolved.
>
> I feel quite strongly that we need to use the fetcher for obtaining
> this data. "fetching" isn't trivial and is full of
> license/auditing/SBOM issues. Making any exception to that, even for
> CVE data, tends to become problematic later.
>
> The existing approach was only done as it was a sqlite database and we
> didn't have fetcher support for such a thing. If we need to improve the
> git fetcher in some way to better support this use case (e.g. shallow
> clone update efficiency), that is something we can work on.
>
> As such, I was wondering if you had newer versions of these patches?
>
> I'd note that we can't set AUTOREV by default, we'll need to specify a
> revision, and document how the user can enable AUTOREV in their config
> (maybe even a config fragment?). Whilst it is annoying to do that, it
> is a requirement that the system doesn't touch the network outside
> mirrors unless configured to.
>
>
Fetching the complete git repos has a number of problems. Why not use
release tarballs like those in
https://github.com/CVEProject/cvelistV5/releases ?
The FKIE feeds also have them:
https://github.com/fkie-cad/nvd-json-data-feeds/releases

CVE versions of those repositories are good for manual analysis, but a
simple check does not need all of that.

Also, I'm worried about the size explosion from the additional databases
that will be needed within the next 1-2 years. I also wouldn't assume
that all of them will have git mirrors.

For an analysis, I think it would be better to integrate the sources in a
database, but not a relational one (as was done with sqlite). An object
database corresponds better to what the data contains.

Kind regards,
Marta