Hi,

On Mon, 2026-03-09 at 12:57 +0100, Benjamin Robin via lists.openembedded.org 
wrote:
> This series is an RFC and a follow-up to patch 6/6 ("Add class for
> post-build CVE analysis"), which was previously discussed [1].
> I have prepared two RFC series, this one and another, each exploring
> different approaches to handling the download of CVE databases.
> 
> I explored using BitBake's internal fetcher instead of direct Git calls
> for fetching CVE databases. However, I encountered two major issues:
> 
> - No proper shallow clone support: I wanted to clone the repository
>   without downloading the entire history (which is very large). While
>   `BB_GIT_SHALLOW` exists, it creates multiple tarballs in the download
>   directory, which is inefficient for updates.
> 
>   In this series, we are going to do a full clone of the git repository,
>   so this point is not going to be fixed.
> 
> - Performance overhead for CVE databases deployment: The recipes
>   downloading CVE databases must copy them to the sysroot or to the
>   deploy directory. This requires copying the extracted databases
>   multiple times, even with hard links, which is slow due to the
>   combined size (~6 GB, ~672,000 small files).
> 
>   In this series, we are using a custom deploy task that is going to
>   copy the git repository using rsync directly into the final deploy
>   directory, bypassing all the BitBake logic.
> 
> Additionally, there's no built-in way to control the interval between
> CVE database fetches. In this series, we are going to use AUTOREV,
> which implies querying the git repositories on each build to check
> whether there is a new git revision.
> 
> Moreover, this series ensures that the CVE analysis runs only when
> the original SBOM changes or when the CVE databases are updated.
> 
> Upon revisiting the class and its associated recipes, I identified
> several areas for improvement, which were fixed in the first commit.
> This series also includes a second commit making the VEX class optional
> rather than mandatory.
> 
> [1] https://lore.kernel.org/all/[email protected]/

I've just been trying to work out where we're at with this coming up to
release and we need to get this resolved.

I feel quite strongly that we need to use the fetcher for obtaining
this data. "Fetching" isn't trivial and is full of
license/auditing/SBOM issues. Making any exception to that, even for
CVE data, tends to become problematic later.

The existing approach was only done that way because the data was an
SQLite database and we didn't have fetcher support for such a thing. If
we need to improve the git fetcher in some way to better support this
use case (e.g. shallow clone update efficiency), that is something we
can work on.

As such, I was wondering if you had newer versions of these patches?

I'd note that we can't set AUTOREV by default, we'll need to specify a
revision, and document how the user can enable AUTOREV in their config
(maybe even a config fragment?). Whilst it is annoying to do that, it
is a requirement that the system doesn't touch the network outside
mirrors unless configured to.
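
As a rough sketch of what I mean (the recipe name "cve-database-example"
and the revision placeholder are purely hypothetical, not taken from the
series), the recipe would carry a pinned revision and the user would opt
in to AUTOREV from local.conf or a config fragment:

```
# In the recipe: a fixed revision by default, so a normal build never
# needs to query upstream. "<fixed-revision>" is a placeholder, not a
# real commit hash.
SRCREV = "<fixed-revision>"

# In local.conf or a config fragment, only for users who explicitly
# want to track the latest upstream revision on each build.
# "cve-database-example" is a hypothetical recipe name.
SRCREV:pn-cve-database-example = "${AUTOREV}"
```

That keeps the default behaviour network-free outside of mirrors, and
makes following upstream a deliberate choice by the user.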

Cheers,

Richard







