Hi,
On Mon, 2026-03-09 at 12:57 +0100, Benjamin Robin via lists.openembedded.org
wrote:
> This series is an RFC and a follow-up to patch 6/6 ("Add class for
> post-build CVE analysis"), which was previously discussed [1].
> I have prepared two RFC series, this one and another, each exploring
> different approaches to handling the download of CVE databases.
>
> I explored using BitBake's internal fetcher instead of direct Git calls
> for fetching CVE databases. However, I encountered two major issues:
>
> - No proper shallow clone support: I wanted to clone the repository
> without downloading the entire history (which is very large). While
> `BB_GIT_SHALLOW` exists, it creates multiple tarballs in the download
> directory, which is inefficient for updates.
>
> In this series, we are going to do a full clone of the git repository,
> so this point is not going to be fixed.
>
> - Performance overhead for CVE databases deployment: The recipes
> downloading CVE databases must copy them to the sysroot or to the
> deploy directory. This requires copying the extracted databases
> multiple times, even with hard links, which is slow due to the
> combined size (~6 GB, ~672,000 small files).
>
> In this series, we are using a custom deploy task that is going to
> copy the git repository using rsync directly into the final deploy
> directory, bypassing all the BitBake logic.
>
> Additionally, there's no built-in way to control the interval between
> CVE database fetches: In this series, we are going to use AUTOREV,
> which implies querying the git repositories on each build to check
> whether there is a new git revision.
>
> Moreover, this series ensures that the CVE analysis runs only when
> the original SBOM changes or when the CVE databases are updated.
>
> Upon revisiting the class and its associated recipes, I identified
> several areas for improvement, which were fixed in the first commit.
> This series also includes a second commit making the VEX class optional
> rather than mandatory.
>
> [1]
> https://lore.kernel.org/all/[email protected]/
I've just been trying to work out where we're at with this coming up to
release, and we need to get this resolved.

I feel quite strongly that we need to use the fetcher for obtaining
this data. "Fetching" isn't trivial and is full of
license/auditing/SBOM issues. Making any exception to that, even for
CVE data, tends to become problematic later.

The existing approach was only done as it was a sqlite database and we
didn't have fetcher support for such a thing. If we need to improve the
git fetcher in some way to better support this use case (e.g. shallow
clone update efficiency), that is something we can work on.

As such, I was wondering if you had newer versions of these patches?

I'd note that we can't set AUTOREV by default; we'll need to specify a
revision, and document how the user can enable AUTOREV in their config
(maybe even a config fragment?). Whilst it is annoying to do that, it
is a requirement that the system doesn't touch the network outside
mirrors unless configured to.
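
As a rough sketch of that split (not taken from the series; the recipe
name `cve-db-example-native` and the revision placeholder are
hypothetical), the recipe would pin a fixed revision, and a user who
wants fresh data on every build would opt in from their own
configuration:

```
# In the recipe (hypothetical name): pin a known revision so that a
# default build never has to query the upstream repository.
SRCREV = "<pinned-commit-sha>"

# In the user's local.conf or a config fragment (opt-in only): follow
# the tip of the branch, which queries the remote on each build.
SRCREV:pn-cve-db-example-native = "${AUTOREV}"
```

With only the pinned revision in place, the fetch can be satisfied
from mirrors; the network is touched only once the override is added.
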
Cheers,
Richard