On Tue, Apr 08, 2025 at 07:24:07PM +0200, Sylvain Beucler wrote:
> Hi,

Hi Sylvain,

> On 07/04/2025 13:06, Adrian Bunk wrote:
> > On Sun, Apr 06, 2025 at 07:33:22PM +0200, Bastien Roucaries wrote:
> > > On Sunday, 6 April 2025, 09:25:58 Central European Summer Time, Roberto
> > > C. Sánchez wrote:
> > > ...
> > > > As one example, some time ago I encountered the issue of the size of
> > > > data/CVE/list, specifically in the context of a git blame operation
> > > > taking a few hours to complete. I became convinced that data/CVE/list
> > > > needs to be split. As I've done some research on the topic, the answer
> > > > to that is far from clear. I'm less convinced now that "split
> > > > data/CVE/list" is the de facto right solution, and I'm definitely
> > > > convinced that a big change here will not be accepted without many good
> > > > reasons and proof that doesn't also include some massive drawbacks.
> > >
> > > split per year will help here.
> > > ...
> >
> > Which is not easy, see
> > https://salsa.debian.org/security-tracker-team/security-tracker-service/-/issues/1
>
> Back then I put together a git-filter-branch rewrite&subdir of
> security-tracker/data/CVE/, to isolate triaged CVEs per-file and allow
> near-instant history/blame for any specific CVE, e.g.:
> $ gitk 2025/31115
>
> https://lists.debian.org/debian-lts/2020/10/msg00017.html
> https://lists.debian.org/debian-lts/2020/10/pngYP1m7tAWfw.png
>
> It took a day to run IIRC, it probably would take much longer now as
> data/CVE/list more than doubled and gets slower to process.
> Nobody gave a damn and I eventually removed it ¯\_(ツ)_/¯
>...

Roberto asking for "pain points" is the right approach here.

#908678, from 7 years ago, was about the security tracker repo being a
real pain for the Salsa admins. Apparently GitLab is now OKish with
that, but that could be double-checked with the Salsa admins.

"git blame data/CVE/list" takes ~10 minutes (not a few hours) on
reasonable current hardware, which is not nice but bearable.
Making what you did available again and better known might be the
best approach if "git blame" is the only problem: it offers a solution
for this specific problem without a huge rearchitecture.

> Cheers!
> Sylvain

cu
Adrian
