Re: [Idea] AUR vetting process using LLM models

kpcyrd Fri, 29 May 2026 07:56:39 -0700

On 5/29/26 8:22 AM, Shyamin Ayesh wrote:

> So here's my possibly unpopular suggestion: *what if we used LLMs as a
> first-pass filter for AUR submissions?*
>
> *The basic idea:*
>    - When a PKGBUILD or install script gets submitted, an LLM scans it for
> sketchy stuff like obfuscated code, curl pipes to random endpoints, crypto
> miners, encoded payloads, that kind of thing.
>    - It doesn't replace human review. It just flags the suspicious ones so
> reviewers know where to look first.
>    - Unlike regex-based scanners, LLMs can actually understand code intent.
> They can catch things like subtle dependency hijacking or weird
> post-install behavior that static tools would miss.
>    - Flagged packages go into a queue with the LLM's reasoning attached, not
> just "blocked" but why it thinks something is off.
>
> I get it, there are real concerns. False positives, inference costs, and
> honestly just the idea of putting AI anywhere near the trust pipeline. But
> I'm not saying replace anything. Just add a layer. Could be a server-side
> hook on submission, or a community bot that comments on new packages. I'm
> happy to help build a prototype if anyone's interested.
>
> I know some of you are going to hate this idea, and that's fine. But the
> spam problem is real and getting worse, so I figured it's worth putting out
> there. Open to better ideas too.

There's a public feed of the most recently updated packages, so anybody could build a system like this, no special access or permit needed:


https://aur.archlinux.org/packages?SeB=nd&SB=l&O=0&SO=d
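
A rough sketch of how a third-party tool might walk through that listing page by page. It only builds URLs and does no network access. It assumes the `O` query parameter is the result offset, which is my reading of the link above and not something verified here; `page_url` is a made-up helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The listing URL quoted above.
FEED_URL = "https://aur.archlinux.org/packages?SeB=nd&SB=l&O=0&SO=d"


def page_url(offset, base=FEED_URL):
    """Return the listing URL with the O parameter set to `offset`.

    Assumption: O is the result offset. All other parameters are kept as-is.
    """
    parts = urlsplit(base)
    query = dict(parse_qsl(parts.query))
    query["O"] = str(offset)
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    for offset in (0, 50, 100):
        print(page_url(offset))
```

A real poller would fetch these pages at a polite interval and remember which package versions it has already looked at.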

If you can produce valid findings, people are going to appreciate and act on them, but I don't think the Arch Linux org is going to run this, mostly because:


- tokens are expensive, and I don't think donations should be used for this
- actually programming and maintaining this isn't trivial

I believe it could work better than regular anti-virus scanners: the ELF executable in the browsh-bin incident was a BPF rootkit that went fully undetected by all anti-virus vendors, whereas having `npm install` in a post-install hook is highly unusual and could've been flagged. But speculation about how this would or wouldn't work doesn't help much if nobody is actually standing up to build and run this.
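
To make the `npm install` point concrete, here is a minimal sketch of the kind of check that could have raised a flag. It is a plain pattern match, not an LLM, and the rule set, the `Finding` type and `scan_install_script` are all hypothetical names picked for illustration. Each finding carries a reason, in the spirit of the "reasoning attached" idea from the proposal.

```python
import re
from dataclasses import dataclass

# Hypothetical rules: things that are unusual inside an install hook.
SUSPICIOUS_PATTERNS = {
    "runs a language package manager": re.compile(
        r"\b(npm|pip|pip3|gem|cargo)\s+install\b"
    ),
    "pipes a download into a shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(ba|z)?sh\b"
    ),
    "decodes an embedded payload": re.compile(r"\bbase64\s+(-d|--decode)\b"),
}


@dataclass
class Finding:
    line_no: int  # 1-based line number in the script
    reason: str   # human-readable explanation for the reviewer
    line: str     # the offending line, stripped of surrounding whitespace


def scan_install_script(text):
    """Return a list of findings for the text of an install script."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip whole-line shell comments
        for reason, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(stripped):
                findings.append(Finding(line_no, reason, stripped))
    return findings


if __name__ == "__main__":
    example = "post_install() {\n  npm install some-package\n}\n"
    for finding in scan_install_script(example):
        print(f"line {finding.line_no}: {finding.reason}: {finding.line}")
```

Checks like this are trivial to evade, so at best they are a cheap pre-filter in front of whatever does the actual analysis.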

If we do actually have a system in place that can produce high-quality findings, we could then look into how this could be integrated (or some Arch staff people may just subscribe to its RSS feed).


cheers,
kpcyrd
