Hi,

I'm opposed to this idea, not because it's AI, but because it's misapplied and likely to be very costly while producing non-deterministic results.
What you are proposing would have the AUR check every single commit through a DevSecOps-style CI/CD (SSDLC) pipeline, but with only one tool in it: a homebrewed AI solution.
It would be better to set up proper security tooling, or even integrate 3rd party malware detection services.
Reinventing SAST and DAST scanners here via self-hosted or costly third-party LLM providers seems like a very wrong approach.
A solution here should focus on three things: clustering and detection of compromised or malicious accounts for rapid takedowns or a moderation hold; prevention of malware being included, by blacklisting known IOCs in packages; and proper automated scanning of existing packages for malicious activity.
AI and LLMs have a role in supporting those three tasks as portions of the tools, not as the solution itself.
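To make the second of those tasks concrete, here is a minimal sketch of a deterministic known-IOC check over a PKGBUILD, assuming the IOCs are domains and SHA-256 digests. The function name `find_iocs` and the blocklist entries are hypothetical placeholders, not an existing AUR or Arch tool; a real setup would load its list from a maintained feed.

```python
import re

# Hypothetical blocklist entries, for illustration only; a real deployment
# would load these from a maintained threat-intelligence feed.
BLOCKED_DOMAINS = {"malicious.example"}
BLOCKED_SHA256 = {"0" * 64}

URL_HOST = re.compile(r"https?://([A-Za-z0-9.-]+)")
SHA256_HEX = re.compile(r"\b[0-9a-fA-F]{64}\b")


def find_iocs(pkgbuild_text):
    """Return a sorted list of (kind, value) blocklist hits in a PKGBUILD."""
    hits = set()
    for host in URL_HOST.findall(pkgbuild_text):
        host = host.lower().rstrip(".")
        # Match the blocked domain itself and any subdomain of it.
        for blocked in BLOCKED_DOMAINS:
            if host == blocked or host.endswith("." + blocked):
                hits.add(("domain", host))
    for digest in SHA256_HEX.findall(pkgbuild_text):
        if digest.lower() in BLOCKED_SHA256:
            hits.add(("sha256", digest.lower()))
    return sorted(hits)


# Made-up PKGBUILD fragment that references a blocked host and digest.
sample = (
    "pkgname=demo\n"
    'source=("https://cdn.malicious.example/payload.tar.gz")\n'
    "sha256sums=('" + "0" * 64 + "')\n"
)

for kind, value in find_iocs(sample):
    print(kind, value)
```

A check like this returns the same result for the same input on every run and costs almost nothing, which is the contrast with an LLM pass I am drawing above.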
You are welcome to scan the AUR yourself today and post "bot comments" on the maliciousness of packages, but I think any Arch-hosted or Arch-developed solution should stay away from spinning up such expensive infrastructure to reinvent the wheel.
Regards,

Shyamin Ayesh:
Hello Everyone,

I know this is going to be a controversial idea, and I'm not much of a writer, so bear with me here.

I've been noticing the recent wave of spam packages and malicious code submissions hitting the AUR lately. It's getting worse, and the current manual review process clearly doesn't scale.

So here's my possibly unpopular suggestion: *what if we used LLMs as a first-pass filter for AUR submissions?*

*The basic idea:*

- When a PKGBUILD or install script gets submitted, an LLM scans it for sketchy stuff like obfuscated code, curl pipes to random endpoints, crypto miners, encoded payloads, that kind of thing.
- It doesn't replace human review. It just flags the suspicious ones so reviewers know where to look first.
- Unlike regex-based scanners, LLMs can actually understand code intent. They can catch things like subtle dependency hijacking or weird post-install behavior that static tools would miss.
- Flagged packages go into a queue with the LLM's reasoning attached, not just "blocked" but why it thinks something is off.

I get it, there are real concerns. False positives, inference costs, and honestly just the idea of putting AI anywhere near the trust pipeline. But I'm not saying replace anything. Just add a layer. Could be a server-side hook on submission, or a community bot that comments on new packages.

I'm happy to help build a prototype if anyone's interested. I know some of you are going to hate this idea, and that's fine. But the spam problem is real and getting worse, so I figured it's worth putting out there. Open to better ideas too.

Cheers,
Shyamin
