On 5/29/26 2:56 AM, [email protected] wrote:
Pretty good idea you got there, however there's a legitimate concern on where and how you are going to be running it.
For instance, which model to use? Gemini? Deepseek? Open-sourced ChatGPT?
Secondly, where are we gonna get the money for that? Donations are enough for everything up to this point, but running an LLM will shoot costs through the roof. I suggest either distilling or training a separate algorithm (**NOT** a language model) to keep costs low, as it won't be a generalist and we won't be wasting RAM on storing parameters about medicine in a software project.

Those are good ideas and practical concerns. But since PKGBUILD files are just bash scripts, and we could probably suss out a fairly rigorous scan process given the brain trust on this list, it may not require a frontier-model LLM or its cost.

Why not come up with a set of criteria and turn those into sed, awk, etc. tests that capture suspicious submissions, which could simply be self-hosted and run for each new account/submission? All an LLM is going to do is take prompts that tell it to go put those things together and then run them (give or take). Some prompt wizard could see what the models spit out when told to generate an efficient set of scripts to test the criteria.
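
As a rough sketch of what one such test could look like (the three rule names and patterns below are hypothetical placeholders for illustration, not agreed criteria, and scan_pkgbuild is a made-up helper name), a table of grep patterns can be run over a PKGBUILD and every hit reported by line:

```shell
#!/usr/bin/env bash
# Sketch only: the rule names and patterns are illustrative assumptions,
# not an agreed list of criteria for AUR submissions.

# Each rule is emitted as "name<TAB>extended-regex", one per line.
rules() {
    printf '%s\t%s\n' \
        'pipe-to-shell' '(curl|wget)[^|]*\|[[:space:]]*(ba|z)?sh([[:space:]]|$)' \
        'base64-decode' 'base64[[:space:]]+(-d|--decode)' \
        'eval-usage'    '(^|[^[:alnum:]_])eval[[:space:]]'
}

# scan_pkgbuild FILE
# Prints "FILE:LINE:RULE" for every line that matches a rule.
# Returns 1 if anything matched, 0 if nothing did.
scan_pkgbuild() {
    local file=$1 hits=0 name regex lineno
    while IFS=$'\t' read -r name regex; do
        # grep -n prefixes each matching line with "LINENO:".
        while IFS=: read -r lineno _; do
            printf '%s:%s:%s\n' "$file" "$lineno" "$name"
            hits=1
        done < <(grep -nE -- "$regex" "$file")
    done < <(rules)
    return "$hits"
}
```

Running scan_pkgbuild on each new submission and sending any non-empty output to triage would be the whole loop. Patterns this simple will produce both false positives and misses, so the output is a flag for human review, not a verdict.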

A self-hosted tool that is 95% as good as an LLM, at zero cost, and that scales, seems like the best of both worlds.

The second part of that is moderator or trusted-user triage of any positives identified. That has to scale as well (hopefully not too much), but we don't want to take the moderators away from the job they otherwise do. Putting that out to the community in a StackOverflow "Queues"-type format may be an option to get member involvement, and that could be open to users with X number of years/months without a lot of trouble.

I don't mind helping from either a queue or a tool standpoint, but what we need to do is start gathering the criteria that need to be checked, starting with the postmortem from the recent attempts. I don't know how they would be integrated to run at the AUR level, but I can turn criteria into testable script fragments to help.

Something similar could be done for AUR account/package ownership changes, adoptions, etc.

Focusing on the actual package problem and eliminating the account-identity baggage is a good idea, and a homegrown solution is never a bad choice.


--
David C. Rankin, J.D.,P.E.
