Simon Richter <[email protected]> writes:

> But: we certainly cannot say the same for everyone who wants to
> contribute, so we need either

> 1. a stronger vetting process before giving people upload permission, or
> 2. a review process by which any contribution must be approved by other
> developers.

I think part of this discussion, and the point on which people are going
to need some convincing, is whether the third, existing option will work:
treating LLM-based errors the same as errors from any other source,
tackling them reactively, and taking some sort of protective action if we
get too many errors from the same contributor.

I understand the concern that this won't be adequate because the potential
volume of contributions using LLMs is so much higher, but do we have
evidence yet that it won't be adequate? What evidence do we have that we
have to handle LLM-based errors differently than other errors?

I'm not saying there is no evidence, or that there are no good arguments
here, only that this is the place where those arguments need to be made.
If LLM use in contributions is a quality problem large enough to cause
serious disruption unless we take some sort of action, that's a good
argument for acting, but presumably it would also be throwing off early
warning signs that we can point to.

Options 1 and 2 are both pretty intrusive and obnoxious changes to our
workflows. There's going to be a lot of reluctance to make such a change
unless we have to.

> "I can use this technology in a responsible way, so it needs to be
> allowed" is not a policy, and neither are "you can't stop me anyway!" or
> "any attempt to create policy will lead to [absurd outcome]."

I agree with all of this, for the record. Somewhat by definition, if LLMs
are causing significant problems, we will be able to see that and take
action against the contributors responsible. This is clearly a place
where we can create a policy and enforce it if we see problems from LLMs.
The missing piece for me is: *are* we seeing problems from LLMs?

I realize that's your point about wanting to be proactive rather than
reactive: even if we're not seeing problems now, our tools for handling
problems are bad enough that we should act proactively now. I'm not sure
I'm convinced by this. Maybe what we need instead is to be proactive
about having better tools to handle bad uploads?

> I am fairly sure not even the proponents of AI use would be happy with a
> policy of "AI use is allowed, but if you make a bad upload as a result
> of not doing a thorough review, we nuke your upload permissions and make
> you go through a tasks&skills check again", but I see very few options
> here.

I don't know; what's wrong with that? I'm not sure a single bad upload is
the right threshold, but that seems reasonable?

For me, it's a percentage thing: if you've made thousands of uploads and
you screw up one, everyone is human; it happens. If you've made two
uploads and your second one is a major mistake, half of your uploads are
major mistakes and maybe someone needs to look over what you're doing.

And yes, that does mean that someone who has a solid track record in
Debian but then goes through some change that causes their work to
significantly degrade (which could be all sorts of things from sleep
deprivation to other health issues, not just the impact of new tools) is
going to have a fair bit of leeway to screw up before someone does
something. I think that's basically correct and this is what unstable is
for, but we probably should have some threshold for intervention that's a
bit more proactive than we have right now.
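
To make that concrete, here's a minimal sketch (in Python, with entirely
made-up names and numbers, not a proposal for actual tooling) of the kind
of rate-based threshold I have in mind:

    # Hypothetical rate-based intervention threshold; the constants and
    # the needs_review name are illustrative only.

    MIN_UPLOADS = 10        # below this, a rate is too noisy to act on
    MAX_ERROR_RATE = 0.10   # flag if more than 10% of uploads were bad

    def needs_review(total_uploads: int, bad_uploads: int) -> bool:
        """True if a contributor's error rate suggests that someone
        should look over their recent work."""
        if total_uploads == 0:
            return False
        rate = bad_uploads / total_uploads
        if total_uploads < MIN_UPLOADS:
            # Small sample: only act on a drastic rate, e.g. half bad.
            return rate >= 0.5
        return rate > MAX_ERROR_RATE

    # One mistake in thousands of uploads: everyone is human.
    assert not needs_review(total_uploads=2000, bad_uploads=1)

    # A major mistake in one of your first two uploads: take a look.
    assert needs_review(total_uploads=2, bad_uploads=1)

The point isn't the specific numbers; it's that any threshold should be a
function of both the error rate and the size of the track record.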

-- 
Russ Allbery ([email protected])              <https://www.eyrie.org/~eagle/>
