On Tue, May 27, 2025 at 10:16:25AM -0700, Russ Allbery wrote:
> I would be worried about dropping the manual approval due to the sheer
> volume of sophisticated automated spam account creation attacks on any
> sort of authentication process with automatic sign-up.
>
> Right now, we are in the enviable position where there is essentially no
> spam via Salsa. I have seen what the level of spam looks like with an
> automated sign-up process, and it would probably make me disable all of my
> Salsa notifications, which would be a shame for other reasons. The only
> way that companies like GitHub claw their way back from that is by having
> a substantial anti-abuse team and a lot of constantly-tweaked automation
> to detect and defeat spam. It is very, very easy for anything on the
> Internet with public automated registration to immediately drown in SEO
> spam.
>
> Maybe there are more effective defenses than I am aware of (captcha
> methods are definitely not sufficient in my experience) that we would fall
> back on, and if the Salsa admins feel like this wouldn't be a problem, I
> would definitely yield to their much greater experience. But it's real bad
> out there in ways that I think the larger sites mostly hide because they
> put a lot of resources into spam detection and prevention that we don't
> have.

Having spent nine years running a merely medium-sized site of this kind (Launchpad), I certainly agree that this sort of thing needs continuous attention. While I was there, we ran automated tools that weeded out much of the most obvious spam, but it still typically needed a couple of hours a week of staff attention to keep the site's reputation from being trashed by hosting SEO spam (or worse), and I'd say we only did a middling job of it during my tenure. That amount of time is probably more representative of the likely workload for Debian than something like GitHub's efforts would be, but it might still be more than we have available.

And I agree that captcha-type methods aren't anywhere near sufficient. We frequently saw just a few spammy actions coming in from each of hundreds of different new accounts before each account was abandoned, suggesting either that somebody had deployed ML to solve captchas automatically or that they employed low-paid people to create a whole bunch of accounts manually.

That said, making it easy for established users to hand out invitation tokens seems pretty harmless from this point of view (although I don't know how much effort it would be to build or maintain).

--
Colin Watson (he/him)                              [cjwat...@debian.org]
