Re: [gentoo-dev] [RFC] Anti-spam for goose

Jaco Kroon Thu, 21 May 2020 07:28:20 -0700

Hi Michał,

On 2020/05/21 13:02, Michał Górny wrote:
> On Thu, 2020-05-21 at 12:45 +0200, Jaco Kroon wrote:
>> Even for v4, as an attacker ... well, as I'm sitting here right now I've
>> got direct access to almost a /20 (4096 addresses).  I know a number of
>> people with larger scopes than that.  Use bot-nets and the scope goes up
>> even more.
> See how unfair the world is!  You are filling your bathtub with IP
> addresses, and my ISP has taken mine only recently.
I must admit, I work for an ISP :$
>>>     Option 3: explicit CAPTCHA
>>>     ==========================
>>>     A traditional way of dealing with spam -- require every new system
>>>     identifier to be confirmed by solving a CAPTCHA (or a few identifiers
>>>     for one CAPTCHA).
>>>
>>>     The advantage of this method is that it requires a real human work
>>>     to be
>>>     performed, effectively limiting the ability to submit spam.
>>>
>> Yea.  One would think.  CAPTCHAs are massively intrusive and in my
>> opinion more effort than they're worth.
>>
>> This may be beneficial to *generate* a token.  In other words - when
>> generating a token, that token needs to be registered by way of capthca.
>>
>>>     Other ideas
>>>     ===========
>>>     Do you have any other ideas on how we could resolve this?
>>>
>> Generated token + hardware based hash.
> How are you going to verify that the hardware-based hash is real,
> and not just a random value created to circumvent the protection?


So the generation of the hash is more to validate that it's still on the
same installation (ie, not a cloned token).  Sorry if that wasn't clear,
so trying to solve two possible problems in one go.

>
>>   Rate limit the combination to 1/day.
>>
>> Don't use included results until it's been kept up to date for a minimum
>> period.  Say updated at least 20 times 30 days.
> For privacy reasons, we don't correlate the results.  So this is
> impossible to implement.

Ok, but a token cannot (unless we issue it based on an email based
account) be linked back to a specific user, so does it matter if we
associate uploads with a token?

>> The downside here is that many machines are not powered up at least once
>> a day to be able to perform that initial submission sequence.  So
>> perhaps it's a bit stringent.
> Exactly.  Even once a week is a bit risky but once a day is too narrow
> a period.
>
> To some degree, we could decide we don't care about exact numbers
> as much as some degree of weighed proportions.  This would mean that,
> say, people who submit daily get the count of 7, at the loss of people
> who don't run their machines that much.  It would effectively put more
> emphasis on more active users.  It's debatable whether this is desirable
> or not.
Decaying averages.  Simple to implement, don't need all historic data.
>
> Both the token and hardware hash can of course be tainted and is under
>> "attacker control".
> Exactly.  So it really looks like exercise for the sake of exercise.

Unless tokens are *issued* as per the rest of my email you snipped
away.  Wherein I proposed an issuing of both anonymous and non-anonymous
tokens.

Kind Regards,
Jaco

Re: [gentoo-dev] [RFC] Anti-spam for goose

Reply via email to