Hi Michał,

On 2020/05/21 13:02, Michał Górny wrote:
> On Thu, 2020-05-21 at 12:45 +0200, Jaco Kroon wrote:
>> Even for v4, as an attacker ... well, as I'm sitting here right now I've
>> got direct access to almost a /20 (4096 addresses). I know a number of
>> people with larger scopes than that. Use bot-nets and the scope goes up
>> even more.
> See how unfair the world is! You are filling your bathtub with IP
> addresses, and my ISP has taken mine only recently.

I must admit, I work for an ISP :$

>>> Option 3: explicit CAPTCHA
>>> ==========================
>>> A traditional way of dealing with spam -- require every new system
>>> identifier to be confirmed by solving a CAPTCHA (or a few identifiers
>>> for one CAPTCHA).
>>>
>>> The advantage of this method is that it requires a real human work
>>> to be performed, effectively limiting the ability to submit spam.
>>>
>> Yea. One would think. CAPTCHAs are massively intrusive and in my
>> opinion more effort than they're worth.
>>
>> This may be beneficial to *generate* a token. In other words - when
>> generating a token, that token needs to be registered by way of CAPTCHA.
>>
>>> Other ideas
>>> ===========
>>> Do you have any other ideas on how we could resolve this?
>>>
>> Generated token + hardware based hash.
> How are you going to verify that the hardware-based hash is real,
> and not just a random value created to circumvent the protection?

The hash is meant more to validate that the token is still on the same
installation (i.e., not a cloned token). Sorry if that wasn't clear; I was
trying to solve two possible problems in one go.

>
>> Rate limit the combination to 1/day.
>>
>> Don't use included results until it's been kept up to date for a minimum
>> period. Say updated at least 20 times in 30 days.
> For privacy reasons, we don't correlate the results. So this is
> impossible to implement.

OK, but a token cannot be linked back to a specific user (unless we issue
it against an email-based account), so does it matter if we associate
uploads with a token?

>> The downside here is that many machines are not powered up at least once
>> a day to be able to perform that initial submission sequence. So
>> perhaps it's a bit stringent.
> Exactly. Even once a week is a bit risky but once a day is too narrow
> a period.
>
> To some degree, we could decide we don't care about exact numbers
> as much as some degree of weighted proportions. This would mean that,
> say, people who submit daily get the count of 7, at the loss of people
> who don't run their machines that much. It would effectively put more
> emphasis on more active users. It's debatable whether this is desirable
> or not.

Decaying averages. Simple to implement, and they don't need all the
historic data.

>> Both the token and hardware hash can of course be tainted and is under
>> "attacker control".
> Exactly. So it really looks like exercise for the sake of exercise.

Unless tokens are *issued*, as per the rest of my email that you snipped
away, wherein I proposed issuing both anonymous and non-anonymous tokens.

Kind Regards,
Jaco
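
P.S. A minimal sketch of the decaying-average idea mentioned above, in
Python. It only illustrates why no historic data is needed: each counter
keeps a single value and the time of its last update. The class name
DecayingCounter, the 7-day half-life and measuring time in days are all
assumptions made for this example, not part of any existing
implementation.

```python
import math


class DecayingCounter:
    """Exponentially decaying count (hypothetical helper for illustration).

    Only the current value and the time of the last update are stored,
    so individual past submissions never need to be kept or correlated.
    """

    def __init__(self, half_life_days: float = 7.0) -> None:
        # After one half-life, an old contribution counts for half as much.
        self.decay_rate = math.log(2) / half_life_days
        self.value = 0.0
        self.last_update = 0.0  # time of last update, in days

    def _decay_to(self, now: float) -> None:
        # Shrink the stored value for the time elapsed since the last update.
        elapsed = now - self.last_update
        if elapsed > 0:
            self.value *= math.exp(-self.decay_rate * elapsed)
            self.last_update = now

    def add(self, now: float, weight: float = 1.0) -> None:
        # Record one submission at time `now`.
        self._decay_to(now)
        self.value += weight

    def read(self, now: float) -> float:
        # Current decayed value, without modifying the stored state.
        elapsed = max(0.0, now - self.last_update)
        return self.value * math.exp(-self.decay_rate * elapsed)
```

With something along these lines, a machine that submits daily settles at
a bounded weight rather than a raw count of 7 per week, and a machine that
stops submitting fades out on its own. Whether that weighting is desirable
is the same open question as above.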
