[freenet-dev] Question about an important design decision of the WoT plugin

Evan Daniel Wed, 13 May 2009 14:32:45 -0400

On Wed, May 13, 2009 at 12:58 PM, Matthew Toseland
<toad at amphibian.dyndns.org> wrote:
> On Wednesday 13 May 2009 15:47:24 Evan Daniel wrote:
>> On Wed, May 13, 2009 at 9:03 AM, Matthew Toseland
>> <toad at amphibian.dyndns.org> wrote:
>> > On Friday 08 May 2009 02:12:21 Evan Daniel wrote:
>> >> On Thu, May 7, 2009 at 6:33 PM, Matthew Toseland
>> >> <toad at amphibian.dyndns.org> wrote:
>> >> > On Thursday 07 May 2009 21:32:42 Evan Daniel wrote:
>> >> >> On Thu, May 7, 2009 at 2:02 PM, Thomas Sachau <mail at tommyserver.de> wrote:
>> >> >> > Evan Daniel wrote:
>> >> >> >> I don't have any specific ideas for how to choose whether to ignore
>> >> >> >> identities, but I think you're making the problem much harder than it
>> >> >> >> needs to be.  The problem is that you need to prevent spam, but at the
>> >> >> >> same time prevent malicious non-spammers from censoring identities who
>> >> >> >> aren't spammers.  Fortunately, there is a well documented algorithm
>> >> >> >> for doing this: the Advogato trust metric.
>> >> >> >>
>> >> >> >> The WoT documentation claims it is based upon the Advogato trust
>> >> >> >> metric.  (Brief discussion: http://www.advogato.org/trust-metric.html
>> >> >> >> Full paper: http://www.levien.com/thesis/compact.pdf )  I think this
>> >> >> >> is wonderful, as I think there is much to recommend the Advogato
>> >> >> >> metric (and I pushed for it early on in the WoT discussions).
>> >> >> >> However, my understanding of the paper and what is actually
>> >> >> >> implemented is that the WoT code does not actually implement it.
>> >> >> >> Before I go into detail, I should point out that I haven't read the
>> >> >> >> WoT code and am not fully up to date on the documentation and
>> >> >> >> discussions; if I'm way off base here, I apologize.
>> >> >> >
>> >> >> > I think you are:
>> >> >> >
>> >> >> > The advogato idea may be nice (I did not read it myself), if you have
>> >> >> > exactly 1 trustlist for everything. But xor wants to implement 1
>> >> >> > trustlist for every app, as people may act differently e.g. on
>> >> >> > filesharing than on forums or while publishing freesites. You basically
>> >> >> > don't want to censor someone just because he tries to disturb
>> >> >> > filesharing while he may be trying to bring in good arguments at
>> >> >> > forum discussions about it.
>> >> >> > And I don't think that advogato will help here, right?
>> >> >>
>> >> >> There are two questions here.  The first question is given a set of
>> >> >> identities and their trust lists, how do you compute the trust for an
>> >> >> identity the user has not rated?  The second question is, how do you
>> >> >> determine what trust lists to use in which contexts?  The two
>> >> >> questions are basically orthogonal.
>> >> >>
>> >> >> I'm not certain about the contexts issue; Toad raised some good
>> >> >> points, and while I don't fully agree with him, it's more complicated
>> >> >> than I first thought.  I may have more to say on that subject later.
>> >> >>
>> >> >> Within a context, however, the computation algorithm matters.  The
>> >> >> Advogato idea is very nice, and imho much better than the current WoT
>> >> >> or FMS answers.  You should really read their simple explanation page.
>> >> >> It's really not that complicated; the only reason I'm not fully
>> >> >> explaining it here is that it's hard to do without diagrams, and they
>> >> >> already do a good job of it.
>> >> >
>> >> > It's nice, but it doesn't work. Because the only realistic way for positive
>> >> > trust to be assigned is on the basis of posted messages, in a purely casual
>> >> > way, and without the sort of permanent, universal commitment that any
>> >> > pure-positive-trust scheme requires: If he spams on any board, if I ever gave
>> >> > him trust and haven't changed that, then *I AM GUILTY* and *I LOSE TRUST* as
>> >> > the only way to block the spam.
>> >>
>> >> How is that different than the current situation?  Either the fact
>> >> that he spams and you trust him means you lose trust because you're
>> >> allowing the spam through, or somehow the spam gets stopped despite
>> >> your trust -- which implies either that a lot of people have to update
>> >> their trust lists before anything happens, and therefore the spam
>> >> takes forever to stop, or it doesn't take that many people to censor
>> >> an objectionable but non-spamming poster.
>> >>
>> >> I agree, this is a bad thing.  I'm just not seeing that the WoT system
>> >> is *that* much better. ?It may be somewhat better, but the improvement
>> >> comes at a cost of trading spam resistance vs censorship ability,
>> >> which I think is fundamentally unavoidable.
>> >
>> > So how do you solve the contexts problem? The only plausible way to add trust
>> > is to do it on the basis of valid messages posted to the forum that the user
>> > reads. If he posts nonsense to other forums, or even introduces identities
>> > that spam other forums, the user adding trust probably does not know about
>> > this, so it is problematic to hold him responsible for that. In a positive
>> > trust only system this is unsolvable afaics?
>> >
>> > Perhaps some form of feedback/ultimatum system? Users who are affected by spam
>> > from an identity can send proof that the identity is a spammer to the users
>> > they trust who trust that identity. If the proof is valid, those who trust
>> > the identity can downgrade him within a reasonable period; if they don't do
>> > this they get downgraded themselves?
>>
>> I don't have an easy solution for the contexts issue.  As I see it,
>> there are several related but distinct issues:
>> -- Given a set of trust ratings, what is the algorithm to compute
>> trust for a distant node? (Advogato vs current WoT / FMS vs something
>> else)
>> -- What trust ratings should we use?  (AKA the contexts problem.)
>> -- How do identity introductions work?
>>
>> Contexts and introductions are probably hard to get right with either
>> trust metric; which metric you use has some effect on the correct
>> answer to those problems, but the effect should be small in
>> comparison.  I think some of the discussion in this thread has been
>> conflating the choice of metric with the other two problems more than
>> is required.
>
> The current system (negative trust being allowed) does provide a partial
> solution to the contexts problem.
>>
>> The feedback system could be simpler than you suggest, I think.  If I
>> publish a list of people I have marked as spammers, and you trust me,
>> then when you see someone added to my list who you have marked as
>> trusted, you can look at their recent postings and make a decision.
>> (The notification being automated, presumably.)
>
> It's not the postings that matter. The postings I have seen are valid. It's
> probably the spam identities he trusts. And if not them, then it's the
> messages he posts to other boards - which conceivably we could ask the user
> to look at.


IMHO these are not solutions to the contexts problem -- they merely
shift the balance between allowing spam and allowing censorship.  In
one case, the attacker can build trust in one context and use it to
spam a different context.  In the other case, he can build trust in
one context and use it to censor in another.

Right now, the only good answer I see to contexts is to make them
fully independent.  Perhaps I missed it, but I don't recall a
discussion of how any other option would work in any detail -- the
alternative under consideration appears to be to treat everything as
one unified context.  I'm not necessarily against that, but the
logical conclusion is that you're responsible for paying attention to
everything someone you've trusted does in all contexts in which you
trust them -- which, for a unified context, means everywhere.
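
To make the "fully independent" option concrete, here is a minimal
sketch in Python.  It is only an illustration under my own assumptions:
the class name, the method names and the context strings are all
hypothetical and are not taken from the WoT or FMS code.

```python
from collections import defaultdict


class ContextTrust:
    """Hypothetical store that keeps one trust list per context."""

    def __init__(self):
        # context name -> set of identity names trusted in that context
        self._trusted = defaultdict(set)

    def trust(self, context, identity):
        # Trust is recorded for exactly one context.
        self._trusted[context].add(identity)

    def is_trusted(self, context, identity):
        # Trust earned elsewhere is never consulted here.
        return identity in self._trusted.get(context, set())


store = ContextTrust()
store.trust("forums", "carol")
```

With this layout, trust that "carol" earned on the forums gives her
nothing in the filesharing context, so it can be used neither to spam
nor to censor there.  The price is that she has to earn trust again in
every context.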

>>
>> >> There's another reason I don't see this as a problem: I'm working from
>> >> the assumption that if you can force a spammer to perform manual
>> >> effort on par with the amount of spam he can send, then the problem
>> >> *has been solved*.  The reason email spam and Frost spam are a problem
>> >> is not that there are lots of spammers; there aren't.  It's that the
>> >> spammers can send colossal amounts of spam.
>> >
>> > Agreed. However positive trust as currently envisaged does not have this
>> > property, because spammers can gain trust by posting valid messages and then
>> > use it to introduce spamming identities. Granted there is a limited capacity,
>> > but they can gain lots of trust by posting, and can therefore send a lot of
>> > spam via their trusted identities: the multiplier is still pretty good,
>> > although maybe not hideous.
>>
>> For the majority of users, the spammer will be far from the trust tree
>> root, and therefore have small capacity.  In order to introduce lots
>> of fake identities, they have to move higher up the tree.  Doing so
>> requires increased manual effort.  So introducing more fake identities
>> requires increased manual effort.
>
> True... but the effect of capacity is to control the number of spam identities
> they can run at any given time. It doesn't prevent those identities from
> spamming continually up to the autodetection limit.

Yes and no.  If an identity posts 100 pieces of spam to a board I'm
reading, it only requires my intervention once when I mark the first
piece as spam; that's no worse than if he'd posted 1 piece of spam.
OTOH, that does represent 100 keys my node has to fetch.  I suspect
that can be solved by tweaking the retrieval limit rules.
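
A rough sketch of what I mean, in Python.  The per-identity cap and
every name in it are assumptions made for illustration; this is not how
the existing retrieval limit rules are written.

```python
class LocalFilter:
    """Hypothetical local killfile plus a per-identity fetch cap."""

    def __init__(self, fetch_limit_per_identity):
        self.fetch_limit = fetch_limit_per_identity
        self.spammers = set()   # identities the user has marked as spam
        self.fetched = {}       # identity -> keys fetched this period

    def mark_spam(self, identity):
        # One manual action covers every later message from the identity.
        self.spammers.add(identity)

    def should_fetch(self, identity):
        if identity in self.spammers:
            return False
        count = self.fetched.get(identity, 0)
        if count >= self.fetch_limit:
            return False        # cap reached, skip the remaining keys
        self.fetched[identity] = count + 1
        return True


filt = LocalFilter(fetch_limit_per_identity=3)
decisions = [filt.should_fetch("flooder") for _ in range(5)]
filt.mark_spam("known_spammer")
```

Here only the first three keys from "flooder" are fetched, and nothing
at all is fetched from an identity once it has been marked.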

>>
>> Also, I don't see how this attack is specific to the Advogato metric.
>> It works equally well in WoT / FMS.  The only thing stopping it there
>> is users manually examining each other's trust lists to look for such
>> things.  If you assume equally vigilant users with Advogato the attack
>> is irrelevant.
>
> It is solvable with positive trust, because the spammer will gain trust from
> posting messages, and lose it by spamming. The second party will likely be
> the stronger in most cases, hence we get a zero or worse outcome.

Which second party?

>>
>> >> The solution, imho, is mundane: if the occasional trusted identity
>> >> starts a spam campaign, I mark them as a spammer.  This is optionally
>> >> published, but can be ignored by others to maintain the positive trust
>> >> aspects of the behavior.  Locally, it functions as a slightly stronger
>> >> killfile: their messages get ignored, and their identity's trust
>> >> capacity is forced to zero.
>> >
>> > Does not protect against a spammer's parent identity introducing more
>> > spammers. IMHO it is important that if an identity trusts a lot of spammers
>> > it gets downgraded - and that this be *easy* for the user.
>>
>> The Advogato algorithm protects against this, though passively.  The
>> spammer's node has a capacity limit based on how far from the tree
>> root it is.  How much trust it has is limited by how much it receives
>> from upstream nodes, capped by its capacity.  Regardless of the number
>> of identities it trusts, the number that get accepted as trusted is
>> limited by the amount of trust the main node can get.  The only issue
>> I see is that you would want to limit the churn rate -- it does no
>> good to have limited the spammer to 5 child identities if he can send
>> a few messages, unmark those identities, and mark some new ones.
>
> This is logical.
>>
>> Having a way for me to easily realize that an identity I have trusted
>> is trusting spammers is important.  However, I think part of avoiding
>> censorship is making this be a manually verified process, and *not* an
>> automated, recursive part of the normal computation algorithm.
>
> Yes, we should ask the user, but the converse is if a user doesn't visit his
> node for a while, everyone will have blacklisted him because he didn't
> blacklist the spammers he trusted.

Isn't the same true with other metrics?  If someone trusts spammers in
WoT, I'll mark them down?  You tackle this problem with rules that
attempt to ignore trust lists from people who aren't active, or by
deciding that out of date trust lists shouldn't be trusted, and
therefore the blacklisting is appropriate.  For example, if I mark
someone as a spammer, WoT could then know to start ignoring trust
lists from people who trust him and haven't been active in the last
[time period].  Obviously you want to keep using trust lists that are
still accurate and belong to people who are merely lurking, so the
problem has some subtlety to it, but I don't think it's a hard one.
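
As a minimal sketch of such a rule (Python; the field names, the
day-number clock and the idle threshold are all hypothetical, and WoT
does not necessarily store anything in this form):

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    name: str
    last_active_day: int                      # assumed: day of last activity
    trusts: set = field(default_factory=set)  # names this identity trusts


def usable_trust_lists(identities, marked_spammers, today, max_idle_days):
    """Return names of identities whose trust lists we keep using."""
    usable = []
    for ident in identities:
        trusts_spammer = bool(ident.trusts & marked_spammers)
        idle = today - ident.last_active_day > max_idle_days
        # Drop a list only when it is both stale and known to be wrong.
        if trusts_spammer and idle:
            continue
        usable.append(ident.name)
    return usable


people = [
    Identity("lurker", last_active_day=10, trusts={"bob"}),
    Identity("stale", last_active_day=10, trusts={"spammer"}),
    Identity("active", last_active_day=99, trusts={"spammer"}),
]
kept = usable_trust_lists(people, {"spammer"}, today=100, max_idle_days=30)
```

The lurker's list stays in use because nothing in it is known to be
wrong.  The active identity that still trusts the spammer is left for
the normal manual process rather than being dropped automatically.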

>>
>> >> In the context of the routing and data store algorithms, Freenet has a
>> >> strong prejudice against alchemy and in favor of algorithms with
>> >> properties that are both useful and provable from reasonable
>> >> assumptions, even though they are not provably perfect.  Like routing,
>> >> the generalized trust problem is non-trivial.  Advogato has such
>> >> properties; the current WoT and FMS algorithms do not: they are
>> >> alchemical.  In addition, the Advogato metric has a strong anecdotal
>> >> success story in the form of the Advogato site (I've not been active
>> >> on FMS/Freetalk recently enough to speak to them).  Why is alchemy
>> >> acceptable here, but not in routing?
>> >
>> > Because the provable metrics don't work for our scenario. At least they don't
>> > work given the current assumptions and formulations.
>>
>> Could you be more specific?  This thread is covering several closely
>> related but distinct subjects, so I'm not really sure exactly which
>> assumptions you're referring to.  Also, do you mean that they don't
>> work in the sense that the proof is no longer applicable or
>> mathematically valid, or in the sense that the results of the proof
>> aren't useful?
>
> The latter. Pure positive only works if every user can be trusted to
> continually evaluate his peers' messages to all contexts, and their
> relationships to other users, and can therefore be blocked if they propagate
> messages of spammers.

OK.

I think you really mean "Pure positive only works *perfectly* if every
user..."  We don't need a perfect system that stops all spam and
nothing else.  Any system will have some failings.  Minimizing those
failings should be a design goal, but knowing where we expect those
failings to be, and placing them where we want them, is also an
important goal.

Or, looked at another way:  We have ample evidence that people will
abuse the new identity creation process to post spam.  That is a
problem worth expending significant effort to solve.  Do we have
evidence that spammers will actually exert per-identity manual effort
in order to send problematic amounts of spam?  Personally, I'm not
worried about there being a little bit of spam; I'm worried about it
overwhelming the discussions and making the system unusable.  My
intuition tells me that we need defenses against such attacks, but
that they can be fairly minimal -- provided the defenses against
new-identity labor-free spam are strong.

However, I've seen enough flames flying over the issue of mob
censorship that I believe that problem to be real and worth worrying
about.  In the absence of a system that simultaneously solves both
problems (something which I suspect is, at a fairly fundamental level,
not doable), I am inclined to place the strengths of the system in
avoiding new-identity spam and avoiding mob censorship, and decide
that having the inevitable weaknesses be against attacks by
established, trusted identities is acceptable.

Evan Daniel
