Jusa Saari wrote:

If I understood correctly, then currently, when a node queryrejects, the
QR goes back in line to the node that originated the request, and that
node will then send it to a different node. If this is true, then it has
the potential to royally mess up the network.


Assume a network where a significant amount of requests result in QR. Every time a request hits a queryrejecting node, it gets bounced back to the node that requested it. That node then resends the request to a different node.


z--a--b--c--d | | f e


Node d is queryrejecting. All of A's requests will first travel through b and c, be rejected by d, and travel through c and b back to a. Because a didn't get the data it requested, it will repeat it's request to up to n different nodes. N is potentially infinite, because, even if the node itself limits the number of nodes tried, whatever software is originating the queries (for example, FUQID) will simply keep retrying untill it gets the data. Failure tables won't really help; FUQID is trying to download a 600 MB file, and by the time it retries whatever block failed, the time to block has ended. In other words, finding the data only after several queryrejects causes more load to the network than finding it without qyeryrejects (because we bother more nodes if someone qyeryrejects).

The important thing to notice here is that passing the query causes load
in b and c, bringing them closer to queryrejecting themselves. This,
combined with the the above, means that the more queryrejecting nodes
there is, the more likely any given node is to queryreject. The end result
is a cascade failure: when the load rises high enough, some nodes start to
queryreject, which will cause more load to other nodes (localized mostly
near them in network topology; since b returned queryreject, a will next
try z), which causes some other nodes to fail, and so on.

Imagine, if you will, a sweeping wave of queryrejection going through the
entire network. Each node going to queryrejection mode will increse the
load on nearby nodes (not it's immediate neighbours, but on node z on the
other side of the requesting node), causing them to fail too. However,
once every node in a given area (in network topology) starts
queryrejecting, the requests will get stopped in their tracks, the load
will go down, and the nodes will stop queryrejecting. Then imagine several
such "query-rejection waves" interacting in a 50-dimensional network
topology (assuming that every node has fifty other nodes it routes
everything; actually, since some dimensions are one-way (someone can route
to us without being on our routing table) and other complications, saying
it's 50-dimensional is a huge oversimplification). The end result would
then be like in a crosswave; a node would fluctuate seemingly randomly
near the upper end of its maximum possible load, with sudden seemingly
random huge spikes of overload and load drops. And this behaviour will
only get worse as Freenet gets bigger (more starting points for waves,
more "wavelengths" possible, more complex multi-dimensional structure).

This is obviously bad, since it is *impossible* to predict nodes behaviour
in a crosswave, and making reliable prediction of routing times impossible
makes NGRouting useless. So, this behaviour has to be stopped. But how ?

One possible solution would be more tenious intermediate nodes. When d
queryrejects, c should try e next, and only after exhausting its
possibilities should it return QR. Similarly, b should try f before giving
up. Please note, however, that we probably want to limit this "elasticity"
to trying two or three nodes before giving up; otherwise there's a danger
that the query goes through the entire routing table and ends up being
routed to the worst possible node, instead of second or third best.

The other thing to do is to include a "time to live" to queryrejects; when
c gets the queryrejection from d, it should now know not to bother d again
for n seconds, freeing d from useless load. However, since c itself is not
overloaded, it should not pass any time to live if/when it returns the
queryreject to b.

Also, the appointment method discussed some time ago would probably be a
good idea.

My appointment method is dead, since method that depends on negative trust will fail since identity is free. (See the other threads to see wtf that means...)



Anyway, the key is to stop routing to queryrejecting nodes *immediately* and *completely* for a time specified by them, to allow the network to route around the waves, hopefully stopping the cascade reaction. This does not create any new exploits, since the nodes asking for a little peace and quiet aren't getting any extra power over others; they could simply queryreject even without this extra ability.


Comments ?

Less QR GOOOOOD! More QR BAAAAAAAAAAAD!! :)


I agree with the goal you stated more eloquently than me.

Your "time to live" QR could work, and sounds similar to my "public quotas" (see "Rethinking load-balancing") which is itself based on Ian's per-node quotas (or at least I think so).

-Martin


_______________________________________________ Devl mailing list [EMAIL PROTECTED] http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl

Reply via email to