If I understood correctly, then currently, when a node queryrejects, the QR travels back down the line to the node that originated the request, and that node then sends the request to a different node. If this is true, then it has the potential to royally mess up the network.
Assume a network where a significant amount of requests result in QR. Every time a request hits a queryrejecting node, it gets bounced back to the node that requested it. That node then resends the request to a different node.
z--a--b--c--d
      |  |
      f  e
Node d is queryrejecting. All of a's requests will first travel through b and c, be rejected by d, and travel through c and b back to a. Because a didn't get the data it requested, it will repeat its request to up to n different nodes. n is potentially unbounded, because even if the node itself limits the number of nodes tried, whatever software is originating the queries (for example, FUQID) will simply keep retrying until it gets the data. Failure tables won't really help; FUQID is trying to download a 600 MB file, and by the time it retries whatever block failed, the failure-table entry for that block has already expired. In other words, finding the data only after several queryrejects causes more load on the network than finding it without queryrejects (because every queryreject means we bother more nodes).
The important thing to notice here is that passing the query causes load on b and c, bringing them closer to queryrejecting themselves. This, combined with the above, means that the more queryrejecting nodes there are, the more likely any given node is to queryreject. The end result is a cascade failure: when the load rises high enough, some nodes start to queryreject, which causes more load on other nodes (localized mostly near them in network topology; since b returned queryreject, a will next try z), which causes some other nodes to fail, and so on.
Imagine, if you will, a sweeping wave of queryrejection going through the entire network. Each node going into queryrejection mode will increase the load on nearby nodes (not on its immediate neighbours, but on node z on the other side of the requesting node), causing them to fail too. However, once every node in a given area (in network topology) starts queryrejecting, the requests will get stopped in their tracks, the load will go down, and the nodes will stop queryrejecting. Then imagine several such "query-rejection waves" interacting in a 50-dimensional network topology (assuming that every node has fifty other nodes it routes everything to; actually, since some dimensions are one-way (someone can route to us without being on our routing table), among other complications, saying it's 50-dimensional is a huge oversimplification). The end result would then be like a crosswave: a node would fluctuate seemingly randomly near the upper end of its maximum possible load, with sudden, seemingly random huge spikes of overload and load drops. And this behaviour will only get worse as Freenet gets bigger (more starting points for waves, more "wavelengths" possible, more complex multi-dimensional structure).
This is obviously bad, since it is *impossible* to predict a node's behaviour in a crosswave, and making reliable prediction of routing times impossible makes NGRouting useless. So this behaviour has to be stopped. But how?
One possible solution would be more tenacious intermediate nodes. When d queryrejects, c should try e next, and only after exhausting its possibilities should it return QR. Similarly, b should try f before giving up. Please note, however, that we probably want to limit this "elasticity" to trying two or three nodes before giving up; otherwise there's a danger that the query goes through the entire routing table and ends up being routed to the worst possible node, instead of the second or third best.
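To sketch what I mean (this is illustrative pseudocode, not actual Freenet node code; all names are made up), the "elastic" fallback could look roughly like this: walk the routing table best-first, but bail out after a small fixed number of rejections rather than draining the whole table:

```python
# Hypothetical sketch of "elastic" intermediate-node routing: on a
# QueryRejected from the best peer, try a bounded number of alternates
# before giving up and returning QR upstream ourselves.

MAX_ALTERNATES = 3  # don't walk the whole routing table

def route_with_fallback(query, ranked_peers, send):
    """ranked_peers: peers ordered best-first by the routing metric.
    send(peer, query) returns True if the peer accepts the query,
    False if it answers with QueryRejected."""
    for peer in ranked_peers[:MAX_ALTERNATES]:
        if send(peer, query):
            return peer   # query accepted downstream of this peer
    return None           # bounded options exhausted: return QR upstream
```

The bound is the important part: it keeps a burst of QRs from pushing queries all the way down to the worst entries in the routing table.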
The other thing to do is to include a "time to live" in queryrejects; when c gets the queryrejection from d, it should know not to bother d again for n seconds, freeing d from useless load. However, since c itself is not overloaded, it should not pass along any time to live if/when it returns the queryreject to b.
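A minimal sketch of that bookkeeping (assuming the QR message carries a TTL in seconds; the class and field names here are hypothetical, not real Freenet code) is just a per-peer deadline table that routing consults before sending:

```python
import time

# Hypothetical per-peer backoff table: when a peer queryrejects with a
# time-to-live, we record a deadline and refuse to route to that peer
# until the deadline passes. A QR with no TTL (ttl_seconds == 0), such
# as one merely relayed by a non-overloaded node, sets no deadline.

class BackoffTable:
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.until = {}  # peer -> earliest time we may route to it again

    def on_query_rejected(self, peer, ttl_seconds):
        if ttl_seconds > 0:
            self.until[peer] = self.clock() + ttl_seconds

    def is_routable(self, peer):
        deadline = self.until.get(peer)
        return deadline is None or self.clock() >= deadline
```

So in the diagram above, c would call on_query_rejected("d", ttl) and skip d for the duration, while b, receiving the relayed QR from c with no TTL, keeps c fully routable.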
Also, the appointment method discussed some time ago would probably be a good idea.
My appointment method is dead, since any method that depends on negative trust will fail when identity is free. (See the other threads to see wtf that means...)
Anyway, the key is to stop routing to queryrejecting nodes *immediately* and *completely* for a time specified by them, to allow the network to route around the waves, hopefully stopping the cascade reaction. This does not create any new exploits, since the nodes asking for a little peace and quiet aren't getting any extra power over others; they could simply queryreject even without this extra ability.
Comments ?
Less QR GOOOOOD! More QR BAAAAAAAAAAAD!! :)
I agree with the goal, which you stated more eloquently than I did.
Your "time to live" QR could work, and sounds similar to my "public quotas" (see "Rethinking load-balancing") which is itself based on Ian's per-node quotas (or at least I think so).
-Martin
_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
