> Instead, we need to get routing working well. I claim big routing
> tables will help that. At least, we should try it. Can anyone tell
> me why big routing tables won't help routing?
I think we need to step back and think about the problem some. We don't have a good idea why we get so many DNFs, and I suspect the design isn't better than random routing. I don't think we have numbers that prove how much better than random routing we are doing, or how close we are to a good theoretical estimate of how good it should be. We can increase the table size, but I don't think it will change anything. Anyone is free to run a node with a large table and report their experience back; I'd love to see the numbers.

I'd like to see evidence that specialization works. I'd love to see all nodes specializing. I'd love to see routing working better than random routing. I'd love to see people tackle the latency problem.

For example, maybe flooding the network with routing answers during insert is better than what we have now. That way, either the routing answer is found on disk, or it isn't and we can immediately DNF it without any request traffic to other nodes. I've not seen a proof that this works less well than the measured actual performance of freenet today. I know it would eliminate 224 messages out of 225, based on traffic to the heaviest node in my routing table. The question is: would it add more than 224 messages? That I don't know. I do know that it would drop latencies to 20ms for a DNF (about 230x better) and to 20ms for a search success (about 6224x better). These numbers are taken from the two heaviest entries in my routing table, by message count.

A freenet that can start getting data 100x to 1000x faster would be better than what we have today. Some might say this doesn't scale. True, but that isn't relevant; what is relevant is: does it scale better than what we have today? Also, I can think of trivial ways to enhance it to scale better.
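To make the claimed factors concrete, here is a back-of-envelope check using only the numbers above; the "implied current" latencies are just the claimed speedup factors multiplied back out, not independent measurements:

```python
# All inputs come from the figures quoted above.
FLOOD_LATENCY_S = 0.020      # claimed answer-from-disk latency: 20 ms
DNF_SPEEDUP = 230            # claimed improvement factor for DNFs
SUCCESS_SPEEDUP = 6224       # claimed improvement factor for search successes

# What today's latencies must be for those factors to hold.
implied_current_dnf_s = FLOOD_LATENCY_S * DNF_SPEEDUP          # ~4.6 s
implied_current_success_s = FLOOD_LATENCY_S * SUCCESS_SPEEDUP  # ~124.5 s

# "224 messages out of 225" eliminated:
eliminated_fraction = 224 / 225                                # ~99.6%

print(f"implied current DNF latency:     {implied_current_dnf_s:.1f} s")
print(f"implied current success latency: {implied_current_success_s:.1f} s")
print(f"messages eliminated:             {eliminated_fraction:.1%}")
```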
For example, what if we had a table of 50000 entries, inserted routing answers into it, and when it filled, split it in half, with half the nodes going to serve one side and the other half serving our side? If a request comes in for our half: presto, instant answer. If it's for the other side, we just forward the request on to one of the nodes serving that half. For n splittings, we refine it just a touch: each table is either a table of 25k-50k individual keys, a table of 25k-50k key ranges, or the top table of 2-50k key ranges with bounds from 0x0 to 0xFFF... As we split, a few of the nodes form a parent node with a two-element table; each element has references to all the nodes that serve the key range indicated by that element. Inbound requests go out to the closest match in our routing cache.

If all of freenet had only 50k keys, then the request goes out to the node that has (or had) the data. If freenet has 50000^2 keys or fewer, then 1/50,000 of the keys take 0 hops to a solution, and the rest take at most 2 hops. Likewise, 50000^5 keys or fewer would take at most 5 hops, and the total key space would take at most 10.25 hops. Freenet currently takes 14-15 hops on average on requests that actually find data in my store, and the current size of freenet is probably around 2 hops in my scheme. That's around 7x better. My scheme can be proven to work; the current freenet scheme, well, I don't recall seeing that proof.

Now, to add security, all nodes vote in such a way as to determine how the splitting is done and who gets to be a parent. For example, for n nodes, take the XOR of all the DSA values from all n nodes. Call this the voting key. XOR your node's DSA value with this voting key; call that your ID. Sort the n ID values from all n nodes; the first half form one group that takes the first half of the key space, and the second half takes what remains. Want to grab a certain part of the key space? Nearly impossible.
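The hop bound and the voting-based split can be sketched in a few lines. The 160-bit total key space is my assumption (SHA-1-sized keys, which is what makes the 10.25-hop figure come out); the node IDs here are plain integers standing in for DSA values:

```python
import math

TABLE_SIZE = 50_000  # entries per routing table, as proposed above

def max_hops(total_keys: int) -> float:
    """Depth of a routing tree whose tables each cover TABLE_SIZE ranges."""
    return math.log(total_keys, TABLE_SIZE)

print(max_hops(TABLE_SIZE ** 2))          # 2 hops
print(max_hops(TABLE_SIZE ** 5))          # 5 hops
print(round(max_hops(2 ** 160), 2))       # ~10.25 hops for a 160-bit key space

def voting_split(node_ids: list[int]) -> tuple[list[int], list[int]]:
    """XOR every node's value together to get the voting key, XOR each node
    with it, sort the results, and hand the first half of the sorted nodes
    the lower half of the key space."""
    voting_key = 0
    for n in node_ids:
        voting_key ^= n
    ranked = sorted(node_ids, key=lambda n: n ^ voting_key)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]
```

Because the voting key depends on every participant's value, no single node can steer its own ID toward a chosen key range.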
Also, with bottom-up growth the routing tree should be fairly balanced, and we can re-balance it by informing our parent once a day how many nodes are below us. The parent can then pull nodes up and put them into a sibling if our node gets too small. Also, a split doesn't have to put exactly 25k keys on each side; it can instead be done so that the sum of the keys under each half is as near equal as possible.

Ah, what about load balancing? Beefy DSAs can sign up to serve many tables. For example, when a node splits, all nodes that can handle 2x the traffic can serve both sides. Maybe we dynamically grow the table based on the maximum load a large enough set of nodes can handle, and don't split until there would be too few nodes remaining to handle the data. Maybe the large, stable nodes migrate up the food chain to serve as parents, and as those nodes come to dominate the upper levels they dynamically resize to be larger, because they are beefy. The rest of the nodes would just serve as data pumps, and any bandwidth not consumed by routing on the nodes that do routing would be used to pump data. Routing should be the first priority.

Now, why do this? Because it has a provable upper bound on the total upstream cost across all nodes of around 800 bytes for a 4-hop search, and a provable ability to find the data. Also, at 1 mbps, a beefy node could serve 500 requests a second out of RAM. For a client node doing 2000 requests a day (my second-highest daily peak in 30 days), that works out to 21,600 clients per beefy node; on a gige connection, 21 million clients. Also, an insert would take the time needed to insert at an HTL reflecting how much redundancy you want and how much you want to shield your identity. A default HTL of 2-5 might be reasonable.
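The capacity figures above check out arithmetically. The per-request byte cost at the serving node is my inference from the stated link speed and request rate (it lands in the same ballpark as 800 bytes spread over a 4-hop path); the rest are the post's own numbers:

```python
LINK_BPS = 1_000_000        # 1 mbps beefy node
REQS_PER_SEC = 500          # the figure above for serving out of RAM
DAILY_PER_CLIENT = 2000     # requests/day for a client node

# Implied upstream cost per request at the serving node:
implied_bytes_per_req = (LINK_BPS // 8) // REQS_PER_SEC   # 250 bytes
print(implied_bytes_per_req)

# Clients one beefy node can carry at that rate:
clients_per_node = REQS_PER_SEC * 86_400 // DAILY_PER_CLIENT
print(clients_per_node)                                   # 21600

# A gige link is ~1000x the bandwidth:
gige_clients = clients_per_node * 1000
print(gige_clients)                                       # "21 million clients"
```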
For daily inserts that don't need much redundancy and have no anonymity concerns, insert at HTL 0, put the answer in the routing engine, and around 200 ms later it is live to all of freenet with probability 1. Anonymity can be built on top of the routing engine on the client side: insert at, say, HTL 3-7, then delete the key out of your own store and ask the other nodes to randomly delay registering with the routing engine for 10-20 seconds. The routing engine doesn't know who had the data first. Other nodes don't really know who had it first either, but they can point to someone that had it before them (same as freenet currently). Ask a random other node for the key instead of the routing engine some of the time...

_______________________________________________
Devl mailing list
[EMAIL PROTECTED]
http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/devl
