Our MHK tests have reported back in. We insert one key 3 times, and insert 3 different keys once each. We then fetch all of them after a week. If the one key succeeds, that is a success for repeatedly inserting a key. If any of the 3 keys succeeds, that is a success for MHKs (MHKs are a proposed feature: we encrypt the top block in 3 different ways and insert all 3 blocks, so that if we can fetch any one of them we can regenerate the others and deliver the data). We can also calculate the single block success ratio, and compare that to our other data on weekly success stats.
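Purely to illustrate the idea, here is a toy probability sketch in Python (not the tester's code). It assumes the three differently-encrypted copies fail independently, which they probably don't in practice, so treat the MHK figure as an upper bound:

# Toy model of the MHK idea: with a single-block one-week success
# probability p, an MHK is fetchable if ANY of its 3 differently-encrypted
# copies is. Assumes (optimistically) that the 3 copies fail independently.

def mhk_success(p_block):
    """P(at least one of the 3 MHK variants is still fetchable)."""
    return 1.0 - (1.0 - p_block) ** 3

p = 0.765  # observed one-week single-block success rate (see the stats below)
print(f"single block:  {p:.3f}")                 # 0.765
print(f"MHK, any of 3: {mhk_success(p):.3f}")    # ~0.987, upper bound under independence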
Our long-term push/pull stats. These involve pushing a 64KB splitfile headed by a KSK, and fetching it after 2^N-1 days. We insert N+1 blocks each day, so each test insert is only fetched once, after the given period.

Success rate for 0 days:   1.0   (131 samples)
Success rate for 1 day:    0.884 (129 samples)
Success rate for 3 days:   0.820 (128 samples)
Success rate for 7 days:   0.784 (125 samples)
Success rate for 15 days:  0.675 (117 samples)
Success rate for 31 days:  0.475 (101 samples)
Success rate for 63 days:  0.101 (79 samples)
Success rate for 127 days: 0.0   (24 samples)

We also have more accurate stats based on single blocks with no retrying (rather than the usual 3), which are rerequested after 0, 1, 3, 7, 15, ... days:

1 day:   1920 fetches, 1471 successes = 76.6%, in 32897ms
3 days:  1856 fetches, 1238 successes = 66.7%, in 32387ms
7 days:  1728 fetches, 1040 successes = 60.2%, in 36268ms
15 days: 1504 fetches,  708 successes = 47.1%, in 40449ms
31 days:  992 fetches,  365 successes = 36.8%, in 44800ms

So generally, we have a serious problem with data retention.

But now let's look at the MHK data. This also involves 64KB random splitfiles, but they are headed by CHKs. Everything I know suggests that KSKs will survive better than CHKs, but we can test this theory if it is vital.

Total attempts where insert succeeded and fetch executed: 54
Single keys succeeded: 52
MHKs succeeded: 50
Single key individual fetches: 162
Single key individual fetches succeeded: 124
Success rate for individual keys (from MHK inserts): 0.765
Success rate for the single key triple inserted: 0.963
Success rate for the MHK (success = any of the 3 different keys worked): 0.926

So the 1 week success rate for a single block is 76.5%, which is close to the 78% reported by the long-term test with more blocks. The success rate for an MHK is 92.6%, which is plausible given the single block success rate. But look at the success rate for the single key triple inserted: 96.3%. Statistically there is probably no difference between this and the MHK rating, but the interesting thing is that it is so high!
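For anyone who wants to check the arithmetic, a trivial sketch recomputing the three reported rates from the raw tester counts above:

# Recompute the reported MHK-tester success rates from the raw counts.
attempts = 54                 # inserts that succeeded and were fetched a week later
single_key_successes = 52     # the key that was inserted 3 times was fetchable
mhk_successes = 50            # at least one of the 3 MHK variants was fetchable
individual_fetches = 162      # 3 separately-keyed blocks per attempt
individual_successes = 124

print(f"individual keys:            {individual_successes / individual_fetches:.3f}")  # 0.765
print(f"single key, inserted 3x:    {single_key_successes / attempts:.3f}")            # 0.963
print(f"MHK (any of 3 keys worked): {mhk_successes / attempts:.3f}")                   # 0.926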
***Why does a single insert repeated 3 times persist so much better than a single insert?***

Theories:

1. It gets routed differently because of per-node failure tables. This is WRONG: inserts don't go into per-node failure tables if they succeed; in fact inserts don't update the failure table at all, although it is taken into account when routing.

2. The problem is 3637/3639: we don't store if the HTL is still too high to cache (a security measure introduced some months back), and sometimes, because of random HTL decrementing, this happens to still be the case when we reach the nodes which would be sinks for the data, and would store it in the store (not the cache). This is also WRONG, at least in its unmodified form, because if we follow the same route we will make the same caching decisions. However, if for some reason we follow a different route, it may well be important.

3. It didn't go the same route because of temporary backoff or preemptive rejection, so the 3 inserts have a much better chance of hitting the sink nodes, because at least one path to the sink nodes will have a low enough HTL to store it (3637/3639).

4. It didn't go the same route because of temporary backoff or preemptive rejection, so the 3 inserts have a much better chance of hitting the sink nodes, because we have severe misrouting and frequently miss the sink nodes altogether.

5. It didn't go the same route because of temporary backoff or preemptive rejection, and the 3 routes are sufficiently different as to cover a lot more nodes: either we cache on more nodes without storing on many more, and we fetch it from the cache in future because of future misrouting; or we store on more nodes which are somehow misplaced. Either way the network is rather chaotic.

So we need to determine which theory is correct. Clearly not the first two. If routing is working well and misrouting is not particularly important, then #3. If misrouting is a common problem that prevents us from reaching the sink nodes, then #4 or #5. Either way we need to fix 3637/3639 as a matter of urgency, and see how this affects the test results.

If there remains a dramatic gap between the performance of an insert done once and one done 3 times, then we should consider some more radical options:

- Increasing the max HTL on inserts but not on requests.
- Queueing inserts until they can get a reasonably optimal next hop.

If the decay is consistent - which it may not be, but it should be close - and if the actual rate is 90%, which IMHO is plausible although we don't have many samples here, then this makes a huge difference to fetchability (a sketch reproducing the half-life figures is at the end of this mail):

Block reachability after 1 week    Half-life
0.765                              2-3 weeks
0.9                                6-7 weeks
0.92                               8-9 weeks
0.96                               17 weeks

We will need to increase our rate of gathering data in order to be sure reasonably quickly, e.g. by making the MHK tester insert many more blocks per day in the same pattern. We may also need to dump some of the other routing/storage related changes that are already in trunk so that we are not measuring too many changes at once.

In the longer term, inserting top or lone blocks multiple times to improve persistence is definitely a good idea. But IMHO if we investigate this properly, we may be able to improve persistence dramatically for *all* blocks, and therefore for content in general.
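A minimal sketch of where the half-life figures come from, assuming a constant weekly retention rate p (so reachability after w weeks is p^w, and the half-life is ln(0.5)/ln(p) weeks):

import math

# Half-life of a block given a constant weekly retention rate p:
# reachability after w weeks is p**w, so reachability drops to 0.5
# after ln(0.5) / ln(p) weeks.
def half_life_weeks(p):
    return math.log(0.5) / math.log(p)

for p in (0.765, 0.9, 0.92, 0.96):
    print(f"1-week reachability {p:.3f} -> half-life {half_life_weeks(p):.1f} weeks")

# Output (consistent with the ranges in the table above):
#   1-week reachability 0.765 -> half-life 2.6 weeks
#   1-week reachability 0.900 -> half-life 6.6 weeks
#   1-week reachability 0.920 -> half-life 8.3 weeks
#   1-week reachability 0.960 -> half-life 17.0 weeks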
