Right now we divide files up into segments of 128 blocks. Each block is 32kB. We add another 128 "check" blocks, so that any 128 of the 256 can be used to reconstruct the original data and regenerate all the blocks. Thus the odds of a full-size segment not being fetchable are quite low, even if the chance that any given block is not fetchable is relatively high.
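To put rough numbers on that (an illustrative sketch, not Freenet code; the 70% per-block fetch probability is just an assumption): if each block is independently fetchable with probability q, and any n of the n+k blocks are enough to decode, the segment failure probability is the lower binomial tail below n.

from math import comb

def segment_failure_prob(n_data, n_check, q):
    # P(fewer than n_data of the n_data + n_check blocks are fetchable),
    # i.e. the segment cannot be decoded.
    total = n_data + n_check
    return sum(comb(total, i) * q**i * (1 - q)**(total - i)
               for i in range(n_data))

# With, say, a 70% chance of fetching any given block:
print(segment_failure_prob(128, 128, 0.7))  # vanishingly small
print(segment_failure_prob(2, 2, 0.7))      # about 8%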
However, this is not true for a smaller segment. For example, a DVD ISO might end in a 2-block segment: 2 data blocks plus 2 check blocks. If it can't fetch 2 of those 4 blocks, the download fails! The probability of losing 3 of the 4 blocks in the last segment is much higher than that of losing 129 of the 256 blocks in a full segment. Solutions?

- Split the last two segments differently: instead of 128/128 + 2/2, do say 65/65 + 65/65.
- Use a 16-bit code (4x slower) for the last segment: 130/130.
- More check blocks: see below re small splitfiles.

Other issues with the current code:

Are 128-block segments big enough? We could increase to 1024 without too much impact, once db4o is merged and provided that we use blob-based tempfiles; memory usage would be approx 1024*2*4K, i.e. around 8MB.

What about a small splitfile which only has 2 data blocks and 2 check blocks in a single segment? If this is multi-level metadata then we might hope it is more popular ... or perhaps even if it's a small file ... but I'm not convinced we should rely on this. Maybe we should insert more check blocks for small splitfiles? How many check blocks would we need to insert for a reasonable success rate? Vive can fill in the maths, it's pretty trivial but it's been a long time since I've done any combinatorics!
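Roughly, the maths looks like this (same model as above; the 70% per-block fetch probability and the 99% target are illustrative numbers, not proposals): find the smallest k such that at least n of the n+k blocks are fetchable with the target probability.

from math import comb

def success_prob(n_data, n_check, q):
    # P(at least n_data of the n_data + n_check blocks are fetchable).
    total = n_data + n_check
    return sum(comb(total, i) * q**i * (1 - q)**(total - i)
               for i in range(n_data, total + 1))

def check_blocks_needed(n_data, q, target=0.99, max_check=512):
    # Smallest number of check blocks giving at least `target` decode probability.
    for k in range(max_check + 1):
        if success_prob(n_data, k, q) >= target:
            return k
    return None

# With a 70% chance of fetching any given block and a 99% target:
print(check_blocks_needed(2, 0.7))    # 5 check blocks for 2 data blocks (2.5x)
print(check_blocks_needed(128, 0.7))  # a much smaller ratio of check to data blocks

So a 2-block segment needs proportionally far more redundancy than a full segment to reach the same reliability, which is the argument for extra check blocks on small splitfiles.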
