Right now we divide files into segments of 128 blocks. Each block is 32kB.
We add another 128 "check" blocks, so that any 128 of the 256 can be used to
reconstruct the original data and all the other blocks. Thus the odds of a
full-size segment not being fetchable are quite low, even if the chance of a
randomly chosen block not being fetchable is relatively high.
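
To make "quite low" concrete, here is a minimal sketch (Python, purely
illustrative; independence between blocks and the per-block retrievability
p = 0.7 are assumptions, not measured properties). A 128/128 segment fails
only if fewer than 128 of its 256 blocks can be fetched:

    # Illustrative: probability that a segment with `data` data blocks and
    # `check` check blocks cannot be decoded, assuming each block is
    # independently retrievable with probability p.
    from math import comb

    def segment_failure_probability(data, check, p):
        n = data + check
        # Decoding fails iff fewer than `data` of the n blocks are fetched.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(data))

    # Full-size segment, 128 data + 128 check, assumed p = 0.7:
    print(segment_failure_probability(128, 128, 0.7))  # effectively zero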

However, this is not true for a smaller segment. For example, a DVD ISO
might end in a 2-block segment. In that case, if we can't fetch at least 2
of its 4 blocks (2 data + 2 check), the download fails! The probability of
3 of those 4 blocks being lost is much higher than that of 129 blocks being
lost in a full-size 256-block segment.
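
To put numbers on it, with the same assumed p = 0.7 per block: the last
segment fails if 0 or 1 of its 4 blocks can be fetched, i.e.

    0.3^4 + 4*0.7*0.3^3 = 0.0081 + 0.0756 ≈ 8.4%

whereas a full 128/128 segment fails with probability well below one in a
billion.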

Solutions?
- Split the last two segments differently: instead of 128/128 + 2/2, do, say,
65/65 and 65/65 (see the sketch after this list).
- Use a 16-bit code (4x slower) for the last segment: 130/130.
- More check blocks: See below re small splitfiles.
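
A rough sketch of the first option (Python, purely illustrative; the names
and the exact rule are made up): emit full 128-block segments while more
than two segments' worth of data remains, then split whatever is left across
the final one or two segments so that neither is tiny. Each segment gets an
equal number of check blocks as usual.

    # Illustrative only: choose per-segment data-block counts so that the
    # final segment is never tiny. Names and the exact rule are made up.
    MAX_SEGMENT_DATA_BLOCKS = 128

    def segment_sizes(total_data_blocks):
        """Return a list of data-block counts, one per segment."""
        sizes = []
        remaining = total_data_blocks
        while remaining > 0:
            if remaining <= MAX_SEGMENT_DATA_BLOCKS:
                # Fits in a single (possibly final) segment.
                sizes.append(remaining)
                remaining = 0
            elif remaining > 2 * MAX_SEGMENT_DATA_BLOCKS:
                # Plenty left: emit a standard full segment.
                sizes.append(MAX_SEGMENT_DATA_BLOCKS)
                remaining -= MAX_SEGMENT_DATA_BLOCKS
            else:
                # 129..256 blocks left: split them roughly evenly so the
                # final segment isn't tiny, e.g. 130 -> 65 + 65.
                first = (remaining + 1) // 2
                sizes.append(first)
                sizes.append(remaining - first)
                remaining = 0
        return sizes

    # A file of 8*128 + 2 = 1026 data blocks:
    print(segment_sizes(1026))  # seven 128s, then 65 and 65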

Other issues with the current code:

Are 128-block segments big enough? We could increase them to 1024 blocks
without too much impact, once db4o is merged and provided that we use
blob-based tempfiles; memory usage would be approx 1024*2*4K, i.e. around 8MB.

What about a small splitfile which only has 2 data blocks and 2 check blocks
in a single segment? If this is multi-level metadata then we might hope it is
more popular ... or perhaps even if it's a small file ... but I'm not
convinced we should rely on this. Maybe we should insert more check blocks
for small splitfiles? How many check blocks would we need to insert for a
reasonable success rate?

Vive can fill in the maths; it's pretty trivial, but it's been a long time
since I've done any combinatorics!
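
In the meantime, a rough illustrative sketch of that calculation (Python;
the per-block retrievability p = 0.7 and the 99.9% target are assumed
figures, not measured ones):

    # Illustrative: smallest number of check blocks such that a segment
    # with `data` data blocks decodes with probability >= target, assuming
    # each block is independently retrievable with probability p.
    from math import comb

    def decode_probability(data, check, p):
        n = data + check
        # Decoding succeeds iff at least `data` of the n blocks are fetched.
        return sum(comb(n, i) * p**i * (1 - p)**(n - i)
                   for i in range(data, n + 1))

    def check_blocks_needed(data, p, target):
        check = 0
        while decode_probability(data, check, p) < target:
            check += 1
        return check

    # 2 data blocks, assumed p = 0.7, assumed target of 99.9%:
    print(check_blocks_needed(2, 0.7, 0.999))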
