On Fri, Sep 30, 2005 at 03:06:50PM -0500, Justin Chapweske wrote:
> You understand the problem space well. One thing to think about on the
> stripe-wise retrieval/parallel decoding is to compare your decoding rate
> with your receive rate. So, if you're using K=128 that means that
> decoding can begin at around 127/128 = 99.2% completion. If you're able
> to decode an entire block in the time that it takes to receive a new
> packet/symbol at your estimated reception rate of 256 Kbps, then you
> should be able to parallelize most of the decoding. If you're finding
> that the decoding can't keep up with the reception, then you can
> introduce some slight preferencing into the system to retrieve "super
> blocks" of data. So you split the blocks into two (or more) groups and
> prioritize the first group first, then the second group second. In this
> manner, with two superblocks or groups, you'll be able to start decoding
> the first half at 50% completion.
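A minimal sketch of the "super block" preferencing Justin describes above,
in hypothetical Java (not the Onion or Freenet API). It only shows the
fetch ordering, and it assumes each group is encoded as its own
independently decodable FEC unit; the class and method names are made up
for illustration:

    import java.util.*;

    public class SuperBlockOrder {
        /**
         * Block indices in fetch order: all of group 0 first, then group 1,
         * and so on. Within a group the order is still randomized, so the
         * usual random-fetch behaviour is kept, but if each group is its
         * own FEC unit the first group becomes decodable at roughly
         * 1/groups of total completion.
         */
        static List<Integer> fetchOrder(int totalBlocks, int groups, Random rng) {
            List<Integer> order = new ArrayList<Integer>();
            int perGroup = (totalBlocks + groups - 1) / groups;
            for (int g = 0; g < groups; g++) {
                List<Integer> group = new ArrayList<Integer>();
                int end = Math.min((g + 1) * perGroup, totalBlocks);
                for (int i = g * perGroup; i < end; i++)
                    group.add(i);
                Collections.shuffle(group, rng);
                order.addAll(group);
            }
            return order;
        }

        public static void main(String[] args) {
            // e.g. a 128+64 block segment split into two super blocks:
            System.out.println(fetchOrder(192, 2, new Random()));
        }
    }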
I'm not sure what you mean here... If I'm able to decode a segment in the
time it takes to fetch the next segment, then I don't need to block.
Sorry, might be a terminology glitch:
- File = the resulting data. Has an authenticator.
- Block = 32kB, unit into which a file is divided (including check
  blocks). Has an authenticator.
- Packet = 1kB (say), unit into which blocks may be striped to save
  memory. Does not have an authenticator.

> At the end of the day LDPC codes are a bunch of marketing BS. For most
> applications, network and storage I/O is the bottleneck, not the CPU.
> As long as hard disk seek latency exists, they will perform horribly for
> large files.

Well, there are advantages... specifically, the ability to use large
segments (say 128MB) with small blocks (say 32kB), with lowish memory
usage as long as you stripe it.

> In our high speed deployments, such as digital cinema where we're moving
> 200GB+ files, it's always the disk that's the bottleneck, never the
> actual codes.

Can't you keep the entire segment in RAM for such applications? Or are
the blocks humongous?

> Have fun!
>
> -Justin
>
> > Now, let's assume that there is a good patent-free LDPC code we could
> > use, so we still have a choice:
> >
> > - LDPC uses a randomized structure which exhibits extremely poor
> >   spatial locality of reference. The result is that although you COULD
> >   encode a 600MB file as a single segment, it would be *extremely*
> >   slow to decode unless it and its check blocks fitted entirely into
> >   RAM. And even then, it will be limited by your RAM speed, whereas
> >   something with small segments will often fit in L2 cache.
> > - According to the docs on the Onion codec (which is MDS), it is not
> >   possible to even start to decode a segment until you have K of N
> >   blocks. FECFile offers some level of progressive decoding but is
> >   limited by this fact - if we fetch data blocks randomly from the
> >   whole file, then there won't be any parallel decoding happening at
> >   all until we have a complete segment, which will likely be after we
> >   have a *LOT* of blocks. The result of this is that FECFile is not a
> >   great convenience; it may be easier to implement parallel decoding
> >   of one segment while we fetch the blocks of the next ourselves.
> > - With either codec, blocks are internally "striped"; this is why we
> >   can decode a 192MB (128+64) segment in 24MB of RAM. I don't know how
> >   small the stripes/packets can be; 1kB sounds plausible given the
> >   likely typical applications of these codes however. It is likely
> >   that the smaller the stripes/packets, the greater the setup cost...
> > - If we assume zero setup cost, it should be possible to use quite
> >   large segments and still fit the stripes being decoded in a smallish
> >   amount of RAM. However, reading the data in the first place (and
> >   then writing it after the decode) means a lot of randomized disk
> >   access. Well okay, it doesn't have to be randomized; you can keep it
> >   on disk in contiguous blocks of a file, so you fetch blocks 1, 33,
> >   65, 97 for the first stripe decode...
> > - So we could have segments of up to:
> >     (max memory usage) * (block size / packet size)
> >   (size after decoding; includes all data and check blocks)
> > - So if we have say 16MB of RAM to dedicate (we currently use 24, but
> >   we will need some more for temporary stuff etc), we can then have
> >   segments of 512MB (341MB plaintext). Which would be adequate for our
> >   needs. If we want segments of 256MB plaintext, we would only need
> >   24MB of RAM (plus temp space); if we go for 128MB segments, we can
> >   get away with 12MB of RAM plus temp space, a significant improvement
> >   on current performance (RAM is an issue...).
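A quick worked version of the memory formula in the quoted list above, in
hypothetical Java. It assumes the 32kB block, 1kB packet and 128/192 code
figures used in this thread, plus the zero-setup-cost striping mentioned
above, and reproduces the 16MB -> 512MB (341MB plaintext) case:

    public class SegmentBudget {
        public static void main(String[] args) {
            long blockSize  = 32 * 1024;         // bytes per block
            long packetSize = 1024;              // bytes per stripe/packet
            long ramBudget  = 16L * 1024 * 1024; // RAM available for decoding

            // One stripe holds one packet from every block of the segment,
            // so the largest segment (data + check blocks) decodable within
            // the RAM budget is:
            long maxSegment = ramBudget * (blockSize / packetSize);

            // With a 128/192 code, 128 of every 192 blocks are plaintext:
            long maxPlaintext = maxSegment * 128 / 192;

            System.out.println("max segment   = " + (maxSegment >> 20) + "MB");   // 512MB
            System.out.println("max plaintext = " + (maxPlaintext >> 20) + "MB"); // 341MB
        }
    }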
> > A few issues Justin mentioned that are peculiar to Freenet:
> > - It would be preferable not to have to expand the file enormously -
> >   e.g. Justin suggested a 32/256 code for reliability. So far we have
> >   used 50% (128/192) and this has appeared to be adequate.
> > - It is preferable to have segments as large as possible, because
> >   while we can rerequest blocks we won't necessarily be able to get
> >   them, and while healing helps a lot, it only helps the download if
> >   somebody else has recently downloaded the file.
> > - Block integrity is dealt with at a lower level; both the splitfile
> >   and the blocks in the splitfile have separate hashes.
> >
> > BTW if you get this, Justin, thanks a lot for your previous advice -
> > am I right about striping?
> >
> > On Thu, Sep 29, 2005 at 10:59:45PM -0400, Ken Snider wrote:
> > > As discussed, I was able to get some information from Justin
> > > Chapweske, one of the chief engineers at openCOLA, the prior owner
> > > of the FEC codebase, before Onion Networks acquired it.
> > >
> > > He had the following to say in response to your queries:
> > >
> > > ===
> > >
> > > The #1 most important thing is to make sure that the native/JNI FEC
> > > codes are being used. They are many times faster than the pure Java
> > > ones.
> >
> > Indeed!
> >
> > > Secondly, doing progressive decoding using FecFile will save you a
> > > lot of headaches, because as long as your decoding speed is as fast
> > > as your reception speed, then you'll experience no delay due to
> > > decoding.
> > >
> > > As far as "using less redundancy" as Matthew suggests, I assume
> > > that he's talking about decreasing the "N" value. In truth, the
> > > value of N has no impact on performance unless you're using an
> > > N > 256. If N is larger than 256, then you take a 4x speed reduction
> > > as 16-bit codes need to be used rather than 8-bit codes. The #1
> > > impact on performance is the value of K, so if you're having
> > > problems with K=32, then you can try K=16, but that means that
> > > you'll have to split your files into even more pieces.
> > >
> > > The other thing to look at is your I/O performance. If you're doing
> > > a lot of random I/O, your files may become highly fragmented and may
> > > actually end up being the bottleneck in your system. We've
> > > successfully used FEC for applications well over 100 Mbps, so it is
> > > certainly possible to do high speed data transfers, but you have to
> > > pay very close attention to your I/O patterns.
> > >
> > > Besides that, you could try LDPC codes, but their non-deterministic
> > > nature will absolutely kill you on I/O performance, so even though
> > > they'll give you a boost on the CPU side, you'll quickly hit a wall
> > > with respect to I/O performance.
> > >
> > > Hope this helps.
> > >
> > > -Justin
> > >
> > > ===
> > >
> > > --
> > > Ken Snider
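A small, hypothetical Java illustration of the rule of thumb Justin gives
above (N <= 256 allows 8-bit codes; a larger N forces 16-bit codes at
roughly a 4x cost, and K dominates decode cost). The two parameter sets
checked are the ones mentioned in this thread, and both stay within 8-bit
codes; the class and method names are made up for illustration:

    public class CodeParams {
        /** 8-bit (GF(2^8)) codes are usable only while N <= 256. */
        static boolean fitsEightBit(int n) {
            return n <= 256;
        }

        public static void main(String[] args) {
            // {K, N}: Freenet's current 128/192 and Justin's suggested 32/256.
            int[][] candidates = { { 128, 192 }, { 32, 256 } };
            for (int[] c : candidates) {
                int k = c[0], n = c[1];
                System.out.println("K=" + k + ", N=" + n + ": "
                    + (fitsEightBit(n)
                        ? "8-bit codes are fine"
                        : "needs 16-bit codes (~4x slower)"));
            }
        }
    }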
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
