Ken: Yeah, I've talked to Justin before too re FEC.

My understanding of the situation is:
- There are MDS codes and LDPC codes.
- MDS: You can decode an entire segment when you have ANY K of N blocks.
- LDPC: You can decode a segment once you have roughly K*1.055 of the N
  blocks.
- So MDS is slightly more space-efficient.
- On the other hand, MDS has quadratic (K*M) decoding time, whereas LDPC
  is linear or close to linear. But MDS *can* decode fast (100Mbps or
  so), as long as you use small segments (say K=32); there is a
  back-of-envelope sketch of this tradeoff after the list.
- Most freenet nodes have an uplink of maybe 256kbps. In the long term,
  we expect splitfiles to download at around this rate, since everyone
  will be leeching all the time. :) Even if they are not, let's call it
  a couple of megabits. This is way below the range the Onion codecs are
  capable of with small segments. As long as we can do the decode mostly
  in parallel with the fetch, there is no reason we can't use slower
  codes with larger segments. The reason decode speed is annoying right
  now is simply that we don't do it in parallel (we only decode after
  fetching each whole segment).
- Almost all modern LDPC algorithms are heavily covered by patents owned
  by Digital Fountain and others, whereas MDS is free as far as I or
  Justin know. This alone is enough to make the decision, unfortunately,
  as some of the patents involved are by no means the sort of trivial
  "webshop" patents that wouldn't stand up in court.

Now, let's assume that there is a good patent-free LDPC code we could
use, so we still have a choice:

- LDPC uses a randomized structure which exhibits extremely poor spatial
  locality of reference. The result is that although you COULD encode a
  600MB file as a single segment, it would be *extremely* slow to decode
  unless it and its check blocks fit entirely into RAM. And even then,
  it would be limited by your RAM speed, whereas something with small
  segments will often fit in L2 cache.
- According to the docs on the Onion codec (which is MDS), it is not
  possible to even start decoding a segment until you have K of the N
  blocks. FecFile offers some level of progressive decoding but is
  limited by this fact - if we fetch data blocks randomly from across
  the whole file, then no parallel decoding will happen at all until we
  have a complete segment, which will likely only be after we have a
  *LOT* of blocks. The result of this is that FecFile is not a great
  convenience; it may be easier to implement parallel decoding of one
  segment while we fetch the blocks of the next ourselves.
- With either codec, blocks are internally "striped"; this is why we can
  decode a 192MB (128+64) segment in 24MB of RAM. I don't know how small
  the stripes/packets can be; 1kB sounds plausible, however, given the
  likely typical applications of these codes. It is likely that the
  smaller the stripes/packets, the greater the setup cost...
- If we assume zero setup cost, it should be possible to use quite large
  segments and still fit the stripes being decoded into a smallish
  amount of RAM. However, reading the data in the first place (and then
  writing it out after the decode) means a lot of randomized disk
  access. Well, okay, it doesn't have to be randomized; you can keep the
  blocks on disk contiguously in one file, so you fetch blocks 1, 33,
  65, 97 for the first stripe decode...
- So we could have segments of up to:
    (max memory usage) * (block size / packet size)
  (size after decoding; includes all data and check blocks)
- So if we have, say, 16MB of RAM to dedicate (we currently use 24, but
  we will need some more for temporary stuff etc.), we can then have
  segments of 512MB (341MB plaintext), which would be adequate for our
  needs. If we want segments of 256MB plaintext, we would only need 24MB
  of RAM (plus temp space); if we go for 128MB segments, we can get away
  with 12MB of RAM plus temp space, a significant improvement on current
  performance (RAM is an issue...). There is a sketch of the
  stripe-at-a-time decode after this list.
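
To make the striping and RAM arithmetic above concrete, here is a sketch
of what a stripe-at-a-time decode could look like. It assumes a
segment's blocks are stored back to back in one temporary file and that
the codec can decode one stripe (packet p of every block) on its own;
StripeDecoder and decodeStripe() are made-up placeholders, not the Onion
FEC API:

    import java.io.RandomAccessFile;

    public class StripedSegmentDecode {

        // Max decodable segment size (data + check blocks) for a given RAM
        // budget: (max memory usage) * (block size / packet size).
        static long maxSegmentSize(long maxMemoryBytes, int blockSize, int packetSize) {
            return maxMemoryBytes * (blockSize / packetSize);
        }

        // Hypothetical stand-in for the codec's per-stripe decode call.
        interface StripeDecoder {
            byte[] decodeStripe(byte[][] packets);
        }

        // Decode a segment stripe by stripe; RAM use stays bounded by
        // numBlocks * packetSize no matter how big the segment is.
        static void decodeSegment(RandomAccessFile blocksFile, RandomAccessFile out,
                                  int numBlocks, int blockSize, int packetSize,
                                  StripeDecoder codec) throws Exception {
            int packetsPerBlock = blockSize / packetSize;
            byte[][] stripe = new byte[numBlocks][packetSize];
            for (int p = 0; p < packetsPerBlock; p++) {
                // Strided reads: packet p of block b lives at
                // b*blockSize + p*packetSize, so the access pattern is
                // regular rather than truly random.
                for (int b = 0; b < numBlocks; b++) {
                    blocksFile.seek((long) b * blockSize + (long) p * packetSize);
                    blocksFile.readFully(stripe[b]);
                }
                out.write(codec.decodeStripe(stripe));
            }
        }
    }

Plugging a 16MB budget and a block-to-packet ratio of 32 into
maxSegmentSize() reproduces the 512MB figure above; the real ratio
obviously depends on what packet sizes the codec actually supports.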

A few issues Justin mentioned that are peculiar to freenet:
- It would be preferable not to have to expand the file enormously -
  e.g. Justin suggested a 32/256 code for reliability. So far we have
  used 50% redundancy (128/192) and this has appeared to be adequate
  (the expansion factors are worked out just after this list).
- It is preferable to have segments as large as possible, because while
  we can rerequest blocks we won't necessarily be able to get them, and
  while healing helps a lot, it only helps the download if somebody else
  has recently downloaded the file.
- Block integrity is dealt with at a lower level; both the splitfile and
  the blocks in the splitfile have separate hashes.
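
To put numbers on the expansion point above: a 32/256 code means
inserting N/K = 256/32 = 8 times the plaintext size, whereas the current
128/192 setup inserts 192/128 = 1.5 times the plaintext size, i.e. the
50% redundancy mentioned.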

BTW if you get this, Justin, thanks a lot for your previous advice - am
I right about striping?

On Thu, Sep 29, 2005 at 10:59:45PM -0400, Ken Snider wrote:
> As discussed, I was able to get some information from Justin Chapweske, one 
> of the chief engineers at openCOLA, the prior owner of the FEC codebase, 
> before Onion networks acquired it.
> 
> He had the following to say in response to your queries:
> 
> ===
> 
> The #1 most important thing is to make sure that the native/JNI FEC
> codes are being used.  They are many times faster than the pure Java
> ones.

Indeed!
> 
> Secondly, doing progressive decoding using FecFile will save you a
> lot of headaches, because as long as your decoding speed is as fast
> as your reception speed, then you'll experience no delay due to
> decoding.
> 
> As far as "using less redundancy" as Matthew suggests, I assume
> that he's talking about decreasing the "N" value.  In truth, the
> value of N has no impact on performance unless you're using an N >
> 256.  If N is larger than 256, then you take a 4x speed reduction as
> 16 bit codes need to be used rather than 8 bit codes.  The #1 impact
> on performance is the value of K, so if you're having problems with
> K=32, then you can try K=16, but that means that you'll have to split
> your files into even more pieces.
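
Right - our current 128/192 setup keeps N at 192, so we stay within the
8-bit codes. For anyone wondering why the 8-bit case is so much cheaper:
with N <= 256 all the symbol arithmetic happens in GF(2^8), which can be
driven off a couple of small lookup tables that fit comfortably in
cache. A rough illustration (the textbook exp/log table construction
with the common Reed-Solomon polynomial 0x11D, not the Onion codec's
actual internals):

    // Illustrative GF(2^8) multiply via exp/log tables; not Onion FEC code.
    public class Gf256 {
        static final int[] EXP = new int[512];
        static final int[] LOG = new int[256];
        static {
            int x = 1;
            for (int i = 0; i < 255; i++) {
                EXP[i] = x;
                LOG[x] = i;
                x <<= 1;
                if ((x & 0x100) != 0) x ^= 0x11D; // reduce modulo the field polynomial
            }
            for (int i = 255; i < 512; i++) EXP[i] = EXP[i - 255]; // avoid a mod in mul()
        }
        static int mul(int a, int b) {
            return (a == 0 || b == 0) ? 0 : EXP[LOG[a] + LOG[b]];
        }
    }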
> 
> The other thing to look at is your I/O performance.  If you're doing
> a lot of random I/O, your files may become highly fragmented and may
> actually end up being the bottleneck in your system.  We've
> successfully used FEC for applications well over 100 Mbps, so it is
> certainly possible to do high speed data transfers, but you have to
> pay very close attention to your I/O patterns.
> 
> Besides that, you could try LDPC codes, but their non-deterministic
> nature will absolutely kill you on I/O performance, so even though
> they'll give you a boost on the CPU side, you'll quickly hit a wall
> with respect to I/O performance.
> 
> Hope this helps.
> 
> -Justin
> 
> ===
> 
> -- 
> Ken Snider
-- 
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.