[freenet-dev] FCP FEC Proposal -- in message body

2002-09-17 Thread Robert Bihlmeyer
Edgar Friendly writes:

> Having check blocks smaller than data blocks means that to correct one
> missing key, you're going to have to retrieve more than one check
> block.  And if the check blocks are bigger...  well, I guess you lose
> most of the advantages of splitfiles; might as well just make all the
> blocks bigger.

I'd really like to see a FEC algorithm that uses differently-sized
checkblocks, and the benefits that this algorithm actually buys us.
Until then I'm going to believe that this is just an
overgeneralisation.

> The next best solution (that still allows variable block sizes) is
> to have an optional field that gives the size of blocks when they're
> the same size, and to just have clients "deal with it" when it's not
> known.

Sounds good. Please don't deprecate BlockSize, just make it optional
(IIRC it already is at the moment). Users of Vandermonde/OnionNetworks
FEC should always insert with this field.

> I think that having multiple algorithms is going to result in either
> no one using FEC or clients implementing their own FEC that's only
> compatible with other copies of that same client.

I don't think so. If the codec is available in source in an accessible
language (C preferred), under a liberal license, it will be
incorporated into all maintained clients.

> actually, it's worth quite a lot in terms of future growth; as better
> methods of deciding how to XOR data blocks together to produce check
> blocks are discovered, they can be used without requiring all client
> authors to rewrite their code.

That's true. But I'm not envisioning changes in FEC technology as
frequent as you seem to.

-- 
Robbe
-- next part --
A non-text attachment was scrubbed...
Name: signature.ng
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: 



[freenet-dev] FCP FEC Proposal -- in message body

2002-09-17 Thread Edgar Friendly
Robert Bihlmeyer writes:

> I'd really like to see a FEC algorithm that uses differently-sized
> checkblocks, and the benefits that this algorithm actually buys us.
> Until then I'm going to believe that this is just an
> overgeneralisation.
> 
I agree with you that it's not needed.

> Sounds good. Please don't deprecate BlockSize, just make it optional
> (IIRC it already is at the moment). Users of Vandermonde/OnionNetworks
> FEC should always insert with this field.
> 
I have no plans to deprecate any fields.

> > I think that having multiple algorithms is going to result in either
> > no one using FEC or clients implementing their own FEC that's only
> > compatible with other copies of that same client.
> 
> I don't think so. If the codec is available in source in an accessible
> language (C preferred), under a liberal license, it will be
> incorporated into all maintained clients.
> 
That sounds easier than it actually is.  In very few languages are
bindings easier to use than just writing the code to decode graph-style
redundant splitfiles.[1]

> > actually, it's worth quite a lot in terms of future growth; as better
> > methods of deciding how to XOR data blocks together to produce check
> > blocks are discovered, they can be used without requiring all client
> > authors to rewrite their code.
> 
> That's true. But I'm not envisioning changes in FEC technology as
> frequent as you seem to.
> 
> -- 
> Robbe

I'm doing my own research in the area, and I can assure you there's
plenty of progress to be made.

Thelema

[1] I'd like to coin the term "rsplit" for this, instead of typing out
"redundant splitfile" as often as I do.
-- 
E-mail: thelema314 at bigfoot.com    Raabu and Piisu
GPG 1024D/36352AAB fpr:756D F615 B4F3 BFFC 02C7  84B7 D8D7 6ECE 3635 2AAB

___
devl mailing list
devl at freenetproject.org
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/devl



[freenet-dev] FCP FEC Proposal -- in message body

2002-09-14 Thread fish


On 13 Sep 2002, Edgar Friendly wrote:

> > The Metadata spec should probably list the known values for this
> > field, and give links to further documentation. A genuine registry for
> > these may be needed, but I doubt that there will be more than 3
> > algorithms used in general ...
> > 
> I think that having multiple algorithms is going to result in either
> no one using FEC or clients implementing their own FEC that's only
> compatible with other copies of that same client.  Trying to
> standardize on OnionNetworks' code is going to result in orphaning
> platforms that code doesn't run on.  (I know there's a java version of
> their code; I also know that there's a lot of platforms java doesn't
> run on)

Not to be blunt, but last I checked, and correct me if I'm wrong, freenet
is written in java, and therefore platforms on which you cannot either
compile java source, or execute java bytecode, are already orphaned.

What's more, if your platform can't compile ansi C, of which there is a
version of the library available, you're pretty fucked anyhow, and FEC is
the least of your problems.

> > > * SplitFile.Graph is currently not being used and is not implemented.
> > 
> > Delivering the blueprint for reassembly with every splitfile has its
> > appeal, but is probably more redundancy than it's worth.
> > 
> actually, it's worth quite a lot in terms of future growth; as better
> methods of deciding how to XOR data blocks together to produce check
> blocks are discovered, they can be used without requiring all client
> authors to rewrite their code.
> 
> the methods that are being developed that don't use Graph have the
> disadvantages of 
> 1) not being extensible to future advances in FEC technology
> 2) not being able to correct anywhere near as many error patterns.

You assume there is only one way to do FEC, and that the only change will
be in the matrices used.  I believe that you are very, very wrong.  I
honestly don't believe that splitfile.graph will help us a lot, unless it
happens to include java bytecode for reassembly, which you've objected to
relying on anyhow :-p

- from fish with love.





[freenet-dev] FCP FEC Proposal -- in message body

2002-09-14 Thread Edgar Friendly
fish writes:

> Not to be blunt, but last I checked, and correct me if I'm wrong, freenet
> is written in java, and therefore platforms on which you cannot either
> compile java source, or execute java bytecode, are already orphaned.
> 
Clients don't have to run on the same architectures the servers do.
And there's nothing inherent in Freenet that requires java; a node can
be written in another language (there are already projects on the way
to accomplishing that).

> What's more, if your platform can't compile ansi C, of which there is a
> version of the library available, you're pretty fucked anyhow, and FEC is
> the least of your problems.
> 
I didn't realize there was a version of the library written in ANSI C.
My bad.

> 
> You assume there is only one way to do FEC, and that the only change will
> be in the matrices used.  I believe that you are very, very wrong.  I
> honestly don't believe that splitfile.graph will help us a lot, unless it
> happens to include java bytecode for reassembly, which you've objected to
> relying on anyhow :-p
> 
>   - from fish with love.
> 
There is only one way to do linear FEC[1], and it does reduce down to
the matrix used for creating check blocks.  The coding scheme that the
OnionNetworks code uses just assumes a pre-generated matrix for doing
all encoding/decoding with.  The matrix that they use is nice because
its submatrices used for decoding are always invertible, but the
(pretty big) downside is that the matrix construction quickly becomes
non-scalable because of having to jump to larger finite fields to do
the math in.

There's plenty of room for future research in systems where you're
working over Z_2 (using binary matrices): you're not always
guaranteed invertibility, but you never have to worry about scaling
issues because you stay in the same finite field.
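
The invertibility point can be made concrete with a small sketch. It is
illustrative only: the check-block layout below is made up, each block is
described by a row of a binary matrix stored as an integer bitmask, and the
function names are hypothetical. A set of received blocks can be decoded
exactly when its rows have full rank over Z_2.

```python
def gf2_rank(rows):
    """Rank of a binary matrix over Z_2; each row is an int used as a bitmask."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue  # row was a combination of earlier pivots
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row (row addition over Z_2 is XOR).
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def can_decode(received_rows, k):
    """True if the received blocks determine all k data blocks."""
    return gf2_rank(received_rows) == k


# Three data blocks d0, d1, d2 (bit i set means "d_i is XORed in").
D0, D1, D2 = 0b001, 0b010, 0b100
# Hypothetical check blocks: c0 = d0^d1, c1 = d1^d2, c2 = d0^d2.
C0, C1, C2 = 0b011, 0b110, 0b101

print(can_decode([D0, C0, C1], 3))  # these three rows are independent
print(can_decode([C0, C1, C2], 3))  # c2 = c0 ^ c1, so this set is singular
```

Both sets contain three blocks, but only the first can be decoded; a
matrix whose square submatrices are all invertible avoids this case at
the cost of the larger fields mentioned above.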

Thelema
[1]  non-linear FEC is really rough stuff, and probably not computationally 
useful.
-- 
E-mail: thelema314 at bigfoot.com    Raabu and Piisu
GPG 1024D/36352AAB fpr:756D F615 B4F3 BFFC 02C7  84B7 D8D7 6ECE 3635 2AAB




[freenet-dev] FCP FEC Proposal -- in message body

2002-09-13 Thread Edgar Friendly
Robert Bihlmeyer writes:

> Gianni Johansson writes:
> 
> > C. Within a segment all data blocks must be the same size and all
> > check blocks must be the same size.  The check block and data block
> > sizes are not required to be the same however.  Smaller trailing
> > blocks must be zero padded to the required length.
> 
> Ugh, do we need this bloat on the wire? An exception for the last
> data block would be preferable.
> 
The exception for the last data block is still assumed.  What GJ's
arguing for is being able to have data blocks of 256K and check blocks
of 128K, or something like that.  It's a nice generalization, but
IMNSHO, it doesn't gain that much.

Having check blocks smaller than data blocks means that to correct one
missing key, you're going to have to retrieve more than one check
block.  And if the check blocks are bigger...  well, I guess you lose
most of the advantages of splitfiles; might as well just make all the
blocks bigger.
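
A minimal sketch of that arithmetic, under a simplifying assumption that is
not part of the proposal: rebuilding N bytes of lost data takes at least N
bytes of check data. The function name is made up for illustration.

```python
import math


def check_blocks_needed(data_block_size, check_block_size):
    """Lower bound on check blocks needed to rebuild one missing data block.

    Assumption (not from the proposal): recovering N bytes of lost data
    requires at least N bytes of check data.
    """
    if data_block_size <= 0 or check_block_size <= 0:
        raise ValueError("block sizes must be positive")
    return math.ceil(data_block_size / check_block_size)


KIB = 1024
# 256K data blocks with 128K check blocks, the sizes used as an example above.
print(check_blocks_needed(256 * KIB, 128 * KIB))
# Equal sizes: one check block can stand in for one data block.
print(check_blocks_needed(256 * KIB, 256 * KIB))
```

Every extra check block is another key to fetch, which is where the cost
of smaller check blocks comes from.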

> > 0) Deprecate the BlockSize field, since check blocks are not necessarily the
> > same size as data blocks and blocks may be different sizes across segments.
> 
> I'd still like to know the data and check block sizes for a segment
> beforehand. Will I be able to deduce these?
> 
At the moment, the only good solution to your problem is to include
offsets for each block, and this is a pretty wasteful method, so we're
not going with it.  The next best solution (that still allows variable
block sizes) is to have an optional field that gives the size of
blocks when they're the same size, and to just have clients "deal with
it" when it's not known.

> > 1) Add an AlgoName field. This is the name for the decoder and encoder 
> > implementation, that can be used to decode or re-encode the file. 
> > This replaces decoder.name and decoder.encoder in the previous
> > implementation.
> 
> The Metadata spec should probably list the known values for this
> field, and give links to further documentation. A genuine registry for
> these may be needed, but I doubt that there will be more than 3
> algorithms used in general ...
> 
I think that having multiple algorithms is going to result in either
no one using FEC or clients implementing their own FEC that's only
compatible with other copies of that same client.  Trying to
standardize on OnionNetworks' code is going to result in orphaning
platforms that code doesn't run on.  (I know there's a java version of
their code; I also know that there's a lot of platforms java doesn't
run on)

> > * SplitFile.Graph is currently not being used and is not implemented.
> 
> Delivering the blueprint for reassembly with every splitfile has its
> appeal, but is probably more redundancy than it's worth.
> 
Actually, it's worth quite a lot in terms of future growth; as better
methods of deciding how to XOR data blocks together to produce check
blocks are discovered, they can be used without requiring all client
authors to rewrite their code.

The methods being developed that don't use Graph have the
disadvantages of
1) not being extensible to future advances in FEC technology
2) not being able to correct anywhere near as many error patterns.
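
As a rough sketch of why a graph shipped with the splitfile decouples
clients from the encoder: if the graph simply lists which data blocks are
XORed into each check block, a generic client can both encode and repair
without knowing how the graph was chosen. The list-of-index-lists
representation and the function names here are assumptions, not the
SplitFile.Graph format, and all blocks are assumed to be the same length.

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_check_blocks(data_blocks, graph):
    """graph[j] lists the data block indices XORed into check block j."""
    return [xor_blocks([data_blocks[i] for i in edges]) for edges in graph]


def recover_missing(known, check_block, edges, missing):
    """Rebuild data block `missing` from one check block that covers it.

    `known` maps data block index -> bytes for every other block in `edges`.
    """
    others = [known[i] for i in edges if i != missing]
    return xor_blocks([check_block] + others)


data = [b"AAAA", b"BBBB", b"CCCC"]
graph = [[0, 1], [1, 2]]  # hypothetical graph: c0 = d0^d1, c1 = d1^d2
checks = make_check_blocks(data, graph)

# Block 1 was not retrievable; block 0 and check block 0 were.
rebuilt = recover_missing({0: data[0]}, checks[0], graph[0], missing=1)
print(rebuilt == data[1])
```

A better graph changes only the `graph` value, not the client code.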

> -- 
> Robbe

Thelema
-- 
E-mail: thelema314 at bigfoot.com    Raabu and Piisu
GPG 1024D/36352AAB fpr:756D F615 B4F3 BFFC 02C7  84B7 D8D7 6ECE 3635 2AAB




[freenet-dev] FCP FEC Proposal -- in message body

2002-09-13 Thread fish


On Thu, 12 Sep 2002, Gianni Johansson wrote:

> It's good to make the distinction because if you manage to get all the data 
> blocks you don't need to decode at all.

bob the angry flower says "wrong, wrong, wrong, wrong, where did you
learn that??! wrong!!!" :-p.  Seriously, this is a very bad thing to be
encouraging, for two reasons:

a) the checkblocks will fall off freenet - you should never be retrieving
this data significantly!  (I wrote about that in the documentation I
wrote, I won't repeat that here)

> Client writers just don't want to deal with it.  FEC support has been in CVS 
> for almost a year and you are the first person who has attempted to write a 
> client besides me.

I think that's got more to do with the fact that there was no
documentation, and hence it took me a solid week to work everything out to
the point where it worked where I tested it (and yes, before anyone asks,
I did try just asking GJ first.  4 times (this was back last March.  I
also asked about the fproxy changes, to the same lack of result).  It
seems that any email I send to GJ gets redirected to /dev/null :().  And
then I *still* got it wrong and didn't know until fproxy was patched,
which I found out about when people started bitching about my tools
breaking freenet.

Hell, I had to write *documentation* to get your attention :-p.  Talk
about sinking to a new low :-p.

> > Should we be perhaps looking for a library which doesn't suffer from
> > these problems as much as onion's library, however?
>
> The glass is half full man.  The math works for files out to about 1G
> no problem even with segmentation for the current implementation.
>
> The OnionNetworks code works well for moderately sized files now!
> We should make it easily available.

I agree with this... I wasn't trying to suggest otherwise, it was just
random musings. I get emails asking me why it's still hard to insert ISOs
into freenet :-p (hey, did you know that OnionNetworks' C library will
bluescreen a winxp box when you feed it an ISO? :-p)

> Segmenting reduces redundancy not striping.  (see above)

okay, i got the two confused ^_^

- from fish with love.





[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Robert Bihlmeyer
Gianni Johansson writes:

> C. Within a segment all data blocks must be the same size and all
> check blocks must be the same size.  The check block and data block
> sizes are not required to be the same however.  Smaller trailing
> blocks must be zero padded to the required length.

Ugh, do we need this bloat on the wire? An exception for the last
data block would be preferable.

> 0) Deprecate the BlockSize field, since check blocks are not necessarily the
> same size as data blocks and blocks may be different sizes across segments.

I'd still like to know the data and check block sizes for a segment
beforehand. Will I be able to deduce these?

> 1) Add an AlgoName field. This is the name for the decoder and encoder 
> implementation, that can be used to decode or re-encode the file. 
> This replaces decoder.name and decoder.encoder in the previous
> implementation.

The Metadata spec should probably list the known values for this
field, and give links to further documentation. A genuine registry for
these may be needed, but I doubt that there will be more than 3
algorithms used in general ...

> * SplitFile.Graph is currently not being used and is not implemented.

Delivering the blueprint for reassembly with every splitfile has its
appeal, but is probably more redundancy than it's worth.

-- 
Robbe



[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Robert Bihlmeyer
Gianni Johansson writes:

> It's good to make the distinction because if you manage to get all the data 
> blocks you don't need to decode at all.

Preferring data blocks to check blocks will restrict the usefulness of
FEC, as it will make the check blocks less popular, thus less
fetchable, in the extreme degenerating to non-FEC splitfiles. I'm with
Fish in the belief that all blocks should be requested in random
order, and with equal probability.
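
A minimal sketch of such a request policy, assuming a hypothetical
`fetch(key)` callable that returns the block bytes or `None` on failure
(not an FCP call), and an "any k of n" codec. Data and check block keys go
into one list, the list is shuffled, and fetching stops once k blocks have
arrived.

```python
import random


def fetch_in_random_order(block_keys, k, fetch, rng=None):
    """Request data and check blocks uniformly at random until k arrive.

    `fetch` is a hypothetical callable: key -> bytes, or None on failure.
    Returns a dict of the blocks obtained (fewer than k if too many failed).
    """
    rng = rng or random.Random()
    order = list(block_keys)
    rng.shuffle(order)  # no preference for data blocks over check blocks
    got = {}
    for key in order:
        block = fetch(key)
        if block is not None:
            got[key] = block
            if len(got) == k:
                break
    return got


# Stand-in for the network: one data block is unavailable.
keys = ["data0", "data1", "data2", "data3", "check0", "check1"]
store = {key: b"x" for key in keys if key != "data1"}

blocks = fetch_in_random_order(keys, 4, store.get, random.Random(0))
print(len(blocks))
```

Because every block is equally likely to be requested, check blocks stay
as popular as data blocks.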

Decoding is a really cheap operation relative to the fetching of the
blocks from Freenet. We shouldn't shun it and make FEC less efficient
in the long run.

> Client writers just don't want to deal with it.

It's certainly a class harder than fetching a monolithic key. I also
found that accurate documentation more accessible than the source is
nonexistent.

> FEC support has been in CVS for almost a year and you are the first
> person who has attempted to write a client besides me.

Well, could have something to do with Freenet being largely unusable
for retrieval of bigger files in that time period. With the current
comeback, more freesites are a-coming, with more FEC splitfiles, so
the need for clients supporting that rises as well.

The FCP access will surely be a good thing. As long as direct access
is still possible ... and I don't see why that should be removed.

-- 
Robbe



[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread fish

Most of this sounds pretty good.

Firstly, a stupid question - is there any reason to separate "data blocks"
and "check blocks"?  As far as the FEC encoder/decoder knows, they're just
blocks, right?  I mean, that's the whole *point* of it (you need any k
blocks of n in order to decode a file).  Would make things simpler,
conceptually, I think.  This of course is based on assumptions from FEC
implementations that I have seen, where the block size is
constant... obviously, if your FEC implementation makes the distinction,
then I guess it makes sense (translation from fishish: yeah, we need to
support checkblocks for certain algorithms, but they don't make sense for
onion's)

Anyhow, the other point I wished to make is that, from looking at your
information, it seems like it would be far more convenient still to just
call the onionnetworks library directly - okay, yeah, I see the usefulness
of this for providing access to people who don't have/don't want to have
bindings to this, but it just seems like an unnecessary layer of
abstraction to me.  But perhaps I am on crack.

> III. Changes to SplitFile metadata format.
> 
> 0) Deprecate the BlockSize field, since check blocks are not necessarily the
> same size as data blocks and blocks may be different sizes across segments.

I strongly disagree with this - if we want to support this case, it is
better to then have a separate set of metadata for each segment and
specify the block/check size in each one.  This information is very useful
to have for reasons of memory allocation and the like.
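
One way to picture per-segment metadata is the sketch below. The record
layout and every name in it are hypothetical, not the SplitFile metadata
format; check blocks are given the same size as data blocks purely to keep
the example short. With sizes stated per segment, a client can size its
buffers and work out the zero padding of the trailing block before
fetching anything.

```python
from dataclasses import dataclass


@dataclass
class SegmentInfo:
    """Hypothetical per-segment record; field names are illustrative."""
    offset: int            # where the segment starts in the file
    length: int            # bytes of real file data in the segment
    data_block_size: int
    check_block_size: int
    data_block_count: int


def plan_segments(file_length, segment_size, block_size):
    """Split a file into segments and record the block sizes for each."""
    segments = []
    offset = 0
    while offset < file_length:
        length = min(segment_size, file_length - offset)
        count = -(-length // block_size)  # ceiling division
        segments.append(
            SegmentInfo(offset, length, block_size, block_size, count))
        offset += length
    return segments


def padding_needed(segment):
    """Zero bytes appended to the trailing data block of a segment."""
    return segment.data_block_count * segment.data_block_size - segment.length


for seg in plan_segments(file_length=10, segment_size=4, block_size=3):
    print(seg.offset, seg.data_block_count, padding_needed(seg))
```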

Other than that, however, this all looks okay to
me on my initial reading.

- fish

p.s. I included the following in the original draft of this email, but
considered it offtopic and hence separated it out from the main
email.  However, I include it here because it is interesting and
semi-relevant:

> For a given maximum block size, some FEC algorithms can only
> practically handle files up to a certain maximum size.  The design
> uses segmentation to handle this case.  Large files are divided into
> smaller segments and FEC is only done on a per segment basis.  This
> compromise provides at least limited redundancy for large files.

Should we be perhaps looking for a library which doesn't suffer from these
problems as much as onion's library, however?  The thing is, the whole
usefulness of FEC is for big files, you know... I know that we do have to
deal with reality instead of would-be-nice's, however, but it is something
to think about.

The other problem is that, as you stripe like this, the amount of
redundancy is, of course, reduced significantly; however, you already knew
that.  However, people writing bad algorithms for downloading files (block
1,2,3,4 etc in order) could make this problem even worse.  (As a side
note to this, I have been wondering if an AWT based freenet download
manager would be a useful thing to code/have... any thoughts on
this?  Heh, of course, this would require me to learn how to communicate
with freenet from java, but it can't be that hard, can it? :-p)
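
The loss of redundancy from coding each piece of a file independently can
be put in numbers with a simple model. The assumptions are mine, not from
the proposal: every block is retrievable independently with probability
p, and any k of the n blocks in a unit suffice to decode it. The block
counts and p below are arbitrary example values.

```python
from math import comb


def p_at_least(k, n, p):
    """Probability that at least k of n independent blocks are retrievable."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def p_file_recoverable(segments, k, n, p):
    """All segments must decode; each needs any k of its own n blocks."""
    return p_at_least(k, n, p) ** segments


p = 0.8  # assumed chance that any single block can be fetched
one_big = p_file_recoverable(1, 40, 60, p)     # 40 data + 20 check, one unit
four_small = p_file_recoverable(4, 10, 15, p)  # same totals, four units

print(one_big, four_small)
```

With the same total number of data and check blocks, the single large
unit is more likely to be recoverable than four independently coded ones,
because a surplus of blocks in one unit cannot cover a shortfall in
another.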

Anyhow, I'm sure you knew all of this... just restating it for my own
benefit, don't mind me :).  I'll look into the alternative libraries
myself over the next few days... there's nothing to stop us having two
encoders, given that the facilities are there, and let the best codec win
:-p.





[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Matthew Toseland
On Thu, Sep 12, 2002 at 07:37:32PM +1000, fish wrote:
> 
> Most of this sounds pretty good.
> 
> Firstly, a stupid question - is there any reason to separate "data blocks"
> and "check blocks"?  As far as the FEC encoder/decoder knows, they're just
> blocks, right?  I mean, that's the whole *point* of it (you need any k
> blocks of n in order to decode a file).  Would make things simpler,
> conceptually, I think.  This of course is based on assumptions from FEC
> implementations that I have seen, where the block size is
> constant... obviously, if your FEC implementation makes the distinction,
> then I guess it makes sense (translation from fishish: yeah, we need to
> support checkblocks for certain algorithms, but they don't make sense for
> onion's)
> 
> Anyhow, the other point I wished to make is that, from looking at your
> information, it seems like it would be far more convenient still to just
> call the onionnetworks library directly - okay, yeah, I see the usefulness
> of this for providing access to people who don't have/don't want to have
> bindings to this, but it just seems like an unnecessary layer of
> abstraction to me.  But perhaps I am on crack.
Hmmm, maybe. It's not well documented afaics? Anyway, higher level
commands (with reasonable status reporting, and some sort of keepalives
or a way to reconnect to a running command, and terminating only when
explicitly told to), would make clients easier to write, but this looks
useful.
> 
> > III. Changes to SplitFile metadata format.
> > 
> > 0) Deprecate the BlockSize field, since check blocks are not necessarily the
> > same size as data blocks and blocks may be different sizes across segments.
> 
> I strongly disagree with this - if we want to support this case, it is
> better to then have a separate set of metadata for each segment and
> specify the block/check size in each one.  This information is very useful
> to have for reasons of memory allocation and the like.
> 
> Other than that, however, this all looks okay to
> me on my initial reading.
> 
>   - fish
> 
> p.s. I included the following in the original draft of this email, but
> considered it offtopic and hence separated it out from the main
> email.  However, I include it here because it is interesting and
> semi-relevant:
> 
> > For a given maximum block size, some FEC algorithms can only
> > practically handle files up to a certain maximum size.  The design
> > uses segmentation to handle this case.  Large files are divided into
> > smaller segments and FEC is only done on a per segment basis.  This
> > compromise provides at least limited redundancy for large files.
> 
> Should we be perhaps looking for a library which doesn't suffer from these
> problems as much as onion's library, however?  The thing is, the whole
> usefulness of FEC is for big files, you know... I know that we do have to
> deal with reality instead of would-be-nice's, however, but it is something
> to think about.
> 
> The other problem is that, as you stripe like this, the amount of
> redundancy is, of course, reduced significantly; however, you already knew
> that.  However, people writing bad algorithms for downloading files (block
> 1,2,3,4 etc in order) could make this problem even worse.  (As a side
> note to this, I have been wondering if an AWT based freenet download
> manager would be a useful thing to code/have... any thoughts on
> this?  Heh, of course, this would require me to learn how to communicate
> with freenet from java, but it can't be that hard, can it? :-p)
Please do. Do NOT use swing. And include more detail on the status of
the individual blocks, eg bytes downloaded/total progress bar, time this
block idle for etc. Would be very cool.
> 
> Anyhow, I'm sure you knew all of this... just restating it for my own
> benefit, don't mind me :).  I'll look into the alternative libraries
> myself over the next few days... there's nothing to stop us having two
> encoders, given that the facilities are there, and let the best codec win
> :-p.
> 
> 
> 

-- 
Matthew Toseland
mtoseland at blueyonder.co.uk
amphibian at sourceforge.net
Freenet/Coldstore open source hacker.
Employed full time by Freenet Project Inc. from 11/9/02 to 11/11/02.



Re: [freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Robert Bihlmeyer

Gianni Johansson <[EMAIL PROTECTED]> writes:

> It's good to make the distinction because if you manage to get all the data 
> blocks you don't need to decode at all.

Preferring data blocks to check blocks will restrict the usefulness of
FEC, as it will make the check blocks less popular, thus less
fetchable, in the extreme degenerating to non-FEC splitfiles. I'm with
Fish in the belief that all blocks should be requested in random
order, and with equal probability.
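
A minimal Python sketch of that fetch policy, for illustration only. The
`fetch` callable is hypothetical (it stands in for whatever the client uses to
request one key and returns the block bytes, or None on failure); nothing here
is taken from an existing client.

```python
import random

def fetch_segment(block_ids, k, fetch):
    """Request data and check blocks in random order until k have arrived.

    block_ids: identifiers of all n blocks of a segment (data and check).
    fetch: hypothetical callable, returns bytes or None when a block is missing.
    """
    order = list(block_ids)
    random.shuffle(order)  # every block is equally likely to be asked for first
    got = {}
    for block_id in order:
        data = fetch(block_id)
        if data is not None:
            got[block_id] = data
        if len(got) == k:
            break  # any k blocks are enough to decode
    return got
```

If fewer than k blocks come back, the caller has to retry or give up; the
sketch leaves that decision to the client.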

Decoding is a really cheap operation relative to the fetching of the
blocks from Freenet. We shouldn't shun it and make FEC less efficient
in the long run.

> Client writers just don't want to deal with it.

It's certainly a class harder than fetching a monolithic key. I also
found that accurate documentation, more accessible than the source, is
nonexistent.

> FEC support has been in CVS for almost a year and you are the first
> person who has attempted to write a client besides me.

Well, could have something to do with Freenet being largely unusable
for retrieval of bigger files in that time period. With the current
comeback, more freesites are a-coming, with more FEC splitfiles, so
the need for clients supporting that rises as well.

The FCP access will surely be a good thing. As long as direct access
is still possible ... and I don't see why that should be removed.

-- 
Robbe





[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Gianni Johansson
On Thursday 12 September 2002 08:40, you wrote:

> > Anyhow, the other point I wished to make, is that from looking at your
> > information, it seems like it would be far more convenient still to just
> > call the onionnetworks library directly - okay, yeah, I see the
> > usefulness of this for providing access to people who don't have/don't
> > want to have bindings to this, but it just seems like an unnecessary
> > layer of abstraction to me.  But perhaps I am on crack.
>
> Hmmm, maybe. It's not well documented afaics? Anyway, higher level
> commands (with reasonable status reporting, and some sort of keepalives
> or a way to reconnect to a running command that only terminates when
> explicitly told to), would make clients easier to write, but this looks
> useful.
Status messages might be a useful addition.  "keepalive", "reconnect" == 
stateful.  You don't want to do that. One of the reasons that FCP is easy is 
that it is stateless.

--gj

___
devl mailing list
devl at freenetproject.org
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/devl



[freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread Gianni Johansson
On Thursday 12 September 2002 05:37, you wrote:
> Most of this sounds pretty good.
>
> Firstly, a stupid question - is there any reason to separate "data blocks"
> and "check blocks"?  As far as the FEC encoder/decoder knows, they're just
> blocks, right?  I mean, that's the whole *point* of it (you need any k
> blocks of n in order to decode a file).  Would make things simpler,
> conceptually, I think.  This of course is based on assumptions from FEC
> implementations that I have seen, where the block size is
> constant... obviously, if your FEC implementation makes the distinction,
> then I guess it makes sense (translation from fishish: yeah, we need to
> support checkblocks for certain algorithms, but they don't make sense for
> onion's)
It's good to make the distinction because if you manage to get all the data 
blocks you don't need to decode at all.
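
A rough Python sketch of that shortcut. It assumes the data blocks are plain
consecutive slices of the file (zero padded at the end), which is what makes
skipping the decoder possible; the `decode` callable is a hypothetical stand-in
for a real FEC decoder.

```python
def reassemble(data_blocks, k, file_length, decode=None):
    """Rebuild a segment from fetched blocks.

    data_blocks: dict mapping data block index -> bytes for blocks that arrived.
    decode: hypothetical fallback that reconstructs the segment via FEC.
    """
    if all(i in data_blocks for i in range(k)):
        # All k data blocks arrived: concatenate them and trim the zero padding.
        joined = b"".join(data_blocks[i] for i in range(k))
        return joined[:file_length]
    if decode is None:
        raise ValueError("data blocks missing and no decoder supplied")
    return decode(data_blocks)[:file_length]
```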

>
> Anyhow, the other point I wished to make, is that from looking at your
> information, it seems like it would be far more convenient still to just
> call the onionnetworks library directly - okay, yeah, I see the usefulness
> of this for providing access to people who don't have/don't want to have
> bindings to this, but it just seems like an unnecessary layer of
> abstraction to me.  But perhaps I am on crack.
Client writers just don't want to deal with it.  FEC support has been in CVS 
for almost a year and you are the first person who has attempted to write a 
client besides me.

>
> > III. Changes to SplitFile metadata format.
> >
> > 0) Deprecate the BlockSize field, since check blocks are not necessarily
> > the same size as data blocks and blocks may be different sizes across
> > segments.
>
> I strongly disagree with this - if we want to support this case, it is
> better to then have a separate set of metadata for each segment and
> specify the block/check size in each one.  This information is very useful
> to have for reasons of memory allocation and the like.
You can still get this info out of the SegmentHeader messages if you want it.

I am leery of bloating the SplitFile metadata any more.

>
> Other than that, however, this all looks okay to me on my initial
> reading.
>
>   - fish
>
> p.s. I included the following in the original draft of this email, but
> considered it off-topic and hence separated it out from the main
> email.  However, I include it here because it is interesting and
> semi-relevant:
>
> > For a given maximum block size, some FEC algorithms can only
> > practically handle files up to a certain maximum size.  The design
> > uses segmentation to handle this case.  Large files are divided into
> > smaller segments and FEC is only done on a per segment basis.  This
> > compromise provides at least limited redundancy for large files.
>
> Should we be perhaps looking for a library which doesn't suffer from these
> problems as much as onion's library, however?
The glass is half full, man.  The math works for files out to about 1G no
problem, even with segmentation, for the current implementation.

The OnionNetworks code works well for moderately sized files now!
We should make it easily available.

I will keep a similar plugin architecture so that new encoder/decoder 
implementations can be used if/when they are written.

> The thing is, the whole
> usefulness of FEC is for big files, you know... I know that we do have to
> deal with reality instead of would-be-nice's, however, but it is something
> to think about.
>
Segmenting reduces redundancy, not striping (see above).
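
To put a rough number on the segmentation effect, here is a small Python
calculation. It assumes each block can be fetched independently with the same
probability p, which is a simplification and not a measurement of Freenet; the
block counts are made up for illustration.

```python
from math import comb

def p_decode(k, n, p):
    """Probability that at least k of n blocks arrive, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def p_decode_segmented(k, n, p, segments):
    """Same blocks split evenly into segments; every segment must decode on its own."""
    return p_decode(k // segments, n // segments, p) ** segments

whole = p_decode(8, 12, 0.8)               # one segment: any 8 of 12 blocks
split = p_decode_segmented(8, 12, 0.8, 2)  # two segments: 4 of 6 in each
print(whole, split)
```

With the same total number of check blocks, the segmented file can never do
better than the unsegmented one, because decoding every segment implies having
at least k blocks overall, but not the other way round.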
> The other problem is that as you stripe like this, the amount of
> redundancy is, of course, reduced significantly, however you already knew
> that.  However, people writing bad algorithms for downloading files (block
> 1,2,3,4 etc. in order) could make this problem even worse.  (As a side
> note to this, I have been wondering if an AWT-based Freenet download
> manager would be a useful thing to code/have... any thoughts on
> this?  Heh, of course, this would require me to learn how to communicate
> with Freenet from Java, but it can't be that hard, can it? :-p)
>
> Anyhow, I'm sure you knew all of this... just restating it for my own
> benefit, don't mind me :).  I'll look into the alternative libraries
> myself over the next few days... there's nothing to stop us having two
> encoders, given that the facilities are there, and let the best codec win.
Cool.
>
> :-p.
>



Re: [freenet-dev] FCP FEC Proposal -- in message body

2002-09-12 Thread fish


Most of this sounds pretty good.

Firstly, a stupid question - is there any reason to separate "data blocks"
and "check blocks"?  As far as the FEC encoder/decoder knows, they're just
blocks, right?  I mean, that's the whole *point* of it (you need any k
blocks of n in order to decode a file).  Would make things simpler,
conceptually, I think.  This of course is based on assumptions from FEC
implementations that I have seen, where the block size is
constant... obviously, if your FEC implementation makes the distinction,
then I guess it makes sense (translation from fishish: yeah, we need to
support checkblocks for certain algorithms, but they don't make sense for
onion's)

Anyhow, the other point I wished to make, is that from looking at your
information, it seems like it would be far more convenient still to just
call the onionnetworks library directly - okay, yeah, I see the usefulness
of this for providing access to people who don't have/don't want to have
bindings to this, but it just seems like an unnecessary layer of
abstraction to me.  But perhaps I am on crack.

> III. Changes to SplitFile metadata format.
> 
> 0) Deprecate the BlockSize field, since check blocks are not necessarily the
> same size as data blocks and blocks may be different sizes across segments.

I strongly disagree with this - if we want to support this case, it is
better to then have a separate set of metadata for each segment and
specify the block/check size in each one.  This information is very useful
to have for reasons of memory allocation and the like.

Other than that, however, this all looks okay to me on my initial
reading.

- fish

p.s. I included the following in the original draft of this email, but
considered it off-topic and hence separated it out from the main
email.  However, I include it here because it is interesting and
semi-relevant:

> For a given maximum block size, some FEC algorithms can only
> practically handle files up to a certain maximum size.  The design
> uses segmentation to handle this case.  Large files are divided into
> smaller segments and FEC is only done on a per segment basis.  This
> compromise provides at least limited redundancy for large files.

Should we be perhaps looking for a library which doesn't suffer from these
problems as much as onion's library, however?  The thing is, the whole
usefulness of FEC is for big files, you know... I know that we do have to
deal with reality instead of would-be-nice's, however, but it is something
to think about.

The other problem is that as you stripe like this, the amount of
redundancy is, of course, reduced significantly, however you already knew
that.  However, people writing bad algorithms for downloading files (block
1,2,3,4 etc. in order) could make this problem even worse.  (As a side
note to this, I have been wondering if an AWT-based Freenet download
manager would be a useful thing to code/have... any thoughts on
this?  Heh, of course, this would require me to learn how to communicate
with Freenet from Java, but it can't be that hard, can it? :-p)

Anyhow, I'm sure you knew all of this... just restating it for my own
benefit, don't mind me :).  I'll look into the alternative libraries
myself over the next few days... there's nothing to stop us having two
encoders, given that the facilities are there, and let the best codec win
:-p.





[freenet-dev] FCP FEC Proposal -- in message body

2002-09-11 Thread Gianni Johansson

FCP FEC Proposal rev. 1.0
[EMAIL PROTECTED] 20020912

I. INTRODUCTION:

This proposal presents a set of new FCP commands that can be used to
encode and decode files using forward error correction (FEC).

FEC is a way of encoding packetized data files with extra error
recovery information which can be used to reconstruct lost packets.
In this document I will refer to the packets containing data as "data
blocks" and those containing error recovery information as "check
blocks".

One of the objectives of this design is to separate FEC encoding and
decoding from inserting and retrieving the data and check blocks
to/from Freenet.  By separating encoding and decoding from insertion
and retrieval I sidestep the problem of having to hold FCP connections
open while waiting for large amounts of data to be fetched from /
inserted into Freenet.

For a given maximum block size, some FEC algorithms can only
practically handle files up to a certain maximum size.  The design
uses segmentation to handle this case.  Large files are divided into
smaller segments and FEC is only done on a per segment basis.  This
compromise provides at least limited redundancy for large files.
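
As an illustration of the segmentation idea only, the Python sketch below
divides a file into fixed-size segments. The `max_segment_size` parameter is
hypothetical; in this proposal the actual segment layout is chosen by the FEC
implementation and reported in SegmentHeader messages.

```python
def segment_offsets(file_length, max_segment_size):
    """Return (offset, length) pairs covering the file, one pair per segment."""
    if file_length <= 0 or max_segment_size <= 0:
        raise ValueError("lengths must be positive")
    return [
        (offset, min(max_segment_size, file_length - offset))
        for offset in range(0, file_length, max_segment_size)
    ]

print(segment_offsets(10, 4))  # [(0, 4), (4, 4), (8, 2)]
```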

II. Assumptions
This proposal doesn't specify any particular FEC algorithm.  
However the following assumptions are implicit in the design:
 
A. For a given segment with k data blocks and n - k check blocks, it
must be possible to decode all k data blocks from any k of n data or
check blocks.

B. Encoder and decoder implementations must be completely specified by
an implementation name and a file length.  No other parameters can be
required to instantiate the encoder or decoder.

C. Within a segment all data blocks must be the same size and all
check blocks must be the same size.  The check block and data block
sizes are not required to be the same however.  Smaller trailing
blocks must be zero padded to the required length.

D. The encoder may ask for extra trailing data blocks.  These extra
blocks must contain zeros.
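
The following Python sketch illustrates assumptions A and C with the simplest
possible code: a single XOR parity block, so n = k + 1. This is not the
OnionNetworks algorithm, just a toy that has the "any k of n" property; here
the check block happens to be the same size as the data blocks.

```python
def split_padded(data, block_size):
    """Cut data into equal-sized blocks, zero padding the trailing block."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [block.ljust(block_size, b"\x00") for block in blocks]

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = b"hello freenet"         # 13 bytes
blocks = split_padded(data, 4)  # k = 4 data blocks, the last one zero padded
check = xor_blocks(blocks)      # n - k = 1 check block

# Pretend data block 2 was lost: XOR of the remaining k blocks rebuilds it.
survivors = blocks[:2] + blocks[3:] + [check]
recovered = xor_blocks(survivors)
```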

II. Proposed FCP FEC commands

convention: All numbers are hexadecimal

A. Helper messages, SegmentHeader and BlockMap

A SegmentHeader message contains all the information necessary to FEC
encode or decode a segment of a file.  SegmentHeaders may contain FEC
implementation specific fields.  They are guaranteed to contain the
documented fields given in the example SegmentHeader
message below:

SegmentHeader
FECAlgorithm=OnionFEC_a_1_2   // The FEC implementation name
FileLength=17                 // Total file length
Offset=0                      // Offset from the start of the file
BlockCount=6                  // Number of data blocks
BlockSize=4                   // Data block size
CheckBlockCount=3             // Number of check blocks
CheckBlockSize=3              // Check block size
Segments=1                    // Total number of segments
SegmentNum=0                  // Index of the current segment
BlocksRequired=6              // Blocks required to decode this segment
EndMessage

Client code should not rely on any undocumented fields.
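
A client-side parser for this message could look roughly like the Python
sketch below. It assumes one `Key=Value` pair per line and no trailing
comments on the wire (the `//` comments above are annotations); the function
name is made up for this example.

```python
# Documented numeric fields; per the convention above their values are hexadecimal.
HEX_FIELDS = {
    "FileLength", "Offset", "BlockCount", "BlockSize", "CheckBlockCount",
    "CheckBlockSize", "Segments", "SegmentNum", "BlocksRequired",
}

def parse_segment_header(text):
    """Parse a SegmentHeader message into a dict, converting hex numbers to int."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if lines[0] != "SegmentHeader" or lines[-1] != "EndMessage":
        raise ValueError("not a complete SegmentHeader message")
    header = {}
    for line in lines[1:-1]:
        key, _, value = line.partition("=")
        # Undocumented fields are kept as opaque strings and otherwise ignored.
        header[key] = int(value, 16) if key in HEX_FIELDS else value
    return header

example = """SegmentHeader
FECAlgorithm=OnionFEC_a_1_2
FileLength=17
Offset=0
BlockCount=6
BlockSize=4
CheckBlockCount=3
CheckBlockSize=3
Segments=1
SegmentNum=0
BlocksRequired=6
EndMessage
"""
header = parse_segment_header(example)
print(header["FileLength"])  # 23, since 17 is hexadecimal
```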

BlockMap messages are used to list the CHKs of the data and check
blocks for a segment.

Here's an example:

BlockMap
Block.0=freenet:CHK@p2ISvZPkCwbY62xciJb~KrsOCTsSAwI,jGonMeCCz1GCHde5bc1t~w
Block.1=freenet:CHK@1z8CubDNzLEfNfuTYM4NVJAUxU4SAwI,5cxWki4YzWyKP0s3g9~Vow
Block.2=freenet:CHK@~VW7XskmHcJMFlmG6l2c7jkTOnkSAwI,Il2ztTbQImZvVlsnuDq-8Q
Block.3=freenet:CHK@A-qK8GWofXd9JOxb4fHfVMHAUawSAwI,2D5~Mm~MjAfup3edGXy6Eg
Block.4=freenet:CHK@r-FhUu444LxUIUGi5BMuEVGM4nQSAwI,J7HpLvPscLyW3Sc6Nq2S5g
Check.0=freenet:CHK@rLdCwOXO7PAv6BDpm21ThdIwmnkSAwI,4ZX2inJ7gg0EectTxPYRSg
Check.1=freenet:CHK@EjEg1UHWsAfHHMQmRbxe2ToY0RQSAwI,xjJCPsCxpnw9lyNI2VBRGA
EndMessage
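
A Python sketch of reading a BlockMap into two ordered lists. It assumes the
index after `Block.` and `Check.` is hexadecimal like every other number in
this proposal; the URIs in the usage example are placeholders, not real keys.

```python
def parse_block_map(text):
    """Return (data_uris, check_uris), each ordered by block index."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if lines[0] != "BlockMap" or lines[-1] != "EndMessage":
        raise ValueError("not a complete BlockMap message")
    data, check = {}, {}
    for line in lines[1:-1]:
        key, _, uri = line.partition("=")
        kind, _, index = key.partition(".")
        target = {"Block": data, "Check": check}.get(kind)
        if target is not None:  # skip fields this sketch does not know about
            target[int(index, 16)] = uri
    return [data[i] for i in sorted(data)], [check[i] for i in sorted(check)]

example_map = """BlockMap
Block.0=freenet:CHK@placeholder-data-0
Block.1=freenet:CHK@placeholder-data-1
Check.0=freenet:CHK@placeholder-check-0
EndMessage
"""
data_uris, check_uris = parse_block_map(example_map)
```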

B. FECSegmentFile
The FECSegmentFile message is used to generate the segment headers
necessary to encode a file of a given length with a specified FEC
algorithm.  

FECSegmentFile
AlgoName=OnionFEC_a_1_2
FileLength=ABC123
EndMessage

If this command is successful, one or more SegmentHeader messages are sent in
order of ascending SegmentNum.

The client can detect when the last segment has been sent by checking the
SegmentNum and Segments fields of each received SegmentHeader.

On failure a Failed message is sent.
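
In Python, building the request and spotting the last SegmentHeader might look
like this. The sketch assumes newline-terminated lines and a zero-based
SegmentNum, as in the example header (SegmentNum=0 with Segments=1); the
function names are made up for illustration.

```python
def fec_segment_file_request(algo_name, file_length):
    """Format a FECSegmentFile message; the length is written in hexadecimal."""
    return (
        "FECSegmentFile\n"
        f"AlgoName={algo_name}\n"
        f"FileLength={file_length:X}\n"
        "EndMessage\n"
    )

def is_last_segment(header):
    """header: dict with integer SegmentNum and Segments fields."""
    return header["SegmentNum"] == header["Segments"] - 1

print(fec_segment_file_request("OnionFEC_a_1_2", 0xABC123))
```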

C. FECEncodeSegment
The FECEncodeSegment message is used to create check blocks for a
segment of a file.  The RequestedList field contains a comma delimited
list of the requested check blocks. If the list is empty or omitted
completely, all the check blocks are sent.

The SegmentHeader for the requested segment must be sent as data in the
trailing field of the FECEncodeSegment message, preceding the raw
segment data to encode.

FECEncodeSegment
[RequestedList=0,A,F]
DataLength=< SegmentHeader length > + < raw data length >
Data

< SegmentHeader >
< raw data >
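
A Python sketch of assembling such a message. It assumes that DataLength is
the combined byte length of the serialized SegmentHeader and the raw data,
written in hexadecimal, and that header lines end with a newline; the function
name is hypothetical.

```python
def fec_encode_segment(segment_header, raw_data, requested=None):
    """Build a FECEncodeSegment message as bytes.

    segment_header: the SegmentHeader message, already serialized to bytes.
    raw_data: the segment's data blocks, zero padded as the header requires.
    requested: optional list of check block indices; omitted means all blocks.
    """
    lines = ["FECEncodeSegment"]
    if requested:
        lines.append("RequestedList=" + ",".join(f"{i:X}" for i in requested))
    lines.append(f"DataLength={len(segment_header) + len(raw_data):X}")
    lines.append("Data")
    head = "\n".join(lines).encode("ascii") + b"\n"
    # The trailing field carries the SegmentHeader first, then the raw data.
    return head + segment_header + raw_data

message = fec_encode_segment(b"SegmentHeader\nEndMessage\n", b"\x00" * 4, [0, 10, 15])
```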

If the encode request is successful, the server sends a BlocksEncoded
confirmation message, followed by DataChunk messages for the encoded
blocks.  Check blocks are sent in order of ascending index.

e.g.:

BlocksEncoded
BlockCou