On Thu, 23 Jan 2003, Matthew Toseland wrote:

> On Thu, Jan 23, 2003 at 11:31:25AM +0000, Gordan Bobic wrote:
> > Hi.
> > 
> > Is there such a thing as chunk size in the way Freenet deals with storing 
> > and transferring the data?
> 
> Freenet key contents are normally some fields plus a power of two size
> data. Minimum 1kB, I believe.

Is there a level of segmentation performed by default? Say, if the file is 
65 KB, will it be stored as a single 65 KB file, and always accessed as a 
SINGLE entity, or will it be segmented by the network into multiple 
power-of-two-sized chunks (minimum 1 KB)?

> > By chunk size, I mean a minimal unit of data that is worked with. For 
> > example, for disk access, this would be the filesystem block size.
> 
> The other interesting parameter here is the default datastore size.
> Freenet nodes should generally not have a store less than a little over
> 200MB. The default is 256MB. The maximum size chunk that a datastore
> will store is 1/200th of the store size. Hence, a file split into 1MB
> chunks (or less) will be cacheable on all nodes it goes through.

That doesn't bother me, because I am trying to optimize things down to a 
_minimum_ sensible file size, rather than maximum possible file size.
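Spelling out the arithmetic of that 1/200th rule (a sketch based only on the numbers quoted above, not on the Freenet source):

```python
# The largest chunk a datastore will cache is 1/200th of the store size,
# per the rule quoted above. Hypothetical helper for the arithmetic.
def max_cacheable_chunk(store_bytes):
    return store_bytes // 200

default_store = 256 * 1024 * 1024          # 256 MB default datastore
limit = max_cacheable_chunk(default_store)
print(limit / (1024 * 1024))               # ~1.28 MB, so 1 MB chunks fit on all nodes
```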

Are files never separated into segments, unless FEC is used? What are the 
minimum and maximum sizes for FEC segments?

> > I am trying to determine what is the optimal size to split data into. The 
> > size I am looking for is the one that implies that (file size) == (block 
> > size), so that if a block gets lost, the whole file (not just a part of 
> > it) is gone.
> 
> Hmm. Use a power of two. A file and a block are no different on
> Freenet; when a full datastore needs more space, we delete the files
> that reach the end of the LRU.

I understand that. I just wanted to know whether files are always treated 
as a single entity (FEC aside), or whether they are always segmented into 
blocks, with different blocks possibly coming from different nodes (again, 
without using FEC).

> However if you need to store
> chunks of more than a meg, you need to use redundant (FEC) splitfiles.

As I said, I was looking for the lower limit, not the upper one. Now I 
know not to go much below 1 KB, because it would be pointless. I doubt I'd 
ever need anything even remotely approaching 1 MB for my application. I 
was thinking of a size between 1 KB and 4 KB, but wasn't sure whether the 
minimum block size might be something quite a bit larger, like 64 KB.

> > The reason for this is that I am trying to design a database application 
> > that uses Freenet as the storage medium (yes, I know about FreeSQL, and it 
> > doesn't do what I want in the way I want it done). Files going missing are 
> > an obvious problem that needs to be tackled. I'd like to know what the 
> > block size is in order to implement redundancy padding in the data by 
> > exploiting the overheads produced by the block size, when a single item of 
> > data is smaller than the block that contains it.
> 
> Cool. See above.

When you say powers of two, does that mean a 5 KB file will be rounded up 
to 8 KB? Or split into 4 KB and 1 KB? If it is split, at what point does 
the split happen, and how is it handled at the storage and network 
transfer levels?

> > This could be optimized out in run-time to make no impact on execution 
> > speed (e.g. skip downloads of blocks that we can reconstruct from already 
> > downloaded segments).
> 
> Hmm. Not sure I follow.

A bit like a Hamming code, but allowing random access. Since latency plus 
download time is what makes retrieval slow, downloading fewer files is a 
win for performance, so I can reconstruct some of the pending segments 
rather than downloading them. Very much like FEC, in fact. :-)
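As a minimal sketch of that idea, a single XOR parity block lets one missing block be rebuilt instead of fetched (a hypothetical toy example, not Freenet's FEC scheme):

```python
import functools

def xor_blocks(a, b):
    # XOR two equal-length byte strings together
    return bytes(x ^ y for x, y in zip(a, b))

blocks = [b"abcd", b"efgh", b"ijkl"]            # equal-sized data blocks
parity = functools.reduce(xor_blocks, blocks)   # one redundant parity block

# Suppose blocks[1] never arrives: rebuild it from the others plus the
# parity block, trading one download for a little CPU time.
recovered = functools.reduce(xor_blocks, [blocks[0], blocks[2], parity])
assert recovered == blocks[1]
```

Real FEC codes (e.g. Reed-Solomon) generalize this to tolerate several missing blocks, which is presumably what Freenet's redundant splitfiles do.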

Of course, I might not bother if I can use FEC for it instead, provided it 
works with very small file sizes (the question I asked above).

Thanks.

Gordan


_______________________________________________
Tech mailing list
[EMAIL PROTECTED]
http://hawk.freenetproject.org/cgi-bin/mailman/listinfo/tech
