Re: [freenet-dev] Insert of demand application design

Matthew Toseland Wed, 03 Jun 2009 14:26:20 -0700

On Tuesday 02 June 2009 19:41:51 clement wrote:
> Hello,
> 
> Here is an attempt by Sich and myself at a design for an insert-on-demand 
> application based on the WoT plugin.


You should seriously consider working with infinity0; his search plugin will 
provide distributed indexing.
> 
> 
> -----------------------------------------------------------------------------------------------
> 
> key structure :
> 
> ssk/filesharing/updates/inserts/x/inserts
> ssk/filesharing/updates/requests/y/requests
> ssk/filesharing/z/index/
> ssk/filesharing/filehash/n/index
> 
> inserts file structure:
> <filehash>xxx</filehash>
> <chunk>all or chunk number</chunk>
> 
> requests file structure:
> <filehash>xxx</filehash>
> <chunk>all or chunk number</chunk>
> 
> index file structure:
> <file>
> <name>xxx</name>
> <size>xxx</size>
> <hash>xxx</hash>
> <content type>xxx</content type>
> </file>
> 
> filehash index structure:
> <comment>comment on this file</comment>
> #one per chunk
> <chunk>
> <hash>xxx</hash>
> <ssk>xxx</ssk> #if available
> <chunk number>xxx</chunk number> #necessary if all the chunks aren't listed
> </chunk>
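
One note on the format: element names such as <content type> and
<chunk number> contain a space, so a standard XML parser will reject them.
A minimal sketch of writing and reading one index entry follows, assuming
Python and its standard xml.etree module. The element name content_type and
both function names are my own placeholders, not part of the proposal.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_file_entry(name: str, size: int, file_hash: str, content_type: str) -> str:
    """Serialize one <file> entry of the index as well-formed XML."""
    entry = ET.Element("file")
    ET.SubElement(entry, "name").text = name
    ET.SubElement(entry, "size").text = str(size)
    ET.SubElement(entry, "hash").text = file_hash
    # Assumption: "content_type" replaces "content type", which is not a legal name.
    ET.SubElement(entry, "content_type").text = content_type
    return ET.tostring(entry, encoding="unicode")


def parse_file_entry(xml_text: str) -> Dict[str, Optional[str]]:
    """Parse one <file> entry back into a tag -> text mapping."""
    entry = ET.fromstring(xml_text)
    return {child.tag: child.text for child in entry}


# Usage: round-trip a single entry.
xml_text = build_file_entry("example.ogg", 123456, "abc123", "audio/ogg")
fields = parse_file_entry(xml_text)
```
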
> 
> Each file is split into several chunks. Each chunk is inserted under its own 
> SSK.

Why are you inserting files as SSKs? Security?

Why are you splitting the files up? Are you assuming that the key changes every 
time for security? If you are using CHKs you can simply reinsert the original 
file; we can provide an FCP option to reinsert only some blocks, so this is not 
a big problem. The advantage is that if the data has been inserted, you can just 
download it, using the normal CHK key, and if it hasn't, and people start 
reinserting it, you will be able to pick up those blocks. The disadvantage is 
security: anyone who inserts predictable keys is vulnerable to attack. However, 
to avoid such vulnerability, you need to *encrypt the inserted data differently 
each time*! I am assuming you are using chunks consisting of many CHKs, maybe 
1MB, with an SSK pointing to them? In which case the chunk will need to be 
encrypted before being inserted.
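
To make the point about predictable keys concrete, here is a minimal sketch
in Python (standard library only). It is an illustration under assumptions,
not how the node does it: content_key() is a stand-in for a content-derived
key (a plain SHA-256 here), and xor_with_keystream() is a toy cipher used
only so the example runs without third-party libraries. A real client would
use a proper authenticated cipher.

```python
import hashlib
import secrets
from typing import Tuple


def content_key(block: bytes) -> str:
    """Stand-in for a content-derived key: the SHA-256 of the stored bytes."""
    return hashlib.sha256(block).hexdigest()


def xor_with_keystream(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with a SHAKE-256 keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_for_insert(chunk: bytes) -> Tuple[bytes, bytes]:
    """Encrypt a chunk under a fresh random key; return (key, ciphertext)."""
    key = secrets.token_bytes(32)
    return key, xor_with_keystream(chunk, key)


chunk = b"example chunk contents" * 1000

# Inserting the plaintext twice yields the same, predictable key.
same_key_twice = content_key(chunk) == content_key(chunk)

# Encrypting with a new random key per insert yields unrelated keys each time.
key1, ciphertext1 = encrypt_for_insert(chunk)
key2, ciphertext2 = encrypt_for_insert(chunk)
keys_differ = content_key(ciphertext1) != content_key(ciphertext2)
```

The random key then has to be published along with the chunk announcement
so that downloaders can decrypt.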

> We subscribe to every key needed to learn about updates (that is:
> ssk/filesharing/updates/inserts and ssk/filesharing/updates/requests)
> 
> Search :
> 
> Each person publishes all the files they are sharing in the 
> ssk/filesharing/index file.
> 
> When you search for a file, you look in each identity's index to find it. 
> You then have a list of filenames and corresponding filehashes.
> Once you have this, you can choose to download a specific filehash.
> 
> #Everyone who has this file can begin to insert some chunks of it and 
> announce that they are currently inserting part of the file. The chunks are 
> chosen randomly, so multiple people can insert the whole file faster.
> #When a chunk is inserted, the SSK key of the chunk is published, and we can 
> begin to download it.

They only publish the SSK after they have finished inserting the chunk? Ok.
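
The random chunk selection could be sketched like this (Python; the function
name and the two sets are hypothetical, standing for what a client has read
from the other identities' inserts and filehash index files):

```python
import random
from typing import Optional, Set


def pick_chunk_to_insert(total_chunks: int,
                         claimed: Set[int],
                         available: Set[int],
                         rng: random.Random) -> Optional[int]:
    """Pick a random chunk that nobody is inserting and that has no SSK yet.

    claimed   -- chunk numbers other identities announced they are inserting
    available -- chunk numbers whose SSK has already been published
    Returns None when every chunk is either claimed or available.
    """
    candidates = [n for n in range(total_chunks)
                  if n not in claimed and n not in available]
    if not candidates:
        return None
    return rng.choice(candidates)


# Usage: 5 chunks, chunks 0 and 1 are being inserted, chunk 2 is published.
rng = random.Random()
picked = pick_chunk_to_insert(5, claimed={0, 1}, available={2}, rng=rng)
```

Two inserters can still pick the same chunk before either sees the other's
announcement, so this only reduces duplicate inserts; it does not rule them
out.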
> 
> Download:
> 
> Once you have found the file you want, you search for shared chunks in the 
> keys of the identities sharing the file.
> If several SSKs are available for one chunk, choose the one that appears 
> most often (during the search, when you see sskY for chunkXXX, just add 1 
> to the priority of sskY).
> 
> If no SSK is available for a chunk, add a request in 
> ssk/filesharing/requests
> 
> We subscribe to the key ssk/filesharing/filehash/ to learn about all newly 
> available chunks, ...
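
The "add 1 to the priority" rule amounts to a per-chunk vote count. A
minimal sketch in Python, assuming the announcements have already been
collected as (chunk number, SSK) pairs; the function names and that input
shape are my assumptions:

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def choose_ssk_per_chunk(announcements: Iterable[Tuple[int, str]]) -> Dict[int, str]:
    """For each chunk, pick the SSK announced by the most identities."""
    votes: Dict[int, Counter] = {}
    for chunk_number, ssk in announcements:
        # One announcement adds 1 to that SSK's priority for that chunk.
        votes.setdefault(chunk_number, Counter())[ssk] += 1
    return {chunk: counter.most_common(1)[0][0] for chunk, counter in votes.items()}


def missing_chunks(total_chunks: int, chosen: Dict[int, str]) -> List[int]:
    """Chunks with no announced SSK; these would go into the requests file."""
    return [n for n in range(total_chunks) if n not in chosen]


# Usage: three identities announce chunk 0, one announces chunk 1, none chunk 2.
seen = [(0, "sskA"), (0, "sskB"), (0, "sskA"), (1, "sskC")]
chosen = choose_ssk_per_chunk(seen)
to_request = missing_chunks(3, chosen)
```

A plain count treats every identity equally; weighting each vote by the
identity's WoT trust would be a natural variation.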
> 
> Insertion:
> 
> If someone is requesting a chunk (see Download), we start inserting it.
> We publish that under the key ssk/filesharing/inserts, so other 
> identities won't insert it.
> When it's finished, we indicate that in the same file.
> 
> Healing :
> 
> When you try to download a file that was already inserted and can't 
> download a specific chunk, a request is published asking for this chunk. 
> People who have the original file can begin to insert this chunk and tell 
> the others that they have begun inserting it, in order to avoid multiple 
> inserts of the same chunk. Then the new SSK is published so the missing 
> chunk can be downloaded.
> 
> WoT:
> 
> If someone gives us a wrong chunk or a fake, we mark it as bad. One 
> question, however: how can we detect a wrong chunk?
> If we have multiple sources for the file, we can try to compare the chunk 
> filehash index (ssk/filesharing/filehash/index).

Yes, you need an overall hash of the file contents, and maybe for the chunks 
too, assuming they are big enough.
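
A minimal sketch of that check, in Python with SHA-256 (the hash choice, the
manifest layout and the function names are assumptions for illustration;
the 1 MB default follows the chunk size guessed at earlier in this mail):

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 1024 * 1024  # assumed chunk size of 1 MB


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut the file into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def build_manifest(data: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[str, object]:
    """Overall file hash plus one hash per chunk, as published by a sharer."""
    return {
        "file_hash": hashlib.sha256(data).hexdigest(),
        "chunk_hashes": [hashlib.sha256(c).hexdigest()
                         for c in split_into_chunks(data, chunk_size)],
    }


def verify_chunk(manifest: Dict[str, object], chunk_number: int, chunk: bytes) -> bool:
    """True if a downloaded chunk matches the hash listed for that position."""
    hashes = manifest["chunk_hashes"]
    return (0 <= chunk_number < len(hashes)
            and hashlib.sha256(chunk).hexdigest() == hashes[chunk_number])


def verify_file(manifest: Dict[str, object], chunks: List[bytes]) -> bool:
    """True if the reassembled chunks match the overall file hash."""
    return hashlib.sha256(b"".join(chunks)).hexdigest() == manifest["file_hash"]


# Usage with a small chunk size so the example stays tiny.
data = bytes(range(256)) * 10
manifest = build_manifest(data, chunk_size=1000)
chunks = split_into_chunks(data, chunk_size=1000)
```

This only detects a chunk that disagrees with a manifest you already trust,
so the manifest itself still has to come from a trusted identity or from
agreement between several sources.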
> 
> 
> Advantages of splitting the file:
>  - We can have multiple sources inserting the whole file
>  - If some chunks are no longer available, we can ask for only those chunks 
> to be reinserted. This will limit datastore use (no need to reinsert the 
> whole file under a new SSK)

These are advantages of selective reinsertion, which can be implemented over 
FCP with normal keys.

>  - We can begin to download the file before all the chunks are inserted. This 
> is a little faster for the people who are downloading.

This is true. It can be safe if the CHKs that make up each chunk are encrypted 
with a random key and are therefore not detectable until that chunk is 
announced.

>  - Using some preview system ?

Maybe.
> 
> WoT-based :
>  - Using WoT can limit the bootstrap problem when you have just 
> installed the filesharing program. 
>  - Using the security of WoT can help to find people who are publishing 
> fake files.
> 
> --------------------------------------------------------------------------------------------------------
> 
> Please don't hesitate to comment. Are we missing some points, is this 
> feasible, will it be too slow, etc...


