On Tuesday 12 May 2009 21:26:53 Ximin Luo wrote:
> Matthew Toseland wrote:
> > Is it a good idea to use MD5? I guess you're using it the same way that 
> > XMLLibrarian does, but it may be more of a problem for your application?
> 
> you mean collisions? md5 has a 128-bit output, so by the birthday bound an
> accidental collision is only expected after around 2^64 distinct keywords.
> that is every possible word of length 4 assuming 2^16 possible letters, or of
> length 8 with 2^8 possible letters... it seems ok, but maybe sha1 is safer, then.

What about deliberate collisions? Could they cause you to use more memory or 
anything like that? md5 is broken...
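
For reference, the accidental-collision arithmetic can be checked with a few
lines of Python. This is only a sketch of the standard birthday approximation;
the function names are made up for illustration. It says nothing about
deliberate collisions, which are the real concern with MD5.

```python
import math

HASH_BITS = 128  # MD5 digest size in bits


def birthday_threshold_log2(bits: int) -> float:
    """log2 of the rough number of random inputs at which a collision becomes likely."""
    return bits / 2


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance of at least one accidental collision among n_items hashes."""
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))
    exponent = -(n_items * (n_items - 1)) / float(2 ** (bits + 1))
    return -math.expm1(exponent)


def keyspace_log2(alphabet_bits: int, word_length: int) -> int:
    """log2 of the number of distinct words of the given length."""
    return alphabet_bits * word_length


if __name__ == "__main__":
    print(birthday_threshold_log2(HASH_BITS))       # threshold as a power of two
    print(keyspace_log2(16, 4), keyspace_log2(8, 8))  # the two keyspaces from the mail
    print(collision_probability(10 ** 9, HASH_BITS))  # a billion keywords
```

Both keyspaces in the quoted estimate come out at 2^64 words, which matches the
birthday threshold for a 128-bit hash.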
> 
> in any case, the actual plain keyword would be stored with each index so
> collisions could be detected.
> 
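
A minimal sketch of what "store the plain keyword with each index" could look
like, assuming a simple in-memory map. KeywordIndex and its methods are
hypothetical names, not the actual classes in the code under discussion. It
also shows where deliberate collisions would cost memory: every colliding
keyword adds an entry to the same bucket.

```python
import hashlib


def md5_hex(keyword):
    """Hex MD5 digest of a keyword, encoded as UTF-8."""
    return hashlib.md5(keyword.encode("utf-8")).hexdigest()


class KeywordIndex:
    """Hypothetical index keyed by hash(keyword), keeping the plain keyword alongside."""

    def __init__(self, hash_func=md5_hex):
        self._hash = hash_func
        self._buckets = {}  # digest -> {plain keyword -> list of URIs}

    def add(self, keyword, uri):
        bucket = self._buckets.setdefault(self._hash(keyword), {})
        bucket.setdefault(keyword, []).append(uri)

    def lookup(self, keyword):
        # Comparing the stored plain keyword means a colliding keyword
        # never returns another keyword's URIs.
        bucket = self._buckets.get(self._hash(keyword), {})
        return list(bucket.get(keyword, []))

    def collisions(self):
        """Digests shared by more than one distinct plain keyword."""
        return {d: sorted(b) for d, b in self._buckets.items() if len(b) > 1}
```

Passing a deliberately weak hash_func (for example, the first letter of the
keyword) is an easy way to exercise the collision path without needing real
MD5 collisions.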
> > From your docs...
> > An IndexNode corresponds to an SSK subspace, it contains a filter (e.g.
> > bloom filter) for quickly ruling out an index based on the sought keywords,
> > and a bunch of entries, which can each either be redirects to other indexes,
> > or can be pointers to files.
> 
> what's a "subspace" - the entire first "directory" of an SSK, 

Yes.

> or any subdirectory at any level? (i'm not familiar with the terminology here,
> sorry)
> an IndexTree is the former; an IndexNode is the latter.
> 
> yes, entries can be redirects to other indexes, or point to files that contain
> the actual index data = {(keyword, freenetURI that matches it, other relevant
> information)*}. is this what you meant by files? they *don't* point to
> non-index files of content.

Hmmm, ok.
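
To make sure I have the structure right, here is a rough sketch of a node as
described: a filter over the keywords reachable from it, plus entries that are
either redirects to other nodes or files of index data. The class layout, the
filter parameters and the "CHK@a" style URIs are all placeholders, not taken
from the real code.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; size and hash count are arbitrary illustrative values."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class IndexNode:
    """Hypothetical node: a filter plus redirect entries and file entries."""

    def __init__(self):
        self.filter = BloomFilter()
        self.redirects = []  # child IndexNode objects
        self.files = []      # each file: dict of keyword -> list of URIs

    def add_file(self, data):
        self.files.append(data)
        for keyword in data:
            self.filter.add(keyword)

    def add_redirect(self, child):
        # Union of filters (same parameters assumed). This goes stale if the
        # child gains keywords later; a real design would need to handle that.
        self.redirects.append(child)
        self.filter.bits |= child.filter.bits

    def search(self, keyword):
        if not self.filter.might_contain(keyword):
            return []  # ruled out without looking at any entry
        hits = []
        for data in self.files:
            hits.extend(data.get(keyword, []))
        for child in self.redirects:
            hits.extend(child.search(keyword))
        return hits
```

The point of the filter is the early return in search(): a node whose filter
rejects the keyword never has its entries fetched at all.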
> 
> > What does it mean to inflate or deflate the index?
> 
> inflate = REQ the relevant data from freenet, and use it to build internal data
> structure
>
> deflate = INS the internal data structure into freenet. for the SSKSerialiser,
> SSK/USKs can't be partially updated without updating the whole subspace, (or so
> i thought), which is why token-deflate throws UnsupportedOperationException.

Ok.
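
So, as I understand it, the serialiser contract looks roughly like the sketch
below. This is only my reading of the description: the in-memory store stands
in for Freenet, the method names are invented, and NotImplementedError plays
the role of Java's UnsupportedOperationException.

```python
import copy


class InMemoryStore:
    """Stand-in for Freenet: maps a subspace name to its complete contents."""

    def __init__(self):
        self.subspaces = {}


class SSKSerialiserSketch:
    """Hypothetical whole-subspace serialiser; not the real SSKSerialiser."""

    def __init__(self, store, subspace):
        self.store = store
        self.subspace = subspace

    def inflate(self):
        """REQ: fetch the stored data and rebuild the in-memory structure."""
        return copy.deepcopy(self.store.subspaces.get(self.subspace, {}))

    def deflate(self, index):
        """INS: write the whole structure back, replacing the subspace."""
        self.store.subspaces[self.subspace] = copy.deepcopy(index)

    def deflate_token(self, index, keyword):
        """Partial update of one token; unsupported when the subspace is the unit of insert."""
        raise NotImplementedError("a subspace can only be updated as a whole in this sketch")
```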
> 
> (one way of storing it which would allow token-deflate would be having each
> indexnode as a CHK, then you'd only have to INS an updated node and all its
> parents up to the root, but i chose not to do this as CHKs have a higher limit
> for being turned into a splitfile. was this the right decision?)

Or you could store them as separate SSKs, but I wouldn't recommend it. SSKs
can have any name after the slash. But inserting it all at once adds
redundancy etc., so it's generally a good idea.
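
To illustrate the CHK-per-node alternative: if each node's key is derived from
its content, and a parent's content includes its children's keys, then changing
one node changes the key of every ancestor up to the root. The sketch below
uses a toy SHA-256 hash as a stand-in; real CHKs are computed differently, and
the Node class is hypothetical.

```python
import hashlib
import json


def chk_like_key(payload):
    """Toy content hash standing in for a CHK (not the real CHK algorithm)."""
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:16]


class Node:
    """Hypothetical index node whose key depends on its data and its children's keys."""

    def __init__(self, name, data=None, children=None):
        self.name = name
        self.data = data or {}
        self.children = children or []
        self.parent = None
        for child in self.children:
            child.parent = self

    def key(self):
        return chk_like_key({
            "data": self.data,
            "children": [child.key() for child in self.children],
        })


def nodes_to_reinsert(node):
    """Names of the nodes that must be INSerted again after `node` changes."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path
```

Siblings off the updated path keep their keys, so only the path from the
changed node to the root would need inserting again.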

_______________________________________________
Devl mailing list
[email protected]
http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl