On Wed, May 13, 2009 at 6:28 AM, Matthew Toseland
<[email protected]> wrote:
> On Tuesday 12 May 2009 21:26:53 Ximin Luo wrote:
>> Matthew Toseland wrote:
>> > Is it a good idea to use MD5? I guess you're using it the same way that
>> > XMLLibrarian does, but it may be more of a problem for your application?
>>
>> you mean collisions? with md5 the expected rate of collisions is around 1 in
>> 2^64, so that would give an average word length of 4 assuming 2^16 possible
>> letters, or word length of 8 with 2^8 possible letters... it seems ok, but
>> maybe sha1 is safer, then.
>
> What about deliberate collisions? Could they cause you to use more memory or
> anything like that? md5 is broken...
2^64 is still much larger than our bloom filter size.
This is just a performance trick, not a security measure.
I can't see why we can't use MD5 here.
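To make the point concrete, here is a rough sketch of how a bloom filter could derive its bit positions from a single MD5 digest. The class name Md5BloomFilter, the size/hashes parameters, and the double-hashing scheme (position i = h1 + i*h2 mod size) are my own assumptions for illustration, not what XMLLibrarian or the index code actually does. Since the positions are reduced modulo the filter size anyway, the digest's collision resistance is not what limits the false-positive rate.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

/** Hypothetical sketch: a bloom filter whose probes come from one MD5 digest. */
final class Md5BloomFilter {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // number of probes per keyword

    Md5BloomFilter(int size, int hashes) {
        if (size <= 0 || hashes <= 0) {
            throw new IllegalArgumentException("size and hashes must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private static byte[] md5(String keyword) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(keyword.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Reads 8 bytes, big-endian, starting at offset. */
    private static long readLong(byte[] b, int offset) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[offset + i] & 0xffL);
        }
        return v;
    }

    /** Assumed scheme: split the 128-bit digest into two longs and double-hash. */
    private int[] positions(String keyword) {
        byte[] d = md5(keyword);
        long h1 = readLong(d, 0);
        long h2 = readLong(d, 8);
        int[] out = new int[hashes];
        for (int i = 0; i < hashes; i++) {
            out[i] = (int) Math.floorMod(h1 + i * h2, (long) size);
        }
        return out;
    }

    void add(String keyword) {
        for (int p : positions(keyword)) {
            bits.set(p);
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(String keyword) {
        for (int p : positions(keyword)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }
}
```

A deliberate collision would at worst make two keywords share the same bits, which costs an extra fetch when the filter answers "maybe" for the wrong index. It does not grow the filter.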
>> in any case, the actual plain keyword would be stored with each index so
>> collisions could be detected.
>>
>> > From your docs...
>> > An IndexNode corresponds to an SSK subspace, it contains a filter (e.g. bloom
>> > filter) for quickly ruling out an index based on the sought keywords, and a
>> > bunch of entries, which can each either be redirects to other indexes, or can
>> > be pointers to files.
>>
>> what's a "subspace" - the entire first "directory" of an SSK,
>
> Yes.
>
>> or any
>> subdirectory at any level? (i'm not familiar with the terminology here, sorry)
>> an IndexTree is the former; an IndexNode is the latter.
>>
>> yes, entries can be redirects to other indexes, or point to files that contain
>> the actual index data = {(keyword, freenetURI that matches it, other relevant
>> information)*}. is this what you meant by files? they *don't* point to
>> non-index files of content.
>
> Hmmm, ok.
>>
>> > What does it mean to inflate or deflate the index?
>>
>> inflate = REQ the relevant data from freenet, and use it to build internal data
>> structure
>>
>> deflate = INS the internal data structure into freenet. for the SSKSerialiser,
>> SSK/USKs can't be partially updated without updating the whole subspace, (or so
>> i thought), which is why token-deflate throws UnsupportedOperationException.
>
> Ok.
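For readers following along, the inflate/deflate split described above might look roughly like this. It is only a sketch: SerialiserSketch, its method names, and the in-memory map standing in for Freenet are all hypothetical, and the real SSKSerialiser interface may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of whole-structure inflate/deflate; not the real SSKSerialiser. */
final class SerialiserSketch {
    /** Stand-in for the network: key -> stored index data (keyword -> URI). */
    private final Map<String, Map<String, String>> store = new HashMap<>();

    /** deflate: INS the whole in-memory structure under one key. */
    void deflate(String key, Map<String, String> index) {
        store.put(key, new HashMap<>(index));
    }

    /** inflate: REQ the data and rebuild the in-memory structure from it. */
    Map<String, String> inflate(String key) {
        Map<String, String> data = store.get(key);
        if (data == null) {
            throw new IllegalArgumentException("nothing stored under " + key);
        }
        return new HashMap<>(data);
    }

    /** token-deflate: updating one entry alone is refused, as in the thread. */
    void deflateToken(String key, String keyword, String uri) {
        throw new UnsupportedOperationException(
                "partial update not supported; deflate the whole subspace instead");
    }
}
```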
>>
>> (one way of storing it which would allow token-deflate would be having each
>> indexnode as a CHK, then you'd only have to INS an updated node and all its
>> parents up to the root, but i chose not to do this as CHKs have a higher limit
>> for being turned into a splitfile. was this the right decision?)
>
> Or you could store them as separate SSKs, but I wouldn't recommend it. SSKs
> can have any name after the slash. But inserting it all at once adds
> redundancy etc, it's generally a good idea.
>
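On the CHK alternative: a CHK is derived from the content, so changing a node changes its key, and every parent that stores that key changes too. A minimal sketch of which nodes would need re-inserting, with ChkNodeSketch and nodesToReinsert as made-up names (nothing here is from the actual code):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical index node that knows only its parent (null for the root). */
final class ChkNodeSketch {
    final String name;
    final ChkNodeSketch parent;

    ChkNodeSketch(String name, ChkNodeSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    /**
     * Nodes that would need a fresh INS when this node's content changes:
     * the node itself, then each ancestor up to and including the root.
     */
    List<ChkNodeSketch> nodesToReinsert() {
        List<ChkNodeSketch> path = new ArrayList<>();
        for (ChkNodeSketch n = this; n != null; n = n.parent) {
            path.add(n);
        }
        return path;
    }
}
```

Siblings off that path keep their keys, so the cost of an update is proportional to the depth of the tree rather than its size.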