Matthew Toseland wrote:
> Is it a good idea to use MD5? I guess you're using it the same way that
> XMLLibrarian does, but it may be more of a problem for your application?
you mean collisions? md5 is a 128-bit hash, so by the birthday bound you'd
expect the first accidental collision only after about 2^64 distinct keywords.
that covers every possible word of length 4 assuming 2^16 possible letters, or
of length 8 with 2^8 possible letters... it seems ok, but maybe sha1 is safer,
then.
in any case, the actual plain keyword would be stored with each index so
collisions could be detected.
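
a rough sketch of what i mean by detecting collisions, in case it helps. the
class and method names here (KeywordIndex, md5Hex, put, matches) are made up for
illustration and are not the actual XMLLibrarian or index code; it only assumes
the keyword is hashed with md5 and the plain keyword is kept next to the hash:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real index classes.
class KeywordIndex {
    // Maps the hex MD5 of a keyword to the plain keyword stored with the index.
    private final Map<String, String> keywordByHash = new HashMap<>();

    // Hex-encoded MD5 of the UTF-8 bytes of the keyword.
    static String md5Hex(String keyword) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(keyword.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Stores the keyword; fails loudly if a different keyword already
    // occupies the same hash slot (i.e. a collision).
    void put(String keyword) {
        String existing = keywordByHash.putIfAbsent(md5Hex(keyword), keyword);
        if (existing != null && !existing.equals(keyword)) {
            throw new IllegalStateException(
                "hash collision between '" + existing + "' and '" + keyword + "'");
        }
    }

    // True only if the hash slot exists AND the stored plain keyword
    // equals the sought one, so a colliding keyword is not a false hit.
    boolean matches(String sought) {
        return sought.equals(keywordByHash.get(md5Hex(sought)));
    }
}
```

the point is just that the lookup compares the stored plain keyword as well as
the hash, so a collision shows up as a mismatch instead of a wrong result.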
> From your docs...
> An IndexNode corresponds to an SSK subspace, it contains a filter (e.g. bloom
> filter) for quickly ruling out an index based on the sought keywords, and a
> bunch of entries, which can each either be redirects to other indexes, or can
> be pointers to files.
what's a "subspace" - the entire first "directory" of an SSK, or any
subdirectory at any level? (i'm not familiar with the terminology here, sorry)
an IndexTree is the former; an IndexNode is the latter.
yes, entries can be redirects to other indexes, or point to files that contain
the actual index data = {(keyword, freenetURI that matches it, other relevant
information)*}. is this what you meant by files? they *don't* point to
non-index files of content.
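
to make the structure a bit more concrete, here is a minimal sketch of a node
with a filter and a list of entries. everything in it is an assumption for
illustration: the names (Entry, IndexNode, mightContain), the 1024-bit filter
size and the two toy hash positions are not taken from the real code, and a real
bloom filter would use better hash functions:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical entry: either a redirect to another index, or a pointer
// to a file holding actual index data (never to a non-index content file).
class Entry {
    final String target;
    final boolean redirect;

    Entry(String target, boolean redirect) {
        this.target = target;
        this.redirect = redirect;
    }
}

// Hypothetical node: a tiny bloom-style filter plus a bunch of entries.
class IndexNode {
    private static final int BITS = 1024; // illustrative size only
    private final BitSet filter = new BitSet(BITS);
    final List<Entry> entries = new ArrayList<>();

    // Two toy bit positions derived from String.hashCode().
    private static int[] positions(String keyword) {
        int h = keyword.hashCode();
        return new int[] {
            Math.floorMod(h, BITS),
            Math.floorMod(h * 31 + 17, BITS)
        };
    }

    void addKeyword(String keyword) {
        for (int p : positions(keyword)) {
            filter.set(p);
        }
    }

    // false means the keyword is definitely not under this node;
    // true means it might be, so the entries are worth fetching.
    boolean mightContain(String keyword) {
        for (int p : positions(keyword)) {
            if (!filter.get(p)) {
                return false;
            }
        }
        return true;
    }
}
```

a search would call mightContain on a node first and only follow its entries
when the answer is true.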
> What does it mean to inflate or deflate the index?
inflate = REQ the relevant data from freenet, and use it to build the internal
data structure
deflate = INS the internal data structure into freenet. for the SSKSerialiser,
SSK/USKs can't be partially updated without updating the whole subspace (or so
i thought), which is why token-deflate throws UnsupportedOperationException.
(one way of storing it which would allow token-deflate would be having each
indexnode as a CHK, then you'd only have to INS an updated node and all its
parents up to the root, but i chose not to do this as CHKs have a higher limit
for being turned into a splitfile. was this the right decision?)
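
in code terms, the inflate/deflate split looks roughly like the sketch below.
this is not the real SSKSerialiser: the interface, the method signatures and the
in-memory FakeStore standing in for freenet are all assumptions made up for the
example. it only shows whole-structure deflate working while token-deflate is
refused:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the network: key -> stored data.
class FakeStore {
    final Map<String, String> data = new HashMap<>();
}

// Hypothetical interface; names and signatures are illustrative only.
interface IndexSerialiser {
    // inflate = REQ the data and hand it back for building the structure.
    String inflate(String key);

    // deflate = INS the whole structure under the key.
    void deflate(String key, String data);

    // token-deflate = INS only one part of the structure.
    void deflateToken(String key, String token, String data);
}

// Serialiser for a store where a subspace can only be updated as a whole.
class WholeSubspaceSerialiser implements IndexSerialiser {
    private final FakeStore store;

    WholeSubspaceSerialiser(FakeStore store) {
        this.store = store;
    }

    @Override
    public String inflate(String key) {
        String d = store.data.get(key);
        if (d == null) {
            throw new IllegalArgumentException("no data under " + key);
        }
        return d;
    }

    @Override
    public void deflate(String key, String data) {
        store.data.put(key, data);
    }

    @Override
    public void deflateToken(String key, String token, String data) {
        // Partial updates are not possible here, so refuse explicitly.
        throw new UnsupportedOperationException(
            "cannot update part of a subspace; deflate the whole structure");
    }
}
```

a CHK-per-node serialiser would implement deflateToken instead of throwing, at
the cost of re-inserting the changed node and each of its parents up to the
root.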
X