On Fri, 04 Aug 2000, Travis Bemann wrote:
> On Fri, Aug 04, 2000 at 02:08:53PM +0700, Oskar Sandberg wrote:
> > The easy part of the DataStore is the lookups. Each key (20 bytes) entry
> > needs to have an address (4 bytes) of another node and a reference to
> > where the actual data is stored (not sure, but to be safe say 16 bytes).
> > That is 40 bytes per entry. In other words, an index for a currently
> > standard node holding 500 entries is 20 kB. An index for an extreme node
> > holding a million entries is only 40 MB, still small enough to keep in
> > memory, and certainly not an issue on disk. Your half-gigabyte index
> > file that "will happen" contains 12.5 million entries. I would like to
> > contend that that will _not_ happen, and that is not because I think
> > "Freenet will never get that big", but for the same reason you don't see
> > that many ADT tree structures with 12.5 million branches per level.
>
> The half-gigabyte index file was just hypothetical. I never said that
> it would really happen.
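(As a quick sanity check on the sizing figures quoted above, here is a back-of-the-envelope sketch. The 40-byte entry layout comes straight from the mail; the function name is just for illustration.)

```python
# Index sizing check, assuming the entry layout from the mail:
# 20-byte key + 4-byte node address + 16-byte data reference.
ENTRY_BYTES = 20 + 4 + 16  # 40 bytes per index entry

def index_size(entries: int) -> int:
    """Total index size in bytes for a given number of entries."""
    return entries * ENTRY_BYTES

print(index_size(500))               # typical node: 20,000 bytes (20 kB)
print(index_size(1_000_000))         # extreme node: 40,000,000 bytes (40 MB)
print(500_000_000 // ENTRY_BYTES)    # half-gigabyte index: 12,500,000 entries
```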
Umm, I hate to nitpick, but yes you did. The quote is: "yes, this is an
extreme example, but it *will* happen."

> > However, that is NOT the problem with the DataStore design. Looking up
> > data in a list is probably the oldest problem in CS, and we could
> > probably find a million examples of people having done it before. The
> > hard part in the DataStore is the routing: being able to find the
> > "closest" key and its reference, as well as the nth-closest should that
> > fail. This is not a case where I am aware of much prior art (I've
> > wondered a bit about spell checkers; if anybody knows anything else I
> > would be very interested).
>
> As for closeness, is the purpose of this to choose which node to send a
> request to when it could not be served locally? I looked in the source
> code, and it appears that "closeness" is literally how close the keys
> are to each other, not some algorithm to determine which nodes are most
> likely to have the data available on them. Why not just keep track of
> nodes and their likelihood of having data available on or through them,
> and try nodes in descending order of reliability? That is what I was
> planning to do with nfreenetd. Of course, such a node-choosing algorithm
> would result in nodes with large datastores and many connections to
> other nodes getting much heavier use than other nodes.

Do you have a clue what you are trying to implement? Have you even read
Theodore's or Ian's original paper?

> --
> Travis Bemann
> Sendmail is still screwed up on my box.
> My email address is really bemann at execpc.com.

--
\oskar

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
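[Editor's note: to illustrate the "closest key, then nth-closest" routing problem discussed in the thread above, here is a minimal sketch. This is NOT Freenet's actual algorithm; the `closest_keys` function and the use of plain integers in place of 20-byte keys are purely illustrative assumptions: keep the index keys sorted, binary-search for the requested key, and walk outward to enumerate keys in order of distance.]

```python
import bisect

def closest_keys(sorted_keys, target):
    """Yield index keys in increasing order of distance from target.

    sorted_keys must be sorted ascending. The first key yielded is the
    closest; subsequent keys are the 2nd-closest, 3rd-closest, and so on,
    which is exactly the fallback order routing needs if a request fails.
    """
    i = bisect.bisect_left(sorted_keys, target)
    lo, hi = i - 1, i  # candidates on either side of the insertion point
    while lo >= 0 or hi < len(sorted_keys):
        if lo < 0:
            yield sorted_keys[hi]; hi += 1
        elif hi >= len(sorted_keys):
            yield sorted_keys[lo]; lo -= 1
        elif target - sorted_keys[lo] <= sorted_keys[hi] - target:
            yield sorted_keys[lo]; lo -= 1
        else:
            yield sorted_keys[hi]; hi += 1

keys = [10, 20, 40, 80]
print(list(closest_keys(keys, 35)))  # [40, 20, 10, 80]
```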
