I'm a little drunk, so forgive me as I slur.

On Tue, 09 May 2000, Ian Clarke wrote:
> [reader's note: I have a proposal taking into account some of the
> concerns which Oskar raises, please scroll down to the bottom (past my
> bickering with LDC) to read it]
> 
> > I had to go back and look up this thread because it was mistitled,
> > but I don't buy your argument at all.  Contrary to your assertion,
> > Oskar did in fact address your points directly, and convincingly.
> 
> Yes, I have just read his answer, but it is your opinion that he
> addressed my points convincingly, not mine!

But in your reply to my post you cut off everything except two lines. You take
up some more below, but you still haven't addressed my main defense of why it
works, namely that the propagation of the update is exactly like the propagation
of the newly inserted data.

> > If others didn't respond, maybe it was because we know that Oskar
> > has more experience and a reputation for being right.
> 
> Er, more experience than who?  I started this project in September 1998,
> 9 months before anyone else did - it is hard to get more experience than
> that!  Yes Oskar is frequently right, but he is occasionally wrong too. 
> This is true of most of the core developers.  I prefer to base my
> opinions on facts rather than reputation.  

In his own opinion, Oskar is never wrong about anything. He may very well be
wrong about that though :-))).

> As I will outline towards the end of this email, I think Oskar does make
> some good points, but I think his proposal avoids the tough issues which
> my proposal tackled head-on making it vulnerable to attack (remind
> anyone of the KHK/CHK debate? ;)  Oskar's suggestion about sending the
> update to 10-15 nodes is dubious at best, where do we get these nodes
> from?  The datastore?  But the DataStore is likely to be filled with
> nodes that are already in our part of information space, so they will
> probably just take the same route to the "epi-center" of the data
> anyway.

a) I have never avoided any of your arguments. Nobody is perfect, but I believe
myself to be above simply avoiding objections because I cannot answer them.
What I want is for Freenet to work as well as possible, if my ideas are flawed
(and like you noted, I have of course had flawed ideas, though I try to filter
most myself before posting) then out they go. 

If anything, what reminds me of the CHK debate is that I answer and answer
and answer to the best of my ability, and you still accuse me of avoiding
answering. I truly don't know what to do to satisfy you in this regard.

b) I think you misunderstand me. The Update gets sent to 10-15 nodes because it
is sent just like a normal InsertRequest with an HTL of (I guess) around 10-15.
If no follow-through request for the data can reach it when it has gone this far,
then how do requests find newly inserted data, which also traveled 10-15 nodes
using standard Request routing?
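To make that concrete, here is a toy sketch in Python of what I mean. All the
names here (Node, closest_neighbor, route_update) are invented for illustration,
not actual Freenet code: the point is just that the update follows the same
HTL-limited path an InsertRequest would.

```python
# Hypothetical sketch: an update routed like an InsertRequest, decrementing
# a hop-to-live counter and following the same neighbor choice a normal
# Request for the key would make.  Not Freenet code; names are illustrative.

class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}        # key -> data, the node's DataStore
        self.neighbors = []

    def closest_neighbor(self, key):
        # Stand-in for real routing: pick the neighbor believed closest
        # to `key` in keyspace.  Here we simply walk a chain.
        return self.neighbors[0] if self.neighbors else None

def route_update(node, key, new_data, htl=15):
    """Route an update exactly like an InsertRequest with an HTL of ~10-15."""
    if key in node.store:
        node.store[key] = new_data   # overwrite the stale cached copy
    if htl <= 1:
        return                       # traveled as far as a fresh insert would
    nxt = node.closest_neighbor(key)
    if nxt is not None:
        route_update(nxt, key, new_data, htl - 1)
```

On a chain of nodes with an HTL of 15, the update refreshes the first 15
caches and then dies out, exactly like a fresh insert would.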

> >  I'm not
> > entirely convinced that the idea will work well
> 
> Haven't you just contradicted your earlier assertion that Oskar is
> always right? ;-)

I'm far from convinced that Freenet will work at all. I like to say that the
reason I am here is that I am not convinced that it won't. Nor am I convinced
that CHKs, SVKs, follow-throughs, or any of the other ideas I have campaigned
for will work - I just find them more likely than the alternatives.

Being convinced of anything (except one's values and ideals) is a great fallacy.

<snip> 
> The problem with yours/Oskar's proposal is that I don't see how the
> updates will be propagated to a sufficient number of nodes so that even
> where requests can "slip through" nodes where they would normally be
> answered, they eventually get to the updated version of the data.  In
> fact, unless I am missing something, this isn't really addressed by
> Oskar's proposal at all!  This is the issue that my idea addresses, and
> moreover, I can't think of any more sensible way to do it.  Your idea is
> not without merit, however, read on and I will outline a compromise.

Say I make some new data using a normal standard KHK scheme. It just so happens
that the hash of the keyword is very close to another key on Freenet, they share
the first 10 digits of 40 digits, and so references for that data will always be
closer to the new key than any other references. 

Now I send an insert. The insert for my data will be routed exactly like a
request for the key that hashed so close, only because they are not identical
(just closer than anything else) it will not be stopped when it actually finds
that data. 

My insert of course contained brilliant data. So it becomes famous. Everybody
rushes to request my data. Requests for my data get routed exactly like
requests for the data with the close key, except that since they are not
identical, they will not stop when they find the data for that other key. Will
my new data be
found by the people Requesting it?

If the answer is no, then Freenet is in trouble, because all data will have
some other document that is closer to it than any other.

If the answer is yes, then the follow-through requests will find the updated
data, because they route with respect to the key exactly like requests for this
new data in my example would with respect to the very close key.
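To pin down the routing rule this example leans on, here is a toy model.
Everything in it (key_distance, best_distance, the integer metric) is my
invention for illustration, not Freenet's actual metric: requests steer by key
closeness at every hop, but terminate only on an exact match, so a near-miss
key drags the request along without stopping it.

```python
# Toy model of closest-key routing with exact-match termination.
# Not Freenet code; the metric and class names are illustrative.

def key_distance(a, b):
    # Toy metric: treat hex keys as integers.  Freenet's real metric differs.
    return abs(int(a, 16) - int(b, 16))

class Node:
    def __init__(self):
        self.store = {}        # key (hex string) -> data
        self.neighbors = []

    def best_distance(self, key):
        # Distance from `key` to the closest key this node stores.
        return min((key_distance(key, k) for k in self.store),
                   default=float("inf"))

def handle_request(node, key, htl=15):
    """Route by key closeness each hop; stop only on an exact key match."""
    if key in node.store:
        return node.store[key]            # identical key: request answered
    if htl <= 1 or not node.neighbors:
        return None
    nxt = min(node.neighbors, key=lambda n: n.best_distance(key))
    return handle_request(nxt, key, htl - 1)
```

A request for a key that merely *resembles* a stored key sails past the node
holding the near-miss and keeps going until it hits the exact key, which is the
behavior my example assumes.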

> >  Secondly, you assert
> > that this will interfere with the existing caching mechanism.
> > This is so far from the truth I can't imagine how you can say it
> > with a straight face. 
> 
> You would be amazed by what I can say with a straight face.  The caching
> mechanism depends on caches actually answering requests when they
> receive a request which is locally cached - if this doesn't happen, then
> the caching mechanism will break down.

But it does answer the Request, it just performs a very light "make sure there
is no newer data within reach" operation on certain requests. Yes this leads to
more messages, and yes I am not entirely happy about that, but most of the
important aspect of caching, that the DATA does not get sent further than it
has to (this is of course not just a bandwidth issue, but also a DataStore
issue), is still preserved. 
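Roughly, in invented Python (Doc, probe_for_newer, answer_request are
illustrative names, and the probe here naively walks neighbors where the real
thing would route by key): the cache still answers, but first does the light
newer-version check, and the data body only travels when something newer
actually exists.

```python
# Sketch of the follow-through check: answer from cache, but first do a
# lightweight "is there newer data within reach" probe.  Names invented.
from dataclasses import dataclass

@dataclass
class Doc:
    version: int
    body: str
    updatable: bool = True

class Node:
    def __init__(self):
        self.store = {}        # key -> Doc
        self.neighbors = []

    def probe_for_newer(self, key, version, htl=3):
        """Light check: only a key and version number travel outward;
        a data body comes back only if a strictly newer version exists."""
        if htl <= 0:
            return None
        best = None
        for n in self.neighbors:
            doc = n.store.get(key)
            if doc is not None and doc.version > version:
                if best is None or doc.version > best.version:
                    best = doc
            deeper = n.probe_for_newer(key, version, htl - 1)
            if deeper is not None and (best is None
                                       or deeper.version > best.version):
                best = deeper
        return best

    def answer_request(self, key):
        cached = self.store.get(key)
        if cached is None:
            return None                    # would normally forward the Request
        if cached.updatable:
            newer = self.probe_for_newer(key, cached.version)
            if newer is not None:
                self.store[key] = newer    # refresh the cache, then answer
                return newer
        return cached                      # DATA never travels further than needed
```

The extra messages are the small probe, not the data, which is the trade-off I
am describing above.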

<snip>
> > I also agree with Oskar that updatable documents ought to be small
> > to further minimize any possible impact of these extended fetches
> > and inserts; making them all empty redirects to CHK documents is a
> > great way to do that, so we can even dispense with message bodies
> > entirely if we want to.
> 
> I agree here too - once we have implemented CHKs!

Even if you put the data in the SVK-indexed data, it still uses everything
that CHKs use. Since CHKs are simpler, I would very much recommend we implement
them first to get a feel for the issues.

> Ok, here is my proposal - which will hopefully placate Oskar and LDC.

Sorry.. :-)

> I still think some form of *constrained* broadcast is essential to the
> initial propagation of the update - it is not sufficient to simply
> update one node, or just the nodes in a direct line between you and
> whatever node is closest to the data in information space, a shotgun
> approach is essential here.

See above.

> The DataUpdate should be routed like a data request, until it reaches a
> node which stores the data.  The DataUpdate will continue past this
> (until its HTL runs out) but will also spawn a "virus" version of a
> DataUpdate which will be sprayed out to surrounding nodes with a very
> low HTL.  These nodes will only forward the dataupdate if they
> themselves have the data being updated.  This prevents the huge
> explosion of messages which Oskar fears.

I still don't think this sort of "constrained explosive" routing will work
downstream. Having cached the data is simply not equivalent to having a link
from the epicenter. Why should it be?

I think you get stuck at an unholy compromise between not working and causing
too many messages.

> Updatable data should also be supplied with a timeout.  This means that
> when data is updated, locally cached versions of the data are likely to
> die out allowing the requests to get through to the "core" nodes which
> should be holding the updated data.  This addresses the same problem
> which I think Oskar was worried about when he made his proposal and is
> related to his "update-by" suggestion.

The update-by field was meant to be meta-data in the content so as to give
clients an idea of when they need to do follow-throughs.

I think that having such a field in the data on the network that actually
kills off the data is horrible. Not only does it really hurt usability that you
have to know exactly when you will update, but it has major sustainability
issues. Say "Dissident X" runs a Freenet page about how bad Regime Y is.
Because this is updated weekly, he needs to set it to die every week in nodes
that cache it. 

But then Dissident X gets caught and shot by Regime Y (it wasn't Freenet's fault,
his woman ratted him out!) While his loss is a sad thing, at least we want to
make sure that it does not mean that his famous page with info about Regime Y
disappears too!

> Howzat?

I will take the appropriate step and make the same modification to my own
proposal, but one that does not involve the pitfalls of actually killing off
data on the network.

We use a deep/follow-through request system. However, data contains a storable
field that gives the period during which we are sure we will NOT get another
update. During this period, even follow-through requests will terminate on
finding the data. 

It still isn't perfect, since it means data updates will propagate badly if at
all during this period, but not being able to update for a while is a lot better
than the data dying if you are unable to update.
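Sketched with invented names (stable_until standing in for the storable field):
the node serves the cached copy outright while the publisher's no-update
promise holds, and only falls back to the light probe afterwards. Nothing ever
kills the data.

```python
# Sketch of the stability-window rule: a follow-through request terminates
# on cached data while the publisher's no-update period lasts.
# `stable_until` and `serve` are illustrative names, not Freenet fields.
import time
from dataclasses import dataclass

@dataclass
class Doc:
    body: str
    stable_until: float    # epoch time before which no newer update can exist

def serve(store, key, now=None):
    """Decide whether a follow-through request may terminate here.
    Returns (doc, must_probe_further)."""
    now = time.time() if now is None else now
    doc = store.get(key)
    if doc is None:
        return None, True              # nothing cached: keep routing as usual
    if now < doc.stable_until:
        return doc, False              # publisher promised no update yet
    return doc, True                   # window passed: do the light probe
```

Note that when the window expires the worst case is an extra probe, not dead
data, which is the whole difference from the timeout scheme.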

> 
> Ian.
> 
-- 

Oskar Sandberg

md98-osa at nada.kth.se

#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

_______________________________________________
Freenet-dev mailing list
Freenet-dev at lists.sourceforge.net
http://lists.sourceforge.net/mailman/listinfo/freenet-dev
