On Fri, Sep 02, 2005 at 09:59:21PM +0100, Matthew Toseland wrote:
> Okay, what about ID collisions? We want to prevent collisions with
> nodes which are already dependent on us, right? Does it matter? We can
> collide on one ID, but not on the others... but it doesn't matter that
> much. Collision detection isn't vital, because we are moving in the
> right direction - towards the key. Or is it? We could well loop back
> on ourselves... we would run out of HTL, but it wouldn't be very
> profitable. There is a similar situation with request coalescing. I
> decided this wasn't a big deal a while ago; I'm not sure I remember
> why... The obvious solution is not to coalesce subscribe requests. We
> can achieve this while still using max HTL as above. The problem is
> that two such requests could be running at the same time. The answer
> to that is to determine which nodes need to be told that we are no
> longer subscribed through them atomically, at the end of the request.
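That atomic determination at the end of the request amounts to a set
difference: compare the nodes on the routes we used before with the
nodes on the routes we have now, in one step, and notify the leavers.
A minimal illustration only; `unsubscribe_targets` and the
route-as-set-of-nodes representation are hypothetical, not actual
Freenet code:

```python
def unsubscribe_targets(old_routes, new_routes):
    """Decide, in one atomic step at the end of a request, which nodes
    must be told we are no longer subscribed through them: every node
    that appeared on some previous route but on no current route.
    Each route is represented here as a set of node ids."""
    previously_used = set().union(*old_routes) if old_routes else set()
    still_used = set().union(*new_routes) if new_routes else set()
    return previously_used - still_used

# Example: we used to route via A-B and B-C, now only via B-D.
# A and C must be told; B is still in use and must not be.
```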
The problem with *that* is that different requests have been routed to
us through different chains... some of which will be rather problematic
now. So our routing could vary quite dramatically from moment to
moment...

> Ideally we would coalesce them though, especially as they're all at
> the same HTL. We may need to make coalescing (of requests and inserts
> as well as subscribe requests) slightly less transparent... we can
> hit our tail if we're not careful... we could send a Coalesced
> message back, indicating we were coalesced with a given ID, and to
> regard that ID as equal to the original ID, so to reject it if it
> loops back... this could then be sent to the other nodes on our
> request chain...
>
> On Fri, Sep 02, 2005 at 09:09:53PM +0100, Matthew Toseland wrote:
> > Actually, the mechanism proposed for routing through a node we're
> > already indirectly connected to WON'T work, because there's no
> > clear definition of dependency. There is no clear definition of
> > "the node from which I expect to receive packets". Thus if a node
> > has to reroute, it will end up sending a SubscribeRestarted to the
> > entire network.
> >
> > Now, can we make it into a tree?
> >
> > The data probably won't be inserted at the root node. We could
> > route it up to the root node though.
> >
> > The other problem is backtracking. The simple solution to this is
> > to make the RNF loops we get dependent on us, and dependent on our
> > final node. This is roughly analogous to how we deal with requests.
> > Hmmm...
> >
> > On Fri, Sep 02, 2005 at 09:00:42PM +0100, Matthew Toseland wrote:
> > > Here's another attempt at a publish/subscribe architecture.
> > > After some discussion with Ian, this should be simpler, persist
> > > even when the source node goes down, and make a start on resource
> > > limiting:
> > >
> > > TOPOLOGY
> > > --------
> > >
> > > Firstly, all of the below will probably work much better with the
> > > new backtracking mechanism.
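The Coalesced-message idea above - regard the coalesced ID as equal to
the original ID, and reject the request if it loops back under either -
could be tracked with a small equivalence table at each node. A rough
sketch; `CoalesceTable` and its methods are made-up names for
illustration, not anything from the Freenet codebase:

```python
class CoalesceTable:
    """Tracks which request IDs have been coalesced together, so that a
    request looping back under any equivalent ID can be rejected."""

    def __init__(self):
        # id -> set of equivalent ids; merged groups share one set object
        self._groups = {}

    def seen(self, rid):
        """Record a request ID in its own singleton group."""
        if rid not in self._groups:
            self._groups[rid] = {rid}

    def coalesce(self, rid, other):
        """Merge the equivalence groups of two coalesced request IDs."""
        self.seen(rid)
        self.seen(other)
        merged = self._groups[rid] | self._groups[other]
        for r in merged:
            self._groups[r] = merged

    def is_loop(self, incoming, active):
        """True if 'incoming' is equivalent to the active request's ID,
        i.e. the request has hit our tail and should be rejected."""
        return incoming in self._groups.get(active, {active})
```

This is the sense in which coalescing becomes "slightly less
transparent": the equivalence has to be propagated to the other nodes
on the request chain, or they cannot apply the same rejection rule.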
> > > There is no fixed path from the data source to the subscribe
> > > tree. Packets are routed individually, and we make no attempt to
> > > track how they were routed last time, or to keep nodes on the
> > > right path.
> > >
> > > There is a single subscribe tree. Actually, it's a subscribe
> > > graph. The reason it is a graph is backtracking. If I route a
> > > request with HTL 20 to a clump of 10 nodes, it will return with
> > > an RNF with HTL 10. I will then route it elsewhere. Thus I will
> > > have two successful routes - the data will have to be sent to
> > > both of them if I get a packet. Now, the subscribe graph is the
> > > result of very many subscriptions. Each is routed towards the
> > > key, and together they make a graph, which is somewhat treeish,
> > > but is not a tree. It therefore has no fully defined root. But
> > > data will certainly spread very quickly through it.
> > >
> > > Then, when we route a publish packet, it will also be routed
> > > towards the tree. In all likelihood it will eventually meet a
> > > node which is on the graph. The data will then be broadcast
> > > across the graph. Each node verifies it, checks that it hasn't
> > > already got it, and relays it to all directly connected nodes.
> > > Because it is a graph and not a strict tree, it is possible that
> > > the same packet would be sent to the same node twice by two
> > > different nodes. This will waste some bandwidth. However, it
> > > probably won't waste a LOT of bandwidth, because mostly a node
> > > will be told directly before it is told indirectly.
> > >
> > > So we have a persistent subscribe graph, and we have transient
> > > PublishDatas - equivalent to inserts. Actually this looks a lot
> > > like passive requests when you think about it. The difference is
> > > that we know the keys are connected, so you don't have to
> > > constantly send out requests to subscribe to the next key in the
> > > sequence. And the caching policy is completely different.
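The broadcast step described above - each node checks whether it has
already got the packet, and relays it to all directly connected nodes -
is flooding with duplicate suppression. A toy simulation over a
diamond-shaped subscribe graph, purely illustrative of the bandwidth
argument rather than of any real Freenet message flow:

```python
from collections import deque

def broadcast(graph, origin):
    """Flood a publish packet across the subscribe graph.
    graph: node -> list of directly connected nodes.
    Returns (nodes reached, messages sent on the wire)."""
    seen = {origin}        # nodes that already have this packet
    sends = 0              # every relay costs one message
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            sends += 1               # we relay to every neighbour...
            if peer not in seen:     # ...and peers drop duplicates
                seen.add(peer)
                queue.append(peer)
    return seen, sends
```

On a four-node diamond (A-B, A-C, B-D, C-D), every node is reached,
but eight messages are sent where three deliveries would suffice - the
excess is exactly the "same packet sent to the same node twice" waste
described above, bounded because a node usually hears directly before
it hears indirectly.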
> > > Each node will have a maximum of 32 streams. If we run out of
> > > streams, we reject new SubscribeRequests. So it is important that
> > > we not lose track of the streams. When a node is restarted, it
> > > should tell its neighbours which streams it is subscribed to, and
> > > tell them to disconnect from any others. This will require a
> > > special message, BulkSubscribeStreams, or something...
> > >
> > > If a node no longer has any subscribers for a given stream, it
> > > drops the stream. This might cause a cascade effect. But once one
> > > node has subscribed to a stream, there will always be at least a
> > > chain of nodes, moving outward from the start, to which the data
> > > can be sent.
> > >
> > > These arbitrary limits suggest a DoS may be possible; we will
> > > have to review this part of the architecture when we have some
> > > more experience. LRU is an option, but it would mean that streams
> > > with infrequent packets die, regardless of demand. That would be
> > > bad.
> > >
> > > Now, here's the hard part: error handling.
> > >
> > > Error handling on publish is easy. The packets just get routed
> > > differently. They end up in the same place, and they still
> > > rendezvous with the subscribe graph.
> > >
> > > BUT: error handling on subscribe is, as always, complicated.
> > >
> > > I adapt the previous proposal:
> > >
> > > Subscriptions will normally be routed similarly to requests, and
> > > will terminate on a collision, i.e. they will terminate
> > > successfully when they reach a node which has already got the
> > > stream. They can RNF, and will be rerouted in that case. Or they
> > > can run out of HTL, in which case they are regarded as
> > > successful, although it is possible that they have run into a
> > > network dead-end.
> > >
> > > Once we are subscribed, we can have various errors:
> > > - We disconnect from a node we were subscribed through.
> > >   We will have to reroute.
> > > - The locations of our neighbour nodes change so much that the
> > >   route would have changed.
> > >   We will have to reroute.
> > > - A node downstream reroutes for one of the above reasons.
> > >   We will have to reroute.
> > >
> > > Further complication: if an upstream node has to reroute, it may
> > > have to reroute through us. Therefore it must somehow tell us
> > > that it is about to do so, so that we pass the request on rather
> > > than accepting it.
> > >
> > > Simplest option first:
> > > We disconnect from a node we were subscribed through.
> > >
> > > If this was the last or only node we were subscribed through
> > > (meaning we successfully sent a subscribe request through it), it
> > > is relatively simple: we check what HTL we had left at that
> > > point, and rerun the request from that point.
> > >
> > > If the node which has broken was NOT the last node, things are
> > > more complex. We know what HTL we had left at that point. We will
> > > probably have to reroute from that point. Assuming the locations
> > > have not changed:
> > > - If the next node returned an RNF, we can use that as the next
> > >   block - just starting at a different HTL, taking the same
> > >   number of hops. Then we can route again, to a further node;
> > >   this is merely adding to the path, so is not a problem.
> > > - If the next node returned success due to running out of hops
> > >   (this needs to be a distinct mode), we will definitely need to
> > >   rerequest through it.
> > >
> > > If a node downstream loses a connection, it will have to reroute.
> > > It may then be that its RNF is no longer valid, and it has to
> > > give us one with a different HTL. We can then act on this...
> > >
> > > If the locations HAVE changed... things are even worse. If we
> > > discover that we should have routed to B then C, not C then B,
> > > and they both RNFed, this is not a problem. But if the latter
> > > succeeded due to running out of hops, this is a problem.
> > >
> > > What about HTLs?
> > > If we get a subscription with HTL 20, when we had routed at HTL
> > > 17...? If it is less than our HTL, no problem, but if it is more:
> > > - If each node on the chain RNFed, we don't need to do anything,
> > >   because the HTL was not the limiting factor; the size of the
> > >   network was.
> > > - Otherwise we may need to do some rerouting...
> > >
> > > Also, if a node upstream has to reroute, it may end up rerouting
> > > through us, and we would accept it because we have a stream
> > > already... the mechanism described in the previous mail is
> > > adequate for dealing with this though.
> > >
> > > Possible general principles:
> > > - Live requests: when we get the first subscribe request, we
> > >   create a thread to route it. We then route it. The request
> > >   succeeds, or at least we run out of nodes. We then wait for:
> > >   a) A connection to a node to go down
> > >   b) A connection to a (possibly new) node to come up
> > >   c) Any connected node we routed through to change location
> > >   d) Any node we routed through to return a message such as a
> > >      new RNF, etc.
> > >   If any of these happens, we may have to reroute.
> > >
> > > This is exactly the same challenge we faced with the other side
> > > of the equation (the data publish blocks) in the first version of
> > > the draft. The solution was to route them individually and keep
> > > nodes updated as to whether they were involved, if we routed
> > > through them on one packet but not on the next. We can't do this
> > > exactly for subscriptions, however. What is the answer?
> > > --
> > > Matthew J Toseland - toad at amphibian.dyndns.org
> > > Freenet Project Official Codemonkey - http://freenetproject.org/
> > > ICTHUS - Nothing is impossible. Our Boss says so.
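The rerouting cases above all lean on one piece of bookkeeping: "we
check what HTL we had left at that point, and rerun the request from
that point". That presumes each node remembers the HTL it had left at
every hop it subscribed through. A minimal sketch of that lookup; the
route structure and `restart_htl` are hypothetical, not actual Freenet
code:

```python
# route: the ordered hops we subscribed through, each paired with the
# HTL we had left when we routed to it (a hypothetical structure
# matching the proposal's bookkeeping, not a real Freenet type).
def restart_htl(route, lost_node):
    """When the connection to lost_node breaks, return the HTL we had
    left at that point, so the request can be rerun from there.
    Returns None if lost_node was not on our subscribe path."""
    for node, htl_left in route:
        if node == lost_node:
            return htl_left
    return None

# Example: subscribed via A (HTL 20 left), B (15 left), C (9 left).
# Losing B means rerunning the request from HTL 15.
```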
> > >
> > > _______________________________________________
> > > Tech mailing list
> > > Tech at freenetproject.org
> > > http://dodo.freenetproject.org/cgi-bin/mailman/listinfo/tech
