Then the nodes are NOT totally isolated from each other: they know their own role and which nodes are their secondary and tertiary. I had understood that the client was the only one who knew those things.
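To make sure we are talking about the same thing, here is roughly how I pictured the client-only version. This is only a sketch, not Rhino DHT code: Node, FailoverClient, Candidates and PickNode are names I made up, and I am assuming the secondary and tertiary are simply the next two nodes in the client's list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DHT node; only the availability flag matters here.
public class Node
{
    public string Name { get; }
    public bool IsUp { get; set; } = true;
    public Node(string name) { Name = name; }
}

public class FailoverClient
{
    private readonly Node[] nodes;

    public FailoverClient(Node[] nodes) { this.nodes = nodes; }

    // Primary first, then secondary and tertiary. Assumption: the fallbacks
    // are the next two nodes in the client's list, wrapping around.
    public IEnumerable<Node> Candidates(string key)
    {
        // Mask the sign bit so a negative hash code cannot give a negative index.
        int primary = (key.GetHashCode() & 0x7FFFFFFF) % nodes.Length;
        for (int i = 0; i < 3 && i < nodes.Length; i++)
            yield return nodes[(primary + i) % nodes.Length];
    }

    // Returns the first candidate that is up, or throws if all of them are down.
    public Node PickNode(string key)
    {
        foreach (var node in Candidates(key))
            if (node.IsUp)
                return node;
        throw new InvalidOperationException("No node available for key: " + key);
    }
}
```

In this version the nodes never hear about each other. Only the client walks the primary, secondary, tertiary order, which is why I thought they could stay isolated.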
Tapio

On Sun, Feb 8, 2009 at 1:03 AM, Ayende Rahien <[email protected]> wrote:

> No, you want to keep it simple.
> And the nodes have to know about each other, so they can tell their
> secondary and tertiary.
>
> On Sun, Feb 8, 2009 at 1:00 AM, Tapio Kulmala <[email protected]> wrote:
>
>> Ok....
>>
>> I'm still assuming that the client is the only one who knows what the
>> secondary and tertiary nodes are for the primary node.
>>
>> If the primary node is down, could you just swap the roles of those
>> nodes? Make the unavailable node tertiary and go on as usual?
>>
>> On Sun, Feb 8, 2009 at 12:53 AM, Ayende Rahien <[email protected]> wrote:
>>
>>> No, that is the problem that this is supposed to deal with. I am not
>>> straight yet on the issue of how to deal with the node configuration.
>>>
>>> The design for the new version of Rhino DHT is simple. We continue to
>>> support only three operations on the wire: Put, Get and Remove. But we
>>> also introduced a new notion: failover servers. Every node in the DHT
>>> has a secondary and a tertiary node defined for it. Those nodes are
>>> also full-fledged nodes in the DHT, capable of handling their own
>>> stuff.
>>>
>>> During normal operation, any successful Put or Remove operation will be
>>> sent via async messages to the secondary and tertiary nodes. If a node
>>> goes down, the client library is responsible for detecting that and
>>> moving to the secondary node, and to the tertiary one if that is down
>>> as well. Get is pretty simple in this regard. As you can imagine, the
>>> node simply needs to serve the request from local storage. Put and
>>> Remove operations are more complex. The logic for doing this is the
>>> same as always, including all the conflict resolution, etc. But in
>>> addition to that, the Put and Remove requests will generate async
>>> messages to the other two nodes (to the primary and tertiary if using
>>> the secondary as fallback, or to the primary and secondary if using the
>>> tertiary as fallback).
>>>
>>> That way, when the primary comes back up, it can catch up with work
>>> that was done while it was down.
>>> The question of how to store the data about the nodes, however,
>>> remains. I think that I'll store it as a replicated value in all the
>>> nodes. So we have a list of all the nodes and their secondary and
>>> tertiary there.
>>>
>>> On Sun, Feb 8, 2009 at 12:49 AM, Tapio Kulmala <[email protected]> wrote:
>>>
>>>> Doesn't that algorithm always get the data for key xxx from the same
>>>> node? If the node is down you'll hit the real data store but you'll
>>>> never switch to another node. Or do you take the node out of your
>>>> nodes list?
>>>>
>>>> Tapio
>>>>
>>>> On Sun, Feb 8, 2009 at 12:35 AM, Ayende Rahien <[email protected]> wrote:
>>>>
>>>>> Right now, this information is stored on the client side. The client
>>>>> has a list of nodes and gets/stores the data in them using the
>>>>> following algorithm:
>>>>>
>>>>> public Value Get(string key)
>>>>> {
>>>>>     // Mask the sign bit: GetHashCode() may be negative, and a
>>>>>     // negative index would throw.
>>>>>     int index = (key.GetHashCode() & 0x7FFFFFFF) % nodes.Length;
>>>>>     return nodes[index].Get(key);
>>>>> }
>>>>>
>>>>> On Sun, Feb 8, 2009 at 12:19 AM, Tapio Kulmala <[email protected]> wrote:
>>>>>
>>>>>> I don't know anything about Rhino DHT so this might be a really
>>>>>> stupid question. You said that each node is totally isolated from
>>>>>> all the rest. How does the client initially know which nodes exist
>>>>>> and where the data might be stored?
>>>>>>
>>>>>> Tapio
>>>>>>
>>>>>> On Sat, Feb 7, 2009 at 9:17 PM, Ayende Rahien <[email protected]> wrote:
>>>>>>
>>>>>>> My initial design when building Rhino DHT was that it would work in
>>>>>>> a similar manner to Memcached, with the addition of multi-versioned
>>>>>>> values and persistence. That is, each node is completely isolated
>>>>>>> from all the rest, and it is the client that is actually creating
>>>>>>> the illusion of distributed cohesion.
>>>>>>>
>>>>>>> The only problem with this approach is reliability.
>>>>>>> That is, if a node goes down, all the values that are stored in it
>>>>>>> are gone. This is not a problem for Memcached. If the node is down,
>>>>>>> all you have to do is hit the actual data source. Memcached is
>>>>>>> *not* a data store, it is a cache, and it is allowed to remove
>>>>>>> values whenever it wants to.
>>>>>>>
>>>>>>> For Rhino DHT, that is not the case. I am using it to store the
>>>>>>> saga details for Rhino Service Bus, as well as storing persistent
>>>>>>> state.
>>>>>>>
>>>>>>> The first plan was to use it as is. If a node is down, it would
>>>>>>> cause an error during the load saga state stage (try to say *that*
>>>>>>> three times fast!), which would eventually move the message to the
>>>>>>> error queue. When the node came back up, we could move the messages
>>>>>>> from the error queue to the main queue and be done with it.
>>>>>>>
>>>>>>> My current client had some objections to that. From his
>>>>>>> perspective, if any node in the DHT was down, the other nodes
>>>>>>> should take over automatically, without any interruption of
>>>>>>> service. That is… somewhat more complex to handle.
>>>>>>>
>>>>>>> Well, actually, it isn't more complex to handle. I was able to
>>>>>>> continue with my current path for everything (including fully
>>>>>>> transparent failover for reads and writes).
>>>>>>>
>>>>>>> What I was *not* able to solve, however, was how to handle a node
>>>>>>> coming back *up*. Or, to be rather more exact, I ran into a problem
>>>>>>> there because the only way to solve this cleanly was to use
>>>>>>> messaging. But, of course, Rhino Service Bus is dependent on Rhino
>>>>>>> DHT. And creating a circular reference would just make things more
>>>>>>> complex, even if it was broken with interfaces in the middle.
>>>>>>> Thoughts?

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Rhino Tools Dev" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/rhino-tools-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
