Thank you, this has been very helpful. I appreciate all of the information and the quick responses.
- Jeff

On Mar 24, 2014, at 2:20 PM, Seth Thomas <[email protected]> wrote:

> To further elaborate:
>
> When a node fails (or is simply too slow) for a PUT request, that data will
> be placed on the first fallback in the preflist as per [1]. For a GET, the
> request only needs to fulfill the R[2] value, which by default is 2 for an
> N of 3. So the GET request would succeed by simply taking the response from
> the other two active primaries and then read-repair[3] the value to the
> fallback.
>
> Hopefully that makes the moving pieces in a failure scenario a little
> clearer.
>
> [1] http://docs.basho.com/riak/latest/theory/concepts/Replication/#Processing-partition-requests
> [2] http://docs.basho.com/riak/latest/theory/concepts/Eventual-Consistency/#Replication-properties-and-request-tuning
> [3] http://docs.basho.com/riak/latest/theory/concepts/Replication/#Read-Repair
>
> On March 24, 2014 at 10:57:02, Jeff Peck ([email protected]) wrote:
>
>> Aha, so if a node is detected as down, but the cluster is not currently
>> receiving any requests (e.g. on a development cluster that is used to store
>> data that is only requested periodically or in batches, etc.), then there
>> would not be any increased I/O unless it is manually ("administratively")
>> removed?
>>
>> - Jeff
>>
>> On Mar 24, 2014, at 1:53 PM, Seth Thomas <[email protected]> wrote:
>>
>>> As immediately as the cluster can detect the node is no longer serving
>>> requests[1]. There will likely be increased network and IO among the
>>> remaining nodes as they will be picking up the slack. That said, the data
>>> is not permanently reshuffled at that point; that happens only when the
>>> node is administratively removed. The degree to which you’d see a spike
>>> depends on the volume of objects, # of physical nodes, and # of
>>> partitions/vnodes.
>>>
>>> [1] http://docs.basho.com/riak/latest/theory/concepts/Replication/#Processing-partition-requests
>>>
>>> On March 24, 2014 at 10:43:36, Jeff Peck ([email protected]) wrote:
>>>
>>>> Does that happen immediately? I am basically trying to understand: When a
>>>> physical node goes down (let's say it is temporarily restarted, or down
>>>> for even a couple hours due to some sort of failure), will that cause an
>>>> increase in disk and network bandwidth at the moment that it goes down as
>>>> data is re-shuffled across the cluster?
>>>>
>>>> Thanks,
>>>> Jeff
>>>>
>>>> On Mar 24, 2014, at 1:38 PM, Seth Thomas <[email protected]> wrote:
>>>>
>>>>> Data is redistributed temporarily (indefinitely) until the primary node
>>>>> comes back online. So primary ownership of data would not be changed, but
>>>>> your keys could be living on another physical node if any of the primary
>>>>> replicas were down.
>>>>>
>>>>> So to answer your question directly: Yes (in the narrowest definition)
>>>>>
>>>>> On March 24, 2014 at 10:34:22, Jeff Peck ([email protected]) wrote:
>>>>>
>>>>>> Thank you. So, does that mean that no redistribution of data would occur
>>>>>> unless the node is manually removed?
>>>>>>
>>>>>> - Jeff
>>>>>>
>>>>>> On Mar 24, 2014, at 1:31 PM, Seth Thomas <[email protected]> wrote:
>>>>>>
>>>>>>> Jeff,
>>>>>>>
>>>>>>> When a node is no longer responding, a process called hinted handoff[1]
>>>>>>> takes over and ensures that your N (replication) value is met by
>>>>>>> allowing other nodes to temporarily take responsibility for the vnodes
>>>>>>> of the downed node. This node can return to the cluster and will resume
>>>>>>> operations for the vnodes it’s primarily responsible for, or you could
>>>>>>> remove the node[2] from the cluster, which would redistribute the
>>>>>>> primary responsibility among the remaining nodes. I’d also give our
>>>>>>> docs on replication[3] a look for more information.
>>>>>>>
>>>>>>> Seth Thomas
>>>>>>>
>>>>>>> [1] http://docs.basho.com/riak/latest/theory/concepts/glossary/#Hinted-Handoff
>>>>>>> [2] http://docs.basho.com/riak/latest/ops/running/nodes/adding-removing/#Removing-a-Node-From-a-Cluster
>>>>>>> [3] http://docs.basho.com/riak/latest/theory/concepts/Replication/
>>>>>>>
>>>>>>> On March 24, 2014 at 9:33:17, Jeff Peck ([email protected]) wrote:
>>>>>>>
>>>>>>>> Is there a description of what happens internally when a node goes
>>>>>>>> down? I am curious whether there would be any sort of reshuffling or
>>>>>>>> redistribution of data in the remaining vnodes? Or would the node
>>>>>>>> simply be unavailable until restarted?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jeff
>>>>>>>> _______________________________________________
>>>>>>>> riak-users mailing list
>>>>>>>> [email protected]
>>>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
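For anyone reading this thread later, the failure scenario discussed above (a PUT landing on a fallback, a GET succeeding with R of 2 out of N of 3, and data returning to the primary once it is back) can be sketched as a toy model. This is only an illustration under stated assumptions, not Riak's implementation or API: the node names A through D, the `store` dictionary, and the `put`, `get`, and `handoff` functions are all hypothetical, and read repair, vnodes, and the ring are left out.

```python
# Toy model of the scenario in this thread. NOT Riak code: every name here
# (put, get, handoff, store, nodes "A".."D") is hypothetical.

N = 3            # replicas per key
R = N // 2 + 1   # quorum read value: 2 when N is 3


def put(primaries, fallbacks, down, store, key, value):
    """Write to every primary; a down primary's copy goes to a live fallback."""
    spare = [node for node in fallbacks if node not in down]
    hints = {}  # fallback node -> primary it is standing in for
    for primary in primaries:
        target = primary
        if primary in down:
            if not spare:
                continue  # no live fallback left, so this replica is skipped
            target = spare.pop(0)
            hints[target] = primary
        store.setdefault(target, {})[key] = value
    return hints


def get(primaries, down, store, key):
    """Succeed once at least R live primaries return the key."""
    replies = [store[node][key] for node in primaries
               if node not in down and key in store.get(node, {})]
    if len(replies) < R:
        raise RuntimeError("fewer than R replicas answered")
    return replies[0]


def handoff(hints, down, store):
    """Move hinted data back to each primary that is reachable again."""
    for fallback, primary in list(hints.items()):
        if primary not in down:
            store.setdefault(primary, {}).update(store.pop(fallback, {}))
            del hints[fallback]


store = {}
primaries, fallbacks = ["A", "B", "C"], ["D"]

# C is down: its replica is written to fallback D instead.
hints = put(primaries, fallbacks, {"C"}, store, "k", "v1")

# The read still succeeds because A and B satisfy R = 2.
value = get(primaries, {"C"}, store, "k")

# C returns: D hands its copy back and stops holding the data.
handoff(hints, set(), store)
```

Note that in this sketch nothing moves between nodes until a write arrives or the primary returns, which matches the point made above that ownership is only permanently reshuffled when a node is administratively removed.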
