Re: What happens when a node goes down?

Jeff Peck Mon, 24 Mar 2014 10:58:38 -0700

Aha, so if a node is detected as down, but the cluster is not currently 
receiving any requests (e.g. a development cluster that stores data which is 
only requested periodically or in batches), then there would not be any 
increased I/O unless the node is manually ("administratively") removed?


- Jeff



On Mar 24, 2014, at 1:53 PM, Seth Thomas <[email protected]> wrote:

> As soon as the cluster can detect that the node is no longer serving 
> requests[1]. There will likely be increased network and I/O load among the 
> remaining nodes as they pick up the slack. That said, the data is not 
> permanently reshuffled at that point, only at such time as the node is 
> administratively removed. The degree to which you’d see a spike depends on 
> the volume of objects, # of physical nodes, and # of partitions/vnodes.
> 
> [1] 
> http://docs.basho.com/riak/latest/theory/concepts/Replication/#Processing-partition-requests
> 
> On March 24, 2014 at 10:43:36, Jeff Peck ([email protected]) wrote:
> 
>> Does that happen immediately? I am basically trying to understand: When a 
>> physical node goes down (let's say it is temporarily restarted, or down for 
>> even a couple hours due to some sort of failure), will that cause an 
>> increase in disk and network bandwidth at the moment that it goes down as 
>> data is re-shuffled across the cluster?
>> 
>> Thanks,
>> Jeff
>> 
>> 
>> On Mar 24, 2014, at 1:38 PM, Seth Thomas <[email protected]> wrote:
>> 
>>> Data is redistributed temporarily (indefinitely) until the primary node 
>>> comes back online. So primary ownership of data would not be changed but 
>>> your keys could be living on another physical node if any of the primary 
>>> replicas were down.
>>> 
>>> So to answer your question directly: Yes (in the narrowest definition)
>>> 
>>> 
>>> On March 24, 2014 at 10:34:22, Jeff Peck ([email protected]) wrote:
>>> 
>>>> Thank you. So, does that mean that no redistribution of data would occur 
>>>> unless the node is manually removed?
>>>> 
>>>> - Jeff
>>>> 
>>>> 
>>>> On Mar 24, 2014, at 1:31 PM, Seth Thomas <[email protected]> wrote:
>>>> 
>>>>> Jeff,
>>>>> 
>>>>> When a node is no longer responding, a process called hinted handoff[1] 
>>>>> takes over and ensures that your N (replication) value is met by allowing 
>>>>> other nodes to temporarily take responsibility for the vnodes of the 
>>>>> downed node. The node can return to the cluster and resume operations 
>>>>> for the vnodes it’s primarily responsible for, or you could remove the 
>>>>> node[2] from the cluster, which would redistribute the primary 
>>>>> responsibility among the remaining nodes. I’d also give our docs on 
>>>>> replication[3] a look for more information.
>>>>> 
>>>>> Seth Thomas
>>>>> 
>>>>> [1] 
>>>>> http://docs.basho.com/riak/latest/theory/concepts/glossary/#Hinted-Handoff
>>>>> [2] 
>>>>> http://docs.basho.com/riak/latest/ops/running/nodes/adding-removing/#Removing-a-Node-From-a-Cluster
>>>>> [3] http://docs.basho.com/riak/latest/theory/concepts/Replication/
>>>>> 
>>>>> 
>>>>> On March 24, 2014 at 9:33:17, Jeff Peck ([email protected]) wrote:
>>>>> 
>>>>>> Is there a description of what happens internally when a node goes down? 
>>>>>> I am curious whether there would be any sort of reshuffling or 
>>>>>> redistribution of data among the remaining vnodes. Or would the node 
>>>>>> simply be unavailable until restarted? 
>>>>>> 
>>>>>> Thanks, 
>>>>>> Jeff 
>>>>>> _______________________________________________ 
>>>>>> riak-users mailing list 
>>>>>> [email protected] 
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
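
The distinction drawn in this thread (temporary fallback while a node is down 
versus permanent reassignment once it is administratively removed) can be 
sketched in a few lines of Python. This is a toy model under stated 
assumptions, not Riak's implementation: the names build_ring, 
preference_list and remove_node are hypothetical, partitions are claimed 
round-robin, and fallbacks are simply the next reachable nodes around the 
ring. Riak's real claim and fallback selection logic is more involved.

```python
from hashlib import sha1

RING_SIZE = 8  # number of partitions (vnodes); toy value, an assumption
N_VAL = 3      # replication factor ("N value"); toy value, an assumption


def build_ring(nodes, ring_size=RING_SIZE):
    """Assign partitions to nodes round-robin: index -> owning node."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def preference_list(ring, key, down=frozenset(), n_val=N_VAL):
    """Return [(partition, node, role)] for the replicas of `key`.

    The primaries are the n_val partitions that follow the key's hash.
    If a primary's owner is down, the next reachable node further around
    the ring stands in as a fallback. The ring itself is never modified,
    so primary ownership stays where it was.
    """
    start = int(sha1(key.encode()).hexdigest(), 16) % len(ring)
    replicas = []
    offset = n_val  # fallback candidates start just past the primaries
    for i in range(n_val):
        partition = (start + i) % len(ring)
        owner = ring[partition]
        if owner not in down:
            replicas.append((partition, owner, "primary"))
            continue
        # Walk the rest of the ring looking for a reachable stand-in.
        while offset < n_val + len(ring):
            candidate = ring[(start + offset) % len(ring)]
            offset += 1
            if candidate not in down:
                replicas.append((partition, candidate, "fallback"))
                break
    return replicas


def remove_node(ring, node):
    """Administrative removal: permanently hand the node's partitions
    to the remaining nodes and return the new ring."""
    remaining = [n for n in dict.fromkeys(ring) if n != node]
    return [
        owner if owner != node else remaining[i % len(remaining)]
        for i, owner in enumerate(ring)
    ]


if __name__ == "__main__":
    ring = build_ring(["a", "b", "c", "d"])
    print(preference_list(ring, "some-key"))              # all nodes up
    print(preference_list(ring, "some-key", down={"b"}))  # "b" is down
    print(remove_node(ring, "b"))                         # "b" removed
```

While "b" is merely down, the ring is untouched and only the replicas that 
would have landed on "b" are served by a fallback. Only remove_node changes 
ownership, which corresponds to the permanent redistribution discussed above.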
