Aha, so if a node is detected as down but the cluster is not currently
receiving any requests (e.g. a development cluster that stores data that is
only requested periodically or in batches), then there would not be any
increased I/O unless the node is manually ("administratively") removed?
- Jeff
On Mar 24, 2014, at 1:53 PM, Seth Thomas <[email protected]> wrote:
> As soon as the cluster detects that the node is no longer serving
> requests[1]. There will likely be increased network and I/O load among the
> remaining nodes as they pick up the slack. That said, the data is not
> permanently reshuffled at that point; that happens only when the node is
> administratively removed. The size of the spike depends on the volume of
> objects, the number of physical nodes, and the number of partitions/vnodes.
>
> [1]
> http://docs.basho.com/riak/latest/theory/concepts/Replication/#Processing-partition-requests
>
> On March 24, 2014 at 10:43:36, Jeff Peck ([email protected]) wrote:
>
>> Does that happen immediately? I am basically trying to understand: When a
>> physical node goes down (let's say it is temporarily restarted, or down for
>> even a couple hours due to some sort of failure), will that cause an
>> increase in disk and network bandwidth at the moment that it goes down as
>> data is re-shuffled across the cluster?
>>
>> Thanks,
>> Jeff
>>
>>
>> On Mar 24, 2014, at 1:38 PM, Seth Thomas <[email protected]> wrote:
>>
>>> Data is redistributed temporarily, for however long it takes, until the
>>> primary node comes back online. Primary ownership of the data does not
>>> change, but your keys could be living on another physical node if any of
>>> the primary replicas are down.
>>>
>>> So to answer your question directly: Yes (in the narrowest definition)
>>>
>>>
>>> On March 24, 2014 at 10:34:22, Jeff Peck ([email protected]) wrote:
>>>
>>>> Thank you. So, does that mean that no redistribution of data would occur
>>>> unless the node is manually removed?
>>>>
>>>> - Jeff
>>>>
>>>>
>>>> On Mar 24, 2014, at 1:31 PM, Seth Thomas <[email protected]> wrote:
>>>>
>>>>> Jeff,
>>>>>
>>>>> When a node is no longer responding, a process called hinted handoff[1]
>>>>> takes over and ensures that your N (replication) value is met by allowing
>>>>> other nodes to temporarily take responsibility for the vnodes of the
>>>>> downed node. The node can then return to the cluster and resume
>>>>> operations for the vnodes it’s primarily responsible for, or you could
>>>>> remove the node[2] from the cluster, which would redistribute the primary
>>>>> responsibility among the remaining nodes. I’d also give our docs on
>>>>> replication[3] a look for more information.
>>>>>
>>>>> Seth Thomas
>>>>>
>>>>> [1]
>>>>> http://docs.basho.com/riak/latest/theory/concepts/glossary/#Hinted-Handoff
>>>>> [2]
>>>>> http://docs.basho.com/riak/latest/ops/running/nodes/adding-removing/#Removing-a-Node-From-a-Cluster
>>>>> [3] http://docs.basho.com/riak/latest/theory/concepts/Replication/
>>>>>
>>>>>
>>>>> On March 24, 2014 at 9:33:17, Jeff Peck ([email protected]) wrote:
>>>>>
>>>>>> Is there a description of what happens internally when a node goes down?
>>>>>> I am curious whether there would be any sort of reshuffling or
>>>>>> redistribution of data among the remaining vnodes, or would the node
>>>>>> simply be unavailable until restarted?
>>>>>>
>>>>>> Thanks,
>>>>>> Jeff
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> [email protected]
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
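
For anyone reading this thread later, the fallback behaviour Seth describes can
be sketched with a toy model. This is not Riak's code: the ring size, the
round-robin partition assignment, and the function names (build_ring,
first_partition, preference_list) are all assumptions made up for
illustration. It only tries to show the idea from the thread: when a primary
is down, another node temporarily serves that partition, while the ownership
table itself is left untouched.

```python
import hashlib

RING_SIZE = 8  # toy number of partitions (assumption, chosen for readability)
N_VAL = 3      # replicas per key, the "N value" mentioned in the thread


def build_ring(nodes, ring_size=RING_SIZE):
    """Toy ownership table: partition index -> owning node (round-robin)."""
    return [nodes[i % len(nodes)] for i in range(ring_size)]


def first_partition(key, ring_size=RING_SIZE):
    """Hash the key onto the ring to pick its first partition."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ring_size


def preference_list(ring, key, down=frozenset(), n_val=N_VAL):
    """Return [(partition, serving_node, is_fallback)] for a key.

    Primaries are the owners of n_val consecutive partitions. If an owner is
    in `down`, the next node further along the ring that is up and not
    already serving this key stands in for it (a fallback).
    """
    size = len(ring)
    start = first_partition(key, size)
    primaries = [(start + i) % size for i in range(n_val)]
    used = {ring[p] for p in primaries if ring[p] not in down}
    cursor = start + n_val
    result = []
    for p in primaries:
        owner = ring[p]
        if owner not in down:
            result.append((p, owner, False))
            continue
        for step in range(size):
            candidate = ring[(cursor + step) % size]
            if candidate not in down and candidate not in used:
                used.add(candidate)
                result.append((p, candidate, True))
                break
        else:
            # No spare node is available to stand in for this partition.
            result.append((p, None, True))
    return result


if __name__ == "__main__":
    ring = build_ring(["a", "b", "c", "d"])
    healthy = preference_list(ring, "bucket/key")
    victim = healthy[0][1]
    print("healthy: ", healthy)
    print("degraded:", preference_list(ring, "bucket/key", down={victim}))
    print("ownership table is unchanged:", ring)
```

In this sketch the ring list never changes when a node is marked down, which
mirrors the point that primary ownership is only redistributed when a node is
administratively removed. What does change is which node answers for the
affected partition, and that only matters when a request actually arrives.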