Re: reversing node removal?

Luke Bakken Tue, 15 Apr 2014 09:42:12 -0700

Hi Allen -

Failure / node leave situations should be taken into account during cluster
capacity planning. I've created an issue to more thoroughly explain this in
our documentation:


https://github.com/basho/basho_docs/issues/1034
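
As a rough illustration of the capacity check to do by hand before a leave
(Riak does not do this for you), here is a minimal Python sketch. The function
name, the assumption that the leaving node's data is spread evenly across the
remaining nodes, and the 20% headroom are illustrative assumptions, not
anything Riak provides:

```python
def can_absorb_leave(leaving_used, remaining_free, headroom=0.2):
    """Estimate whether the remaining nodes can absorb a leaving node's data.

    leaving_used   -- bytes of data stored on the node that will leave
    remaining_free -- list of free bytes, one entry per remaining node
    headroom       -- fraction of each node's free space to keep in reserve

    Assumes (hypothetically) an even split of the data across remaining nodes.
    """
    if not remaining_free:
        # No nodes left to receive the data.
        return False
    share = leaving_used / len(remaining_free)
    # Every node must fit its share within its free space minus the reserve.
    return all(share <= free * (1 - headroom) for free in remaining_free)
```

The inputs would come from whatever monitoring is already in place; if the
function returns False, add capacity before issuing the leave rather than
after.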

--
Luke Bakken
CSE
[email protected]


On Tue, Apr 15, 2014 at 9:28 AM, Allen Landsidel
<[email protected]> wrote:

> Luke,
>
> I already do use nagios for that, but the disk space was fine before I
> told one of the nodes to leave the cluster.  That's my problem -- there was
> not enough free space in the cluster for it to move all that node's data.
> It accepted the leave and then ran me out of disk space on all the other
> nodes, with no way to abort or recover.
>
> My only option was to add more space to the other nodes (as you said,
> adding new nodes will not work until the leave is done), which is easy
> enough in a virtualized environment but requires downtime.  In a bare metal
> environment, it could be catastrophic to the cluster.
>
>
> On 4/15/2014 12:19, Luke Bakken wrote:
>
>> Hi Allen,
>>
>> Cluster leave does not check for disk space and in general, Riak is not
>> aware of how much space it has available to itself (most db systems
>> don't monitor disk space I think). I'll send a note to product
>> management about this. We recommend using a monitoring solution (like
>> collectd + graphite) to keep an eye on available disk space.
>>
>>
>> --
>> Luke Bakken
>> CSE
>> [email protected]
>
>

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
