I realize I made a mistake. It would just be nice if the UI could warn me that I was about to do so, especially given the consequences.

If it had simply shown me how much space each node was using (forget a percentage or anything), that would have been enough to avert disaster. With four nodes, if they're over 25% capacity (far lower than any sensible warning level in a monitoring system), the cluster leave is going to fail. The more nodes you add to the system, the lower you'd have to set that warning threshold to alert you that you're in a state where you can't safely retire a node.
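
As a rough illustration of the kind of pre-leave check described above, here is a minimal Python sketch. It is not part of Riak or its tooling: the `Node` type, the `can_leave` function, and the `headroom` parameter are all hypothetical. It also assumes the departing node's data spreads evenly over the remaining nodes, and it ignores handoff overhead and compaction, so a real cluster would need more margin than this model suggests.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node disk figures, e.g. gathered by a monitoring system."""
    name: str
    capacity_bytes: int
    used_bytes: int


def can_leave(nodes, leaving, headroom=0.0):
    """Estimate whether `leaving` can be retired without filling the other nodes.

    Simplifying assumption: the departing node's data is split evenly
    across the remaining nodes. Returns (ok, projected), where `projected`
    maps each remaining node name to its estimated used fraction.
    """
    departing = [n for n in nodes if n.name == leaving]
    remaining = [n for n in nodes if n.name != leaving]
    if len(departing) != 1 or not remaining:
        raise ValueError("need exactly one departing node and at least one remaining node")

    # Each remaining node absorbs an equal share of the departing node's data.
    share = departing[0].used_bytes / len(remaining)
    projected = {
        n.name: (n.used_bytes + share) / n.capacity_bytes for n in remaining
    }

    # `headroom` is the fraction of each disk that must stay free afterwards.
    ok = all(fraction <= 1.0 - headroom for fraction in projected.values())
    return ok, projected


if __name__ == "__main__":
    cluster = [Node(f"riak{i}", 100, 80) for i in range(1, 5)]
    ok, projected = can_leave(cluster, "riak4")
    print("safe to leave:", ok)
    for name, fraction in sorted(projected.items()):
        print(f"{name}: {fraction:.0%} used after leave")
```

A check along these lines could run before the leave is committed and refuse, or at least warn, when any projected figure exceeds the chosen limit.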

On 4/15/2014 12:40, Luke Bakken wrote:
Hi Allen -

Failure / node leave situations should be taken into account during
cluster capacity planning. I've created an issue to more thoroughly
explain this in our documentation:

https://github.com/basho/basho_docs/issues/1034

--
Luke Bakken
CSE
[email protected]


On Tue, Apr 15, 2014 at 9:28 AM, Allen Landsidel <[email protected]> wrote:

    Luke,

    I already use Nagios for that, but the disk space was fine before
    I told one of the nodes to leave the cluster.  That's my problem --
    there was not enough free space in the cluster for it to move all
    that node's data.  It accepted the leave and then ran me out of disk
    space on all the other nodes, with no way to abort or recover.

    My only option was to add more space to the other nodes (as you
    said, adding new nodes will not work until the leave is done), which
    is easy enough in a virtualized environment but requires downtime.
    In a bare metal environment, it could be catastrophic to the cluster.


    On 4/15/2014 12:19, Luke Bakken wrote:

        Hi Allen,

        Cluster leave does not check for disk space and in general, Riak
        is not aware of how much space it has available to itself (most db
        systems don't monitor disk space I think). I'll send a note to
        product management about this. We recommend using a monitoring
        solution (like collectd + graphite) to keep an eye on available
        disk space.


        --
        Luke Bakken
        CSE
        [email protected]


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com