Hey,

Attached is a short utility meant to help maintain a MogileFS cluster.

The tool takes a list of constraints, e.g.:

./mogautodrain --trackers="127.0.0.1:6001" --hosts="all" --maxpercent="50" --noredrain --parallel 2

... will crawl all alive devices on all alive hosts and put every device that is over 50% full into drain mode. It will keep draining them, two at a time, until it has gotten each device down to 50% at least once (--noredrain). By default (without --noredrain) it will continue to mark/unmark devices to/from drain until all of the ones specified are at 50% or less at the same time. Then it will exit.
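
To make the selection rule concrete, here is a minimal sketch of the check described above, run against made-up device records. The field names (devid, status, mb_used, mb_total) mirror what the attached script reads from get_devices(); the sample numbers and the helper name over_limit are assumptions for illustration only, and nothing here talks to a tracker.

```perl
use strict;
use warnings;

# Hypothetical device records; real ones would come from the tracker.
my @devices = (
    { devid => 1, status => 'alive', mb_used => 800, mb_total => 1000 },
    { devid => 2, status => 'alive', mb_used => 300, mb_total => 1000 },
    { devid => 3, status => 'alive', mb_used => 510, mb_total => 1000 },
    { devid => 4, status => 'dead',  mb_used => 900, mb_total => 1000 },
    { devid => 5, status => 'alive', mb_used => 990, mb_total => 1000 },
);

# True when an alive device is above the given fraction full (0.5 == 50%).
sub over_limit {
    my ($dev, $max_fraction) = @_;
    return 0 unless $dev->{status} eq 'alive' && $dev->{mb_total};
    return $dev->{mb_used} / $dev->{mb_total} > $max_fraction ? 1 : 0;
}

# Mimic --maxpercent="50" --parallel 2: find candidates, start at most two.
my @candidates = grep { over_limit($_, 0.50) } @devices;
my $parallel   = 2;
my $last       = $#candidates < $parallel - 1 ? $#candidates : $parallel - 1;
my @started    = @candidates[0 .. $last];

print "candidates: ", join(',', map { $_->{devid} } @candidates), "\n";
print "draining first: ", join(',', map { $_->{devid} } @started), "\n";
```

With this sample data, devices 1, 3 and 5 are candidates and 1 and 3 are drained first; device 5 waits until a slot frees up.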

What this is useful for:

- MogileFS's rebalance doesn't work very well. Works okay but not very well.
- If you have a particular host which is too full of old, idle files, and you want to clear it off a bit so new files can be written to it.
- If you add new hosts and want to shovel a portion of the files onto them.
- If you want to drain a couple devices to zero then chain other commands afterwards (./mogautodrain [blah] && mogadm device blah blah, or simply '&& mail -s "Toast's done!" blah blah').

What this is NOT useful for:

- Running all the time to "guarantee" devices are at a certain percentage free. Wrong, dangerous, and if you really need to, use the 'minfree' mogilefsd config option.
- Rebalancing utilization. MFS does that itself to an extent, but this tool cannot guarantee files you move are actually active in any way.
- Other such evil.

What it's missing:

- Tracker commands need to be wrapped in a retry, since this is a long running tool.
- Mode to detect "stuck" drains and move on / blacklist.
- Dunno. List shuffle?
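
For the first two items, something along these lines might do. This is only a sketch of one possible approach, not what the attached script does: with_retry and drain_is_stuck are hypothetical helper names, the retry assumes the wrapped call signals failure by dying, and "stuck" is assumed to mean that mb_used has not dropped for a number of consecutive checks.

```perl
use strict;
use warnings;

# Call $code up to $tries times, sleeping $delay seconds between attempts.
# $code signals failure by dying; the last error is rethrown if all fail.
sub with_retry {
    my ($tries, $delay, $code) = @_;
    my $last_err = 'no attempts made';
    for my $attempt (1 .. $tries) {
        my @result;
        my $ok = eval { @result = $code->(); 1 };
        return wantarray ? @result : $result[0] if $ok;
        $last_err = $@ || 'unknown error';
        warn "attempt $attempt of $tries failed: $last_err";
        sleep $delay if $delay && $attempt < $tries;
    }
    die "giving up after $tries attempts: $last_err";
}

# Remember mb_used per device between checks. A drain counts as stuck once
# it has gone $max_idle consecutive checks without mb_used dropping.
my %progress;    # devid => { mb_used => lowest value seen, idle => count }

sub drain_is_stuck {
    my ($devid, $mb_used, $max_idle) = @_;
    my $p = $progress{$devid};
    unless ($p) {
        # First sighting: just record a baseline.
        $progress{$devid} = { mb_used => $mb_used, idle => 0 };
        return 0;
    }
    if ($mb_used < $p->{mb_used}) {
        $p->{mb_used} = $mb_used;
        $p->{idle}    = 0;
    } else {
        $p->{idle}++;
    }
    return $p->{idle} >= $max_idle ? 1 : 0;
}

# Demo with a fake flaky call that fails twice before succeeding.
my $calls = 0;
my $value = with_retry(3, 0, sub {
    die "tracker went away\n" if ++$calls < 3;
    return "ok";
});
print "got '$value' after $calls calls\n";
```

In the script, a tracker call could then be wrapped as with_retry(5, 10, sub { my $d = $madm->get_devices(); die $madm->errstr if $madm->err; $d }), and the main loop could skip (and mark alive again) any device for which drain_is_stuck() returns true. Whether err is cleared between calls is something I have not checked, so treat that part as an assumption.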

Why is this not part of MogileFS::Utils (yet):

- Because rebalance should probably just be fixed.

Since that might take a while, I wouldn't be against adding it to MogileFS::Utils. However, I want to run the idea by the list first. I do like the ability to constrain operations to specific devices/hosts, which would require significant reworking of rebalance to achieve.

But I agree it's an evil utility. Enjoy.

-Dormando
#!/usr/bin/perl
# Written by dormando, to be distributed under the same terms as
# MogileFS::Utils

use strict;
use warnings FATAL => 'all';
use MogileFS::Admin;
use MogileFS::Client;
use Getopt::Long;
use POSIX qw(strftime);
use Data::Dumper qw(Dumper);

my $debug    = 0;
# On ^C set a flag to die after this cycle.
my $die_soon = 0;
$|++;

my %opts = ( interval => 30, parallel => 1, noredrain => 0 );
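my $rv   = GetOptions(\%opts, 'help|h', 'interval|i=i', 'trackers=s',
                      'hosts=s', 'devices=s', 'maxpercent=s', 'minfree=i',
                      'fulldrain', 'parallel=i', 'noredrain');
# GetOptions has already warned about whatever it could not parse.
error_die('Could not parse command line options') unless $rv;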

if ($opts{help}) {
    show_help(); exit;
}

# FIXME: Finish help. Perldocs?
sub show_help {
    print "Usage: $0 --trackers='127.0.0.1:6001' --hosts='sto1' ",
          "--maxpercent='90'\n";
    print qq{Options:
    --help          display this help
    --trackers      list of mogile trackers ex: "ip:port,ip:port"
    --hosts         only drain these hosts ex: "sto1,sto2" or "all"
    --devices       only drain these devices ex: "1,2,5,33,17"
    --maxpercent    ensure devices are at most this % full ex: "80.5"
    --minfree       minimum free space (MB) devices should have
    --fulldrain     drain all specified devices to 0%
    --parallel      max devices to drain in parallel ex: "3"
    --noredrain     only drain each device once during run
    --interval      time in seconds between drain checks and log output
};
}

$SIG{INT} = \&sigint_handler;

# Split up the tracker opt into an array.
if ($opts{trackers}) {
    $opts{trackers} = [ split(/\s*,\s*/, $opts{trackers}) ];
} else {
    error_die('You must supply a tracker, like: --trackers="127.0.0.1:6001"');
}

my $madm = MogileFS::Admin->new(hosts => $opts{trackers});
die "Could not connect to a tracker." unless $madm;

error_die('Must specify --maxpercent or --minfree or --fulldrain') 
        unless ($opts{maxpercent} || $opts{minfree} || $opts{fulldrain});

error_die('Must specify at least --hosts="all" or --devices="all"')
        unless ($opts{hosts} || $opts{devices});

# Figure the requested constraints on full-ness.

# --fulldrain overrides --maxpercent to '0'... Note this might not always
# work. So we should add an extra check under --fulldrain to move onto the
# next device if it's gotten stuck.
$opts{maxpercent} = 0 if $opts{fulldrain};

error_die("We're too stupid to handle maxpercent AND minfree at the same time")
        if ($opts{maxpercent} && $opts{minfree});
error_die('maxpercent should look like --maxpercent="87.3" or similar')
        if ($opts{maxpercent} && $opts{maxpercent} !~ m/^\d+(\.\d+)?$/);
error_die('Set minfree to whatever number you want, but not negative or zero.')
        if ($opts{minfree} && $opts{minfree} < 1);
error_die('Parallel devices to drain must be at least 1')
        if ($opts{parallel} < 1);

# Set some handy defaults to make the filtering stage simpler.
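# Test with defined(): --fulldrain sets maxpercent to 0, and that 0 must
# survive instead of being mistaken for "not set" and reset to 100%.
$opts{maxpercent} = defined $opts{maxpercent} ? $opts{maxpercent} / 100 : 1;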
$opts{minfree}    = 0 unless $opts{minfree};

# We've sufficiently babysat our user for the constraints. Now we need to
# discover what devices to drain.

print "HOSTS\n" if $debug;
my $hosts_to_drain = find_hosts_to_drain();
print Dumper($hosts_to_drain) if $debug;

print "\n\n\nDEVICES\n" if $debug;
my $devs_to_drain = find_devices_to_drain($hosts_to_drain);
print Dumper($devs_to_drain) if $debug;

print "Found ", scalar @$devs_to_drain, " devices to drain...\n";

# $in_drain is a tally of the in-flight drains.
my $in_drain = 0;

# Do a pre-loop to find any currently _in_ drain...
for my $dev (@$devs_to_drain) {
    $in_drain++ if ($dev->{status} eq 'drain');
}

# Now loop until all devices to filter match constraints.
my %no_redrain = ();
my %host_map     = map { $_->{hostid} => $_->{hostname} } @$hosts_to_drain;
while (1) {
    my %latest_devs = map { $_->{devid} => $_ } @{$madm->get_devices()};
    reset_drains_and_exit(\%latest_devs) if $die_soon;

    # Devices that still need draining but are waiting for a free slot.
    my $pending = 0;

    for my $dev (@$devs_to_drain) {
        my $latest = $latest_devs{$dev->{devid}};
        my $date   = strftime "%Y/%m/%d-%H:%M:%S%z(%Z)", localtime;

        if (should_drain($latest)) {
            if ($latest->{status} eq 'alive') {
                next if $opts{noredrain} && $no_redrain{$dev->{devid}};
                if ($in_drain >= $opts{parallel}) {
                    # No free slot; remember it so we don't exit early.
                    $pending++;
                    next;
                }
                print "$date marking drain for ", dev_status($latest), "\n";
                $madm->change_device_state($host_map{$latest->{hostid}},
                                           $latest->{devid}, 'drain');
                die "Failed marking device into drain: " . $madm->errstr if
                    $madm->err;
                $in_drain++;
                next;
            } elsif ($latest->{status} eq 'drain') {
                print "$date continuing drain  ", dev_status($latest), "\n";
            }
        } elsif ($latest->{status} eq 'drain') {
            print "$date marking alive for ", dev_status($latest), "\n";
            $no_redrain{$dev->{devid}}++ if $opts{noredrain};
            $madm->change_device_state($host_map{$latest->{hostid}},
                                       $latest->{devid}, 'alive');
            die "Failed marking device alive: " . $madm->errstr if $madm->err;
            $in_drain--;
            next;
        }
    }

    # Bail once nothing is in drain and nothing is waiting for a slot.
    last unless $in_drain || $pending;

    sleep $opts{interval};
}

sub reset_drains_and_exit {
    my $devs = shift;
    print "We were told to give up. Resetting devices to alive state...\n";
    # Only touch devices this run manages. A drain on any other device was
    # not ours to undo, and its host may be missing from %host_map.
    for my $ours (@$devs_to_drain) {
        my $dev = $devs->{$ours->{devid}} or next;
        next unless ($dev->{status} eq 'drain');
        $madm->change_device_state($host_map{$dev->{hostid}}, $dev->{devid},
                                   'alive');
        die "Failed marking device alive: " . $madm->errstr if $madm->err;
        print "Marked devid ", $dev->{devid}, " alive again.\n";
    }
    print "Done, exiting\n";
    exit;
}

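sub dev_status {
    my $d = shift;
    # Devices that have not reported usage yet have no mb_total; avoid
    # dividing by zero (or by undef) for those.
    my $total = $d->{mb_total} || 0;
    my $used  = $d->{mb_used}  || 0;
    my $pct   = $total ? $used / $total * 100 : 0;
    my $free  = ($total - $used) / 1024;
    return sprintf("[%3d] used: %3.2f%% free (G): %10.3f", 
                    $d->{devid}, $pct, $free);
}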

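sub should_drain {
    my $d = shift;
    # A device with no usage data yet cannot be judged; leave it alone.
    return 0 unless $d && $d->{mb_total};
    my $used = $d->{mb_used} || 0;
    return 1 if $opts{maxpercent} < $used / $d->{mb_total};
    return 1 if $opts{minfree}    > $d->{mb_total} - $used;
    return 0;
}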

# Returns filtered list of devices to drain. Skip dead/down, or if not on a
# valid host.
# Device must be:
# - 'alive' or already in 'drain'
# - on a valid host
# - more than $opts{maxpercent} full, or
#   have less than $opts{minfree} MB free
sub find_devices_to_drain {
    my $h = shift;
    my %hosts = map { $_->{hostid} => 1 } @$h;
    my $to_filter = [ grep { $_->{status} =~ m/^(alive|drain)$/ &&
                             $hosts{$_->{hostid}}    &&
                             should_drain($_)
                           } @{$madm->get_devices()} ];
    return $to_filter if (!$opts{devices} || $opts{devices} eq 'all');
    my %devs = map { $_ => 1 } split(/\s*,\s*/, $opts{devices});
    return [ grep { $devs{$_->{devid}} } @$to_filter ];
}

# Return filtered list of hosts to drain. Skip dead/down.
sub find_hosts_to_drain {
    my $to_filter = [ grep { $_->{status} eq 'alive' } 
                      @{$madm->get_hosts()} ];
    return $to_filter if (!$opts{hosts} || $opts{hosts} eq 'all');
    my %hosts = map { $_ => 1 } split(/\s*,\s*/, $opts{hosts});
    return [ grep { $hosts{$_->{hostname}} } @$to_filter ];
}

sub sigint_handler {
    # Restore the default handler so a second ^C really does kill us.
    $SIG{INT} = 'DEFAULT';
    print "Told to die. Wait one more interval so we can stop the drains.\n";
    print "Or hit ^C again to kill us anyway.\n";
    $die_soon++;
}

sub error_die {
    print "ERROR: ", $_[0], "\n\n";
    show_help();
    die;
}
