Hey,
Attached is a short utility meant to help maintain a MogileFS cluster.
The tool takes a list of constraints, e.g.:
./mogautodrain --trackers="127.0.0.1:6001" --hosts="all"
--maxpercent="50" --noredrain --parallel 2
... will crawl all alive devices on all alive hosts and put every device
that is over 50% full into drain mode. It will keep draining them, two
at a time, until it has gotten each device down to 50% at least once
(--noredrain). By default (without --noredrain) it will continue to
mark/unmark devices to/from drain until all of the ones specified are at
50% or less at the same time. Then it will exit.
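To make the loop above a bit more concrete, here is a minimal sketch of what
a single pass over the device list decides. It is not the code from the
attached script: plan_pass and the sample device hashes are made up for
illustration, it only looks at the percent-full threshold, and it ignores
--noredrain and --minfree.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what one pass over the device list would do.
# $devs is an arrayref of hashes with devid, status, mb_used and mb_total.
# Returns an arrayref of [devid, new_state] pairs.
sub plan_pass {
    my ($devs, $max_fraction, $parallel) = @_;

    # Count the drains already in flight before handing out new slots.
    my $in_drain = grep { $_->{status} eq 'drain' } @$devs;
    my @actions;

    for my $d (@$devs) {
        my $over = $d->{mb_total} > 0
            && $d->{mb_used} / $d->{mb_total} > $max_fraction;

        if ($over && $d->{status} eq 'alive' && $in_drain < $parallel) {
            # Too full and a drain slot is free: start draining.
            push @actions, [ $d->{devid}, 'drain' ];
            $in_drain++;
        } elsif (!$over && $d->{status} eq 'drain') {
            # Back under the threshold: put it back into service.
            push @actions, [ $d->{devid}, 'alive' ];
            $in_drain--;
        }
    }
    return \@actions;
}

# Made-up sample data: three devices over 50%, one finished drain.
my @devs = (
    { devid => 1, status => 'alive', mb_used => 900, mb_total => 1000 },
    { devid => 2, status => 'alive', mb_used => 800, mb_total => 1000 },
    { devid => 3, status => 'alive', mb_used => 700, mb_total => 1000 },
    { devid => 4, status => 'drain', mb_used => 400, mb_total => 1000 },
);

my $actions = plan_pass(\@devs, 0.50, 2);
print join(', ', map { join('=', @$_) } @$actions), "\n";
# prints "1=drain, 4=alive"
```

Device 2 is not started in this pass because both slots were taken when it
was examined; the slot freed by device 4 would be picked up on the next pass.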
What this is useful for:
- MogileFS's rebalance doesn't work very well. Works okay but not very well.
- If you have a particular host which is too full of old, idle files,
and you want to clear it off a bit so new files can be written to it.
- If you add new hosts and want to shovel a portion of the files onto them.
- If you want to drain a couple devices to zero then chain other
commands afterwards (./mogautodrain [blah] && mogadm device blah blah, or
simply '&& mail -s "Toast's done!" blah blah').
What this is NOT useful for:
- Running all the time to "guarantee" devices are at a certain
percentage free. Wrong, dangerous, and if you really need to, use the
'minfree' mogilefsd config option.
- Rebalancing utilization. MFS does that itself to an extent, but this
tool cannot guarantee files you move are actually active in any way.
- Other such evil.
What it's missing:
- Tracker commands need to be wrapped in a retry, since this is a
long-running tool.
- Mode to detect "stuck" drains and move on / blacklist.
- Dunno. List shuffle?
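For the first item, something along these lines might do. This is only a
sketch: with_retry, its tries/delay arguments, and the fake flaky call are
all made up for illustration and are not part of MogileFS::Admin.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, retrying when it dies.
# $code must die on failure and return a scalar result on success.
sub with_retry {
    my ($code, %args) = @_;
    my $tries = $args{tries} || 3;
    my $delay = defined $args{delay} ? $args{delay} : 5;
    my $err;

    for my $attempt (1 .. $tries) {
        my $result;
        # The trailing 1 makes eval return true only if $code did not die.
        return $result if eval { $result = $code->(); 1 };
        $err = $@ || 'unknown error';
        warn "attempt $attempt of $tries failed: $err";
        sleep $delay if $delay && $attempt < $tries;
    }
    die "giving up after $tries attempts: $err";
}

# Stand-in for a tracker call that fails twice and then succeeds.
my $calls = 0;
my $value = with_retry(sub {
    die "tracker unreachable\n" if ++$calls < 3;
    return 'ok';
}, tries => 5, delay => 0);

print "$value after $calls calls\n";
# prints "ok after 3 calls"
```

In the script itself the closure would presumably wrap a call such as
$madm->get_devices() and die with $madm->errstr whenever $madm->err is set,
so that a briefly unreachable tracker does not kill a multi-hour drain.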
Why is this not part of MogileFS::Utils (yet):
- Because rebalance should probably just be fixed.
Since that might take a while, I wouldn't be against adding it to
MogileFS::Utils. However, I want to run the idea by the list first.
I do like the ability to constrain operations to specific devices/hosts,
which will require significant reworking of rebalance to achieve.
But I agree it's an evil utility. Enjoy.
-Dormando
#!/usr/bin/perl
# Written by dormando, to be distributed under the same terms as
# MogileFS::Utils
use strict;
use warnings FATAL => 'all';
use MogileFS::Admin;
use MogileFS::Client;
use Getopt::Long;
use POSIX qw(strftime);
use Data::Dumper qw(Dumper);
my $debug = 0;
# On ^C set a flag to die after this cycle.
my $die_soon = 0;
$|++;
my %opts = ( interval => 30, parallel => 1, noredrain => 0 );
my $rv = GetOptions(\%opts, 'help|h', 'interval|i=i', 'trackers=s',
'hosts=s', 'devices=s', 'maxpercent=s', 'minfree=i',
'fulldrain', 'parallel=i', 'noredrain');
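# Show help on --help, and also when GetOptions() rejected an option, so a
# typo'd flag never lets us carry on and start changing device states.
if (!$rv || $opts{help}) {
    show_help();
    exit($rv ? 0 : 1);
}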
# FIXME: Finish help. Perldocs?
sub show_help {
    print "Usage: $0 --trackers='127.0.0.1:6001' --hosts='sto1' --maxpercent='90'\n";
    print qq{Options:
  --help         display this help
  --trackers     list of mogile trackers ex: "ip:port,ip:port"
  --hosts        only drain these hosts ex: "sto1,sto2" or "all"
  --devices      only drain these devices ex: "1,2,5,33,17"
  --maxpercent   ensure devices are at most this % full ex: "80.5"
  --minfree      minimum free space (MB) devices should have
  --fulldrain    drain all specified devices to 0%
  --parallel     max devices to drain in parallel ex: "3"
  --noredrain    only drain each device once during run
  --interval     time in seconds between drain checks and log output
};
}
$SIG{INT} = \&sigint_handler;
# Split up the tracker opt into an array.
if ($opts{trackers}) {
    $opts{trackers} = [ split(/\s*,\s*/, $opts{trackers}) ];
} else {
    error_die('You must supply a tracker, like: --trackers="127.0.0.1:6001"');
}
my $madm = MogileFS::Admin->new(hosts => $opts{trackers});
die "Could not connect to a tracker." unless $madm;
error_die('Must specify --maxpercent or --minfree or --fulldrain')
unless ($opts{maxpercent} || $opts{minfree} || $opts{fulldrain});
error_die('Must specify at least --hosts="all" or --devices="all"')
unless ($opts{hosts} || $opts{devices});
# Figure the requested constraints on full-ness.
# --fulldrain overrides --maxpercent to '0'... Note this might not always
# work. So we should add an extra check under --fulldrain to move onto the
# next device if it's gotten stuck.
$opts{maxpercent} = 0 if $opts{fulldrain};
error_die("We're too stupid to handle maxpercent AND minfree at the same time")
if ($opts{maxpercent} && $opts{minfree});
error_die('maxpercent should look like --maxpercent="87.3" or similar')
if ($opts{maxpercent} && $opts{maxpercent} !~ m/^\d+(\.\d+)?$/);
error_die('Set minfree to whatever number you want, but not negative or zero.')
if ($opts{minfree} && $opts{minfree} < 1);
error_die('Parallel devices to drain must be at least 1')
if ($opts{parallel} < 1);
# Set some handy defaults to make the filtering stage simpler.
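# Test with defined()/length() rather than truthiness: --fulldrain sets
# maxpercent to 0, which must not be replaced by the "never drain" default.
$opts{maxpercent} = (defined $opts{maxpercent} && length $opts{maxpercent})
    ? $opts{maxpercent} / 100
    : 1;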
$opts{minfree} = 0 unless $opts{minfree};
# We've sufficiently babysat our user for the constraints. Now we need to
# discover what devices to drain.
print "HOSTS\n" if $debug;
my $hosts_to_drain = find_hosts_to_drain();
print Dumper($hosts_to_drain) if $debug;
print "\n\n\nDEVICES\n" if $debug;
my $devs_to_drain = find_devices_to_drain($hosts_to_drain);
print Dumper($devs_to_drain) if $debug;
print "Found ", scalar @$devs_to_drain, " devices to drain...\n";
# $in_drain is a tally of the in-flight drains.
my $in_drain = 0;
# Do a pre-loop to find any currently _in_ drain...
for my $dev (@$devs_to_drain) {
    $in_drain++ if ($dev->{status} eq 'drain');
}
# Now loop until all devices to filter match constraints.
my %no_redrain = ();
my %host_map = map { $_->{hostid} => $_->{hostname} } @$hosts_to_drain;
while (1) {
    my %latest_devs = map { $_->{devid} => $_ } @{$madm->get_devices()};
    reset_drains_and_exit(\%latest_devs) if $die_soon;
    for my $dev (@$devs_to_drain) {
        my $latest = $latest_devs{$dev->{devid}};
        my $date = strftime "%Y/%m/%d-%H:%M:%S%z(%Z)", localtime;
        if (should_drain($latest)) {
            if ($latest->{status} eq 'alive' && $in_drain < $opts{parallel}) {
                next if $opts{noredrain} && $no_redrain{$dev->{devid}};
                print "$date marking drain for ", dev_status($latest), "\n";
                $madm->change_device_state($host_map{$latest->{hostid}},
                    $latest->{devid}, 'drain');
                die "Failed marking device into drain: " . $madm->errstr
                    if $madm->err;
                $in_drain++;
                next;
            } elsif ($latest->{status} eq 'drain') {
                print "$date continuing drain ", dev_status($latest), "\n";
            }
        } elsif ($latest->{status} eq 'drain') {
            print "$date marking alive for ", dev_status($latest), "\n";
            $no_redrain{$dev->{devid}}++ if $opts{noredrain};
            $madm->change_device_state($host_map{$latest->{hostid}},
                $latest->{devid}, 'alive');
            die "Failed marking device alive: " . $madm->errstr if $madm->err;
            $in_drain--;
            next;
        }
    }
    # Bail if we have no devices in drain anymore.
    last unless $in_drain;
    sleep $opts{interval};
}
sub reset_drains_and_exit {
    my $devs = shift;
    print "We were told to give up. Resetting devices to alive state...\n";
    for my $dev (values %$devs) {
        next unless ($dev->{status} eq 'drain');
        $madm->change_device_state($host_map{$dev->{hostid}}, $dev->{devid},
            'alive');
        die "Failed marking device alive: " . $madm->errstr if $madm->err;
        print "Marked devid ", $dev->{devid}, " alive again.\n";
    }
    print "Done, exiting\n";
    exit;
}
sub dev_status {
    my $d = shift;
    my $used = $d->{mb_used} || 0;
    # Guard against devices that have not reported a total size yet.
    my $pct = $d->{mb_total} ? $used / $d->{mb_total} * 100 : 0;
    my $free = (($d->{mb_total} || 0) - $used) / 1024;
    return sprintf("[%3d] used: %3.2f%% free (G): %10.3f",
        $d->{devid}, $pct, $free);
}
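sub should_drain {
    my $d = shift;
    # A device that has not reported its size yet cannot be judged; skip it
    # instead of dividing by zero.
    return 0 unless $d->{mb_total};
    my $used = $d->{mb_used} || 0;
    return 1 if $opts{maxpercent} < $used / $d->{mb_total};
    return 1 if $opts{minfree} > $d->{mb_total} - $used;
    return 0;
}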
# Returns filtered list of devices to drain. Skip dead/down, or if not on a
# valid host.
# Device must be:
# - 'alive' or already in 'drain'
# - on a valid host
# - more than $opts{maxpercent} full, or have less than $opts{minfree} MB free
sub find_devices_to_drain {
    my $h = shift;
    my %hosts = map { $_->{hostid} => 1 } @$h;
    my $to_filter = [ grep { $_->{status} =~ m/^(alive|drain)$/ &&
                             $hosts{$_->{hostid}} &&
                             should_drain($_)
                           } @{$madm->get_devices()} ];
    return $to_filter if (!$opts{devices} || $opts{devices} eq 'all');
    my %devs = map { $_ => 1 } split(/\s*,\s*/, $opts{devices});
    return [ grep { $devs{$_->{devid}} } @$to_filter ];
}
# Return filtered list of hosts to drain. Skip dead/down.
sub find_hosts_to_drain {
    my $to_filter = [ grep { $_->{status} eq 'alive' }
                      @{$madm->get_hosts()} ];
    return $to_filter if (!$opts{hosts} || $opts{hosts} eq 'all');
    my %hosts = map { $_ => 1 } split(/\s*,\s*/, $opts{hosts});
    return [ grep { $hosts{$_->{hostname}} } @$to_filter ];
}
sub sigint_handler {
    print "Told to die. Wait one more interval so we can stop the drains.\n";
    print "Or hit ^C again to kill us anyway.\n";
    # Restore the default handler so a second ^C really does kill us.
    $SIG{INT} = 'DEFAULT';
    $die_soon++;
}
sub error_die {
    print "ERROR: ", $_[0], "\n\n";
    show_help();
    die;
}