We do not have as many client nodes, but we have an extensive Ganglia
configuration that monitors all of the nodes on our network.

For the client nodes we also run a script that pushes stats into Ganglia
using 'mmpmon'.

Using this, we have been able to locate problem machines much more quickly.

I have attached the script; it is released on an 'it works for me' basis.

We run it every minute from cron.

Vince.

--

Vincent Andrews
NOC,
European Way,
Southampton,
SO14 3ZH
Ext. 27616
External 023 80597616





On 09/12/2013 21:21, "Alex Chekholko" <[email protected]> wrote:

>Hi Richard,
>
>For IB traffic, you can use 'collectl -sx'
>http://collectl.sourceforge.net/Infiniband.html
>or else mmpmon (which is what 'dstat --gpfs' uses underneath anyway)
>
>If your other NSDs are full, then of course all writes will go to the
>empty NSDs.  And then, when reading those new files, your performance
>will be limited to just the new NSDs.
>
>
>Regards,
>Alex
>
>On 12/09/2013 01:05 PM, Richard Lefebvre wrote:
>> Hi Alex,
>>
>> I should have mentioned that my GPFS network is done through
>> infiniband/RDMA, so looking at the TCP probably won't work. I will try
>> to see if the traffic can be seen through ib0 (instead of eth0), but I
>> have my doubts.
>>
>> As for the placement: the file system was 95% full when I added the new
>> NSDs. I know from the waiters commands that what is waiting now is the
>> I/O to the 2 NSDs:
>>
>> waiting 0.791707000 seconds, NSDThread: for I/O completion on disk d9
>>
>> I have added more NSDs since then but the waiting is still on the 2
>> disks. None of the others.
>>
>> Richard
>>
>> On 12/09/2013 02:52 PM, Alex Chekholko wrote:
>>> Hi Richard,
>>>
>>> I would just use something like 'iftop' to look at the traffic between
>>> the nodes.  Or 'collectl'.  Or 'dstat'.
>>>
>>> e.g. dstat -N eth0 --gpfs --gpfs-ops --top-cpu-adv --top-io 2 10
>>> http://dag.wiee.rs/home-made/dstat/
>>>
>>> For the NSD balance question, since GPFS stripes the blocks evenly
>>> across all the NSDs, they will end up balanced over time.  Or you can
>>> rebalance manually with 'mmrestripefs -b' or similar.
>>>
>>> It is unlikely that particular files ended up on a single NSD, unless
>>> the other NSDs are totally full.
>>>
>>> Regards,
>>> Alex
>>>
>>> On 12/06/2013 04:31 PM, Richard Lefebvre wrote:
>>>> Hi,
>>>>
>>>> I'm looking for a way to see which node (or nodes) is having an impact
>>>> on the gpfs server nodes and slowing the whole file system. What
>>>> happens, usually, is that a user is doing some I/O that doesn't fit the
>>>> configuration of the gpfs file system or the way users were told to
>>>> use it efficiently.  It is usually by doing a lot of unbuffered,
>>>> byte-sized, very random I/O on a file system that was made for large
>>>> files and large block sizes.
>>>>
>>>> My problem is finding out who is doing that. With over 600 client
>>>> nodes, I haven't found a way to pinpoint the node or nodes that could
>>>> be the source of the problem.
>>>>
>>>> I tried to use "mmlsnodes -N waiters -L" but there are too many waiters
>>>> for me to pinpoint anything.
>>>>
>>>> I must be missing something simple. Anyone got any help?
>>>>
>>>> Note: there is another thing I'm trying to pinpoint. A temporary
>>>> imbalance was created by adding a new NSD. It seems that a group of
>>>> files has been created on that same NSD and a user keeps hitting that
>>>> NSD, causing a high load.  I'm trying to pinpoint the origin of that
>>>> too, at least until everything is balanced again. But will rebalancing
>>>> spread those files, since they are already on the emptiest NSD?
>>>>
>>>> Richard
>>>> _______________________________________________
>>>> gpfsug-discuss mailing list
>>>> gpfsug-discuss at gpfsug.org
>>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>>>
>>>
>>
>>
>
>--
>Alex Chekholko [email protected] 347-401-4860


#!/usr/bin/perl

###########################################################################
# Author: Vincent Andrews
# Collects stats from GPFS mmpmon to display Client node information (io_s).
###########################################################################

# Prefer /usr/bin/gmetric and fall back to /bin/gmetric
if ( -f "/usr/bin/gmetric" ) {
	$gmetric_command = "/usr/bin/gmetric";
} else {
	$gmetric_command = "/bin/gmetric";
}

$mmpmon_command = "/usr/lpp/mmfs/bin/mmpmon";

if ( ! -x $gmetric_command ) {
	die("Gmetric command is not executable. Exiting...");
}

if ( ! -x $mmpmon_command ) {
	die("Not a GPFS Cluster member. Exiting...");
}

# See the gpfs_stat hash below for the list of all metrics
# and their field indexes in the mmpmon output line
@which_metrics = split(/ /, "12 14 16 18 20 22 24 26");

# Where to store the last stats file
$tmp_dir_base="/var/tmp/gpfs_node_stats";

$tmp_stats_file=$tmp_dir_base . "/" . "gpfs_server_stats";
$tmp_stats_file_new=$tmp_dir_base . "/" . "gpfs_server_stats_new";
$metric_prefix = "_io_s_";

###########################################################################
# This is the order of metrics from mmpmon io_s 
###########################################################################
%gpfs_stat = (
	12 => "Bytes_Read",
	14 => "Bytes_Written",
	16 => "Open-Create_Requests",
	18 => "Close_Requests",
	20 => "App_Read_Requests",
	22 => "App_Write_Requests",
	24 => "ReadDir_Requests",
	26 => "Inode_Updates"
);

# If the tmp directory doesn't exist, create it
if ( ! -d $tmp_dir_base ) {
	system("mkdir -p $tmp_dir_base");
}

###############################################################################
# We need to store a baseline with statistics. If it's not there let's dump 
# it into the file. Don't do anything else
###############################################################################
if ( ! -f $tmp_stats_file ) {
	print "Creating baseline. No output this cycle\n";
	system("/usr/lpp/mmfs/bin/mmpmon -p -i /root/node-cmd | tr -s ' ' ' ' > $tmp_stats_file");
} else {

	# Let's read in the file from the last poll
	open(OLDGPFSDSTATUS, "< $tmp_stats_file") or die("Cannot read $tmp_stats_file: $!");
	
	while(<OLDGPFSDSTATUS>)
	{
		my($line) = $_;
		chomp($line);
			@old_stats = split(/ /,$line);
			last;
	}

	
	# Get the time stamp when the stats file was last modified
	$old_time = (stat $tmp_stats_file)[9];
	close(OLDGPFSDSTATUS);

	system("/usr/lpp/mmfs/bin/mmpmon -p -i /root/node-cmd | tr -s ' ' ' ' > $tmp_stats_file_new");

	open(GPFSDSTATUS, "< $tmp_stats_file_new") or die("Cannot read $tmp_stats_file_new: $!");
	
	$new_time = time(); 
	
	while(<GPFSDSTATUS>)
	{
		my($line) = $_;
		chomp($line);
			@new_stats = split(/ /,$line);
			system("echo '$line' >  $tmp_stats_file");
			last;
	}
	
	close(GPFSDSTATUS);

       # Time difference between this poll and the last poll
        my $time_difference = $new_time - $old_time;
        if ( $time_difference < 1 ) {
                die("Time difference can't be less than 1");
        }

	# Calculate deltas and send them to ganglia
	for ( $i = 0 ; $i <= $#which_metrics; $i++ ) {
		my $metric = $which_metrics[$i];
		my $delta = $new_stats[$metric] - $old_stats[$metric];
#		print " metric=$metric  new=$new_stats[$metric] old=$old_stats[$metric] delta=$delta" ;
		my $rate = int($delta / $time_difference);

               if ( $rate < 0 ) {
                        print "Something is fishy. Rate for " . $metric . " shouldn't be negative. Perhaps counters were reset. Doing nothing\n";
                } else {
#			print "$gpfs_stat{$metric} = $rate / sec\n";
                      # uint32 rather than uint16: byte rates easily exceed 65535 per second
                      system("$gmetric_command --type=uint32  --name=$metric_prefix$gpfs_stat{$metric} --value=$rate");
#	print "$gmetric_command --type=uint32  --name=$metric_prefix$gpfs_stat{$metric} --value=$rate";
                }
        }
}
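
A note on the field indexes used in the script (12, 14, ... 26): they are positions in the single line that 'mmpmon -p' prints for an io_s request, where each keyword is followed by its value. The sketch below looks values up by keyword instead of by position and works out a per-second rate from two samples. It is only an illustration and not part of the attached script: the sample lines and their values are made up, the helper names io_s_field and io_s_rate are hypothetical, and it takes the interval from the _t_ timestamps inside the lines, whereas the script uses the modification time of the stats file.

```shell
#!/bin/sh
# Made-up sample lines in the 'mmpmon -p' io_s layout; the values are illustrative only.
old='_io_s_ _n_ 192.0.2.10 _nn_ node1 _rc_ 0 _t_ 1386620460 _tu_ 0 _br_ 6291456 _bw_ 314572800 _oc_ 10 _cc_ 16 _rdc_ 101 _wc_ 300 _dir_ 7 _iu_ 2'
new='_io_s_ _n_ 192.0.2.10 _nn_ node1 _rc_ 0 _t_ 1386620520 _tu_ 0 _br_ 12582912 _bw_ 314572800 _oc_ 12 _cc_ 18 _rdc_ 161 _wc_ 300 _dir_ 7 _iu_ 2'

# io_s_field KEYWORD LINE: print the value that follows KEYWORD (hypothetical helper).
io_s_field() {
    printf '%s\n' "$2" | awk -v key="$1" '{
        for (i = 1; i < NF; i++) if ($i == key) { print $(i + 1); exit }
    }'
}

# io_s_rate KEYWORD OLD NEW: integer per-second rate between two samples
# (hypothetical helper). Fails without output if time did not advance or
# the counter went backwards, e.g. after a counter reset.
io_s_rate() {
    awk -v a="$(io_s_field "$1" "$2")" -v b="$(io_s_field "$1" "$3")" \
        -v t0="$(io_s_field _t_ "$2")" -v t1="$(io_s_field _t_ "$3")" \
        'BEGIN { if (t1 <= t0 || b < a) exit 1; printf "%d\n", (b - a) / (t1 - t0) }'
}

# Bytes read per second over the 60 seconds between the two samples
io_s_rate _br_ "$old" "$new"
```

Looking fields up by keyword (_br_, _bw_, _oc_, _cc_, _rdc_, _wc_, _dir_, _iu_) avoids depending on the exact spacing of the line, which the script otherwise has to normalise with 'tr -s'.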
