Thanks Marc! That's incredibly helpful info. I'll, uh, not use the test pit command :)

-Aaron

On 8/16/16 5:09 PM, Marc A Kaplan wrote:
I was surprised to read that Ctrl-C did not really kill the restripe.
It's supposed to! If it doesn't, that's a bug.

I ran this by my expert within IBM and he wrote to me:

First of all, a "PIT job" such as restripe, deldisk, delsnapshot, and
such should be easy to stop by ^C'ing the management program that
started them. The SG manager daemon holds open a socket to the client
program for the purposes of sending command output, progress updates,
error messages, and the like. The PIT code checks this socket
periodically and aborts the PIT process cleanly if the socket is closed.
If this cleanup doesn't occur, it is a bug and worth reporting.
However, there's no exact guarantee on how quickly each thread on the SG
mgr will notice, and then how quickly the helper nodes can be stopped,
and so forth. The interval between socket checks depends, among other
things, on how long it takes to process each file; if there are a few
very large files, the delay can be significant. In the limiting case,
where most of the FS storage is contained in a few files, this mechanism
doesn't work [elided] well. So wrapping up a PIT operation can sometimes
be quite involved and slow.

The simplest way to determine whether the command has really stopped is
mmdiag --commands, issued on the SG manager node. This shows running
commands with the command line, start time, socket, flags, etc. After
^C'ing the client program, the entry here should linger for a while,
then go away. When the command exits you'll see an entry in the GPFS log
file where it fails with err 50. If the command doesn't stop after a
while, it is worth looking into.
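As a sketch, one could poll that output on the SG manager node until the
entry disappears; note that "mmrestripefs" as the command name and the
30-second interval below are illustrative assumptions, not from Marc's
note:

```shell
# Sketch: run on the SG manager node after ^C'ing the client program.
# Loop until mmdiag --commands no longer lists the interrupted command.
# "mmrestripefs" and the 30-second interval are illustrative choices.
while /usr/lpp/mmfs/bin/mmdiag --commands | grep -q mmrestripefs; do
    echo "command still listed; waiting..."
    sleep 30
done
echo "entry gone; check the GPFS log for the err 50 exit record"
```

If the entry never goes away, that's the "worth looking into" case.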

If the command wasn't issued on the SG mgr node and you can't find the
node where the client command is running, the socket is still a useful
hint. While tedious, it should be possible to trace this socket back to
the node where the command was originally run using netstat or
equivalent. Poking around inside a GPFS internaldump will also provide
clues; there should be an outstanding sgmMsgSGClientCmd command listed
in the dump tscomm section. Once you find it, just kill `pidof
mmrestripefs` or similar.
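A rough sketch of that chase with standard tools; the port number is an
assumption (1191 is the registered GPFS daemon port, but a cluster's
tscTcpPort setting may differ), and mmrestripefs again just stands in
for whichever client command was interrupted:

```shell
# Sketch: on the SG manager node, list established TCP peers on the GPFS
# daemon port to see which node the client command connected from.
# Port 1191 is an assumption -- check your cluster's tscTcpPort setting.
netstat -tnp 2>/dev/null | grep ':1191'    # or: ss -tnp 'sport = :1191'

# Then, on the node identified above, kill the client command itself.
# "mmrestripefs" is illustrative; substitute the command you interrupted.
kill "$(pidof mmrestripefs)"
```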

I'd like to warn the OP away from mmfsadm test pit. These commands are,
of course, unsupported and not recommended for any purpose (even for
internal test and development, as far as I know). You are definitely
working without a net there. When I was improving the integration
between PIT and snapshot quiesce a few years ago, I looked into this and
couldn't figure out how to (easily) make these stop and resume commands
safe to use, so as far as I know they remain unsafe. The list command,
however, is probably fairly okay, but it would still be better to use
mmfsadm saferdump pit.





From:        Aaron Knister <[email protected]>
To:        <[email protected]>
Date:        08/15/2016 10:49 PM
Subject:        [gpfsug-discuss] mmfsadm test pit
Sent by:        [email protected]
------------------------------------------------------------------------



I just discovered this interesting gem poking at mmfsadm:

 test pit fsname list|suspend|status|resume|stop [jobId]

There have been times when I've kicked off a restripe and, either
intentionally or accidentally, ctrl-c'd it, only to realize that it has
disappeared into the ether and is still running. The only way I've known
so far to stop it is with mmchmgr.

A far more painful instance happened when I ran a rebalance on an fs
w/ more than 31 NSDs using more than 31 PIT workers and hit *that* fun
APAR, which locked up access to a single filesystem from all 3.5k nodes.
We spent 48 hours round the clock rebooting nodes as jobs drained to
clear it up. In that instance I would have killed for a way to cancel
the PIT job (the mmchmgr trick didn't work). It looks like you might
actually be able to do this with mmfsadm, although how wise that is, I
do not know (kinda curious about that).

Here's an example. I kicked off a restripe and then ctrl-c'd it on a
client node, then ran these commands from the fs manager:

root@loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list
JobId 785979015170 PitJobStatus PIT_JOB_RUNNING progress 0.00
debug: statusListP D40E2C70

root@loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal stop
785979015170
debug: statusListP 0

root@loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list
JobId 785979015170 PitJobStatus PIT_JOB_STOPPING progress 4.01
debug: statusListP D4013E70

... some time passes ...

root@loremds19:~ # /usr/lpp/mmfs/bin/mmfsadm test pit tlocal list
debug: statusListP 0

Interesting.

-Aaron

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss







