Hi All,
And I’m not really down, but it is a rainy Monday morning here and GPFS did
give me a scare in the last hour, so I thought that was a funny subject line.
So I have a >1 PB filesystem with 3 pools: 1) the system pool, which contains
metadata only, 2) the data pool, which is where all I/O goes by default,
and 3) the capacity pool, which is where old crap gets migrated to.
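For context, it's the standard ILM setup; conceptually the policy looks
something like the sketch below (pool names are ours, but the rules and the
365-day cutoff are purely illustrative, not our exact policy):

   /* new files land in the data pool by default */
   RULE 'default-placement' SET POOL 'data'

   /* stale files get pushed down to the capacity pool */
   RULE 'age-out' MIGRATE FROM POOL 'data' TO POOL 'capacity'
        WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 365
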
I logged on this morning to see an alert that my data pool was 100% full. I
ran an mmdf from the cluster manager and, sure enough:
(pool total)          509.3T              0 ( 0%)             0 ( 0%)
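(That's the summary line for the data pool from something like the following;
"gpfs0" is a stand-in for our actual device name:)

   # per-pool usage, restricted to the data pool
   mmdf gpfs0 -P data
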
I immediately tried copying a file there and it worked, so I figured GPFS
must be failing writes over to the capacity pool, but an mmlsattr on the file I
copied showed it was in the data pool. Hmmm.
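The quick test was basically this (paths and names are placeholders):

   # drop a test file into the filesystem ...
   cp /tmp/testfile /gpfs/gpfs0/scratch/

   # ... then ask GPFS which pool it actually landed in
   mmlsattr -L /gpfs/gpfs0/scratch/testfile | grep -i 'storage pool'
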
I also noticed that “df -h” said that the filesystem had 399 TB free, while
mmdf said it only had 238 TB free. Hmmm.
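Side by side it was roughly this, with the two views disagreeing on free space
(placeholder names again):

   # OS view of free space
   df -h /gpfs/gpfs0

   # GPFS view, broken out per pool
   mmdf gpfs0
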
So after some fruitless poking around I decided that, whatever was going to
happen, I should kill the mmrestripefs I had running on the capacity pool … let
me emphasize that … I had a restripe running on the capacity pool only (via the
“-P” option to mmrestripefs), yet it was the data pool that said it was 100%
full.
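For the record, the restripe had been launched roughly like this (I'm showing
-b / rebalance purely to illustrate the form; the relevant part is the -P
restriction to the capacity pool):

   # restripe restricted to the capacity pool only
   mmrestripefs gpfs0 -b -P capacity
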
I’m sure many of you have already figured out where this is going … after
killing the restripe I ran mmdf again and:
(pool total)          509.3T           159T ( 31%)        1.483T ( 0%)
I have never seen anything like this before … any ideas, anyone? PMR time?
Thanks!
Kevin