I have a question about file systems with replicated and non-replicated data.
We have a file system where metadata is set to copies=2 and data to copies=2;
we then use a placement policy to selectively store some data with only one
copy, based on fileset. We also place the non-replicated data into a specific
pool (6tnlsas) to ensure we know where it is placed.
My understanding was that, in doing this, if we took the disks holding the
non-replicated data offline, we’d still have the FS available to users, as the
metadata is replicated. Sure, accessing a non-replicated data file would give
an IO error, but the rest of the FS should be up.
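To make that expectation concrete, here is a toy model in Python. It is only a sketch of my assumption, not of how GPFS actually decides things: the `readable` helper and the `placements` table are made up for illustration, and the only real values are the two storage failure groups (110 for stg02, 210 for stg01).

```python
def readable(replica_groups, down_groups):
    """True if at least one replica sits in a failure group that is still up."""
    return any(group not in down_groups for group in replica_groups)

# Hypothetical placements: metadata and replicated data have one copy in each
# storage failure group; non-replicated data lives only in failure group 210.
placements = {
    "metadata": {110, 210},
    "replicated data": {110, 210},
    "non-replicated data": {210},
}

down = {210}  # taking all stg01 disks offline
status = {name: readable(groups, down) for name, groups in placements.items()}
print(status)
```

Under this model only the non-replicated data becomes unreadable, which is the behaviour I was expecting from the real file system.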
We had a situation today where we wanted to take stg01 offline, so we tried
using mmchdisk stop -d …. Once we got to about disk stg01-01_12_12, GPFS would
refuse to stop any more disks and complain about too many disks. Similarly, if
we shut down the NSD servers hosting the disks, the filesystem would have an
SGPanic and force unmount.
First, am I correct in thinking that a FS with non-replicated data but
replicated metadata should still be accessible (apart from the non-replicated
data) when the LUNs hosting it are down?
If so, any suggestions as to why my FS is panicking when we take down the one
set of disks?
I thought at first we had some non-replicated metadata, so I tried
mmrestripefs -R --metadata-only to force it to ensure 2 replicas, but this
didn’t help.
Running 5.0.0.2 on the NSD server nodes.
(First time we went round this we didn’t have a FS descriptor disk, but you can
see below that we added this)
Thanks
Simon
[root@nsd01 ~]# mmlsdisk castles -L
disk                    driver sector failure holds    holds                       disk storage
name                    type     size   group metadata data  status availability    id pool           remarks
----------------------- ------ ------ ------- -------- ----- ------ ------------ ----- -------------- -------
CASTLES_GPFS_DESCONLY01 nsd       512     310 no       no    ready  up               1 system         desc
stg01-01_3_3            nsd      4096     210 no       yes   ready  down             4 6tnlsas
stg01-01_4_4            nsd      4096     210 no       yes   ready  down             5 6tnlsas
stg01-01_5_5            nsd      4096     210 no       yes   ready  down             6 6tnlsas
stg01-01_6_6            nsd      4096     210 no       yes   ready  down             7 6tnlsas
stg01-01_7_7            nsd      4096     210 no       yes   ready  down             8 6tnlsas
stg01-01_8_8            nsd      4096     210 no       yes   ready  down             9 6tnlsas
stg01-01_9_9            nsd      4096     210 no       yes   ready  down            10 6tnlsas
stg01-01_10_10          nsd      4096     210 no       yes   ready  down            11 6tnlsas
stg01-01_11_11          nsd      4096     210 no       yes   ready  down            12 6tnlsas
stg01-01_12_12          nsd      4096     210 no       yes   ready  down            13 6tnlsas
stg01-01_13_13          nsd      4096     210 no       yes   ready  down            14 6tnlsas
stg01-01_14_14          nsd      4096     210 no       yes   ready  down            15 6tnlsas
stg01-01_15_15          nsd      4096     210 no       yes   ready  down            16 6tnlsas
stg01-01_16_16          nsd      4096     210 no       yes   ready  down            17 6tnlsas
stg01-01_17_17          nsd      4096     210 no       yes   ready  down            18 6tnlsas
stg01-01_18_18          nsd      4096     210 no       yes   ready  down            19 6tnlsas
stg01-01_19_19          nsd      4096     210 no       yes   ready  down            20 6tnlsas
stg01-01_20_20          nsd      4096     210 no       yes   ready  down            21 6tnlsas
stg01-01_21_21          nsd      4096     210 no       yes   ready  down            22 6tnlsas
stg01-01_ssd_54_54      nsd      4096     210 yes      no    ready  down            23 system
stg01-01_ssd_56_56      nsd      4096     210 yes      no    ready  down            24 system
stg02-01_0_0            nsd      4096     110 no       yes   ready  up              25 6tnlsas
stg02-01_1_1            nsd      4096     110 no       yes   ready  up              26 6tnlsas
stg02-01_2_2            nsd      4096     110 no       yes   ready  up              27 6tnlsas
stg02-01_3_3            nsd      4096     110 no       yes   ready  up              28 6tnlsas
stg02-01_4_4            nsd      4096     110 no       yes   ready  up              29 6tnlsas
stg02-01_5_5            nsd      4096     110 no       yes   ready  up              30 6tnlsas
stg02-01_6_6            nsd      4096     110 no       yes   ready  up              31 6tnlsas
stg02-01_7_7            nsd      4096     110 no       yes   ready  up              32 6tnlsas
stg02-01_8_8            nsd      4096     110 no       yes   ready  up              33 6tnlsas
stg02-01_9_9            nsd      4096     110 no       yes   ready  up              34 6tnlsas
stg02-01_10_10          nsd      4096     110 no       yes   ready  up              35 6tnlsas
stg02-01_11_11          nsd      4096     110 no       yes   ready  up              36 6tnlsas
stg02-01_12_12          nsd      4096     110 no       yes   ready  up              37 6tnlsas
stg02-01_13_13          nsd      4096     110 no       yes   ready  up              38 6tnlsas
stg02-01_14_14          nsd      4096     110 no       yes   ready  up              39 6tnlsas
stg02-01_15_15          nsd      4096     110 no       yes   ready  up              40 6tnlsas
stg02-01_16_16          nsd      4096     110 no       yes   ready  up              41 6tnlsas
stg02-01_17_17          nsd      4096     110 no       yes   ready  up              42 6tnlsas
stg02-01_18_18          nsd      4096     110 no       yes   ready  up              43 6tnlsas
stg02-01_19_19          nsd      4096     110 no       yes   ready  up              44 6tnlsas
stg02-01_20_20          nsd      4096     110 no       yes   ready  up              45 6tnlsas
stg02-01_21_21          nsd      4096     110 no       yes   ready  up              46 6tnlsas
stg02-01_ssd_22_22      nsd      4096     110 yes      no    ready  up              47 system         desc
stg02-01_ssd_23_23      nsd      4096     110 yes      no    ready  up              48 system
stg02-01_ssd_24_24      nsd      4096     110 yes      no    ready  up              49 system
stg02-01_ssd_25_25      nsd      4096     110 yes      no    ready  up              50 system
stg01-01_22_22          nsd      4096     210 no       yes   ready  up              51 6tnlsasnonrepl desc
stg01-01_23_23          nsd      4096     210 no       yes   ready  up              52 6tnlsasnonrepl
stg01-01_24_24          nsd      4096     210 no       yes   ready  up              53 6tnlsasnonrepl
stg01-01_25_25          nsd      4096     210 no       yes   ready  up              54 6tnlsasnonrepl
stg01-01_26_26          nsd      4096     210 no       yes   ready  up              55 6tnlsasnonrepl
stg01-01_27_27          nsd      4096     210 no       yes   ready  up              56 6tnlsasnonrepl
stg01-01_31_31          nsd      4096     210 no       yes   ready  up              58 6tnlsasnonrepl
stg01-01_32_32          nsd      4096     210 no       yes   ready  up              59 6tnlsasnonrepl
stg01-01_33_33          nsd      4096     210 no       yes   ready  up              60 6tnlsasnonrepl
stg01-01_34_34          nsd      4096     210 no       yes   ready  up              61 6tnlsasnonrepl
stg01-01_35_35          nsd      4096     210 no       yes   ready  up              62 6tnlsasnonrepl
stg01-01_36_36          nsd      4096     210 no       yes   ready  up              63 6tnlsasnonrepl
stg01-01_37_37          nsd      4096     210 no       yes   ready  up              64 6tnlsasnonrepl
stg01-01_38_38          nsd      4096     210 no       yes   ready  up              65 6tnlsasnonrepl
stg01-01_39_39          nsd      4096     210 no       yes   ready  up              66 6tnlsasnonrepl
stg01-01_40_40          nsd      4096     210 no       yes   ready  up              67 6tnlsasnonrepl
stg01-01_41_41          nsd      4096     210 no       yes   ready  up              68 6tnlsasnonrepl
stg01-01_42_42          nsd      4096     210 no       yes   ready  up              69 6tnlsasnonrepl
stg01-01_43_43          nsd      4096     210 no       yes   ready  up              70 6tnlsasnonrepl
stg01-01_44_44          nsd      4096     210 no       yes   ready  up              71 6tnlsasnonrepl
stg01-01_45_45          nsd      4096     210 no       yes   ready  up              72 6tnlsasnonrepl
stg01-01_46_46          nsd      4096     210 no       yes   ready  up              73 6tnlsasnonrepl
stg01-01_47_47          nsd      4096     210 no       yes   ready  up              74 6tnlsasnonrepl
stg01-01_48_48          nsd      4096     210 no       yes   ready  up              75 6tnlsasnonrepl
stg01-01_49_49          nsd      4096     210 no       yes   ready  up              76 6tnlsasnonrepl
stg01-01_50_50          nsd      4096     210 no       yes   ready  up              77 6tnlsasnonrepl
stg01-01_51_51          nsd      4096     210 no       yes   ready  up              78 6tnlsasnonrepl
Number of quorum disks: 3
Read quorum value: 2
Write quorum value: 2
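
For what it's worth, here is how I read the descriptor quorum numbers above, as a small Python sketch. It assumes descriptor quorum is simply the count of reachable desc disks compared against the reported quorum values, which may well be an oversimplification of what GPFS does; the `surviving_desc` helper is made up for illustration, while the disk names and failure groups are taken from the mmlsdisk output.

```python
# Descriptor ("desc") disks from the mmlsdisk output: name -> failure group.
desc_disks = {
    "CASTLES_GPFS_DESCONLY01": 310,
    "stg02-01_ssd_22_22": 110,
    "stg01-01_22_22": 210,
}
READ_QUORUM = 2   # "Read quorum value" reported by mmlsdisk
WRITE_QUORUM = 2  # "Write quorum value" reported by mmlsdisk

def surviving_desc(disks, down_groups):
    """Names of descriptor disks whose failure group is not down."""
    return [name for name, fg in disks.items() if fg not in down_groups]

# Take every disk in failure group 210 (stg01) offline.
left = surviving_desc(desc_disks, {210})
quorum_ok = len(left) >= max(READ_QUORUM, WRITE_QUORUM)
print(left, quorum_ok)
```

On this simple count, two of the three descriptor disks would remain if all of failure group 210 went down, so descriptor quorum alone doesn't obviously explain the panic.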
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss