Hello all,
we fixed our problem with the help of Spectrum Scale Support. The commands that
fixed it were 'mmcommon recoverfs tsmconf' and
'tsdeldisk tsmconf -d "nsd_g4_tsmconf"'.
The root cause: to delete a disk from a file system, all disks of that file
system must be reachable from the requesting host. In our configuration there
were no NSD server definitions, and the quorum buster node had no access to the
SAN-attached disks.
My recommendation:
This should be documented for highly available configurations with a 3-site
implementation, or the commands that update the NSD descriptors on each disk
should first check whether all disks are reachable instead of causing an
SG panic.

Regards Renar


Renar Grunenberg
IT Department – Operations

HUK-COBURG
Bahnhofsplatz
96444 Coburg
Phone:          09561 96-44110
Fax:            09561 96-44104
Email:  [email protected]
Web:            www.huk.de
________________________________
From: Grunenberg, Renar
Sent: Wednesday, 4 July 2018 07:47
To: '[email protected]' <[email protected]>
Subject: Filesystem Operation error

Hello all,
here is a short story from yesterday, on version 5.0.1.1. We have a 3-node
cluster (2 nodes for I/O and the third as a quorum buster).
An admin mistakenly deleted the third node (a VM). We restored it from a VM
snapshot without problems. The only issue was that we completely lost
7 descOnly disks. We defined new ones and wanted to delete the old disks with
mmdeldisk. This worked on 6 file systems, but one now has a problem.
We finally deleted this disk with 'mmdeldisk fsname -p'. After a successful
mmdelnsd, the old disk still shows up in the following listing.

mmlsdisk tsmconf -L
disk                 driver   sector failure holds    holds                                     storage
name                 type       size   group metadata data  status        availability disk id pool    remarks
-------------------- -------- ------ ------- -------- ----- ------------- ------------ ------- ------- -------
nsd_tsmconf001_DSK20 nsd         512       0 Yes      Yes   ready         up                 1 system  desc
nsd_g4_tsmconf       nsd         512       2 No       No    removing refs down               2 system
nsd_tsmconf001_DSK70 nsd         512       1 Yes      Yes   ready         up                 3 system  desc
nsd_g4_tsmconf1      nsd         512       2 No       No    ready         up                 4 system  desc
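
For anyone who wants to spot a disk in such a state automatically, here is a
minimal Python sketch that scans the text output of 'mmlsdisk <fs> -L' and
reports every disk that is not 'ready' and 'up'. It is not an IBM tool. The
column order is taken from the listing above, the function names are made up
for this example, and the set of availability keywords is an assumption (only
'up' and 'down' appear in our output).

```python
# Assumed availability keywords; only "up" and "down" are seen in the listing.
AVAILABILITY = {"up", "down", "recovering", "unrecovered"}

# Data rows copied from the listing in this mail (whitespace collapsed).
SAMPLE = """\
nsd_tsmconf001_DSK20 nsd 512 0 Yes Yes ready up 1 system desc
nsd_g4_tsmconf nsd 512 2 No No removing refs down 2 system
nsd_tsmconf001_DSK70 nsd 512 1 Yes Yes ready up 3 system desc
nsd_g4_tsmconf1 nsd 512 2 No No ready up 4 system desc
"""


def parse_disk_rows(text):
    """Return one dict per disk row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # A data row has name, driver, sector size, failure group,
        # holds-metadata, holds-data, status, availability (at least 8 tokens)
        # and a numeric sector size in the third column.
        if len(tokens) < 8 or not tokens[2].isdigit():
            continue
        # The status can span several words ("removing refs"), so look for
        # the first availability keyword after the first status word.
        idx = next(
            (i for i in range(7, len(tokens)) if tokens[i] in AVAILABILITY),
            None,
        )
        if idx is None:
            continue
        rows.append(
            {
                "name": tokens[0],
                "status": " ".join(tokens[6:idx]),
                "availability": tokens[idx],
            }
        )
    return rows


def problem_disks(text):
    """Return (name, status, availability) for disks that are not ready/up."""
    return [
        (r["name"], r["status"], r["availability"])
        for r in parse_disk_rows(text)
        if r["status"] != "ready" or r["availability"] != "up"
    ]


if __name__ == "__main__":
    for name, status, availability in problem_disks(SAMPLE):
        print(f"{name}: status={status!r}, availability={availability!r}")
```

With the rows from our listing this reports only nsd_g4_tsmconf. Note that it
reflects what mmlsdisk reports, not whether the requesting host can actually
reach each disk.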

After that, all file system commands generate a file system operation error
like this:
Error=MMFS_SYSTEM_UNMOUNT, ID=0xC954F85D, Tag=3882673:   Unrecoverable file system operation error.  Status code 65536.   Volume tsmconf
Questions:
1. What does 'removing refs' mean? We now have no way to handle this disk. The
disk itself no longer exists, but a reference to it remains in the stripe
group.
nsd_g4_tsmconf: uid 0A885085:577BB637, status ReferencesBeingRemoved, availability Unavailable,
             created on node 10.136.80.133, Tue Jul  5 15:29:27 2016
             type 'nsd', sector size 512, failureConfigVersion 424
             quorum weight {0,0}, failure group: id 2, fg index 1
             locality group: id 2, lg index 1
             failureGroupStrP: (2), rackId 2, locationId 0, extLgId 0
             nSectors 528384 (0:81000) (258 MB), inode0Sector 131072
             alloc region: no of bits 0, seg num -1, offset 0, len 72
             suballocator 0x18015B8A7A4 type 0 nBits 32 subSize 0 dataOffset 4
               nRows 0 len/off:
             storage pool: 0
             holds nothing
             sectors past efficient device boundary: 0
             isFenced: 1
             start Region No: -1 end Region No:-1
             start AllocMap Record: -1
2. Is there any command to handle this?
3. Where can I find the meaning of status code 65536?

A PMR is also open.

Any Hints?

Regards Renar