Hello all,

we fixed our problem with the help of Spectrum Scale Support. The commands that fixed it were 'mmcommon recoverfs tsmconf' and 'tsdeldisk tsmconf -d "nsd_g4_tsmconf"'.

The root cause: to delete a disk from a file system, all disks of that file system must be reachable from the host that issues the request. In our configuration no NSD server definitions were in place, and the quorum-buster node had no access to the SAN-attached disks.

My recommendation: either this should be documented for highly available configurations with a three-site implementation, or the commands that update the NSD descriptors on each disk should first check whether all disks are reachable instead of causing an SG panic.
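Until such a check exists in the commands themselves, an admin-side pre-check can at least catch disks that the file system already reports as unhealthy. The following is only a minimal sketch, not something Support provided: it parses text in the 'mmlsdisk <fs> -L' layout shown in the original message and lists every disk whose status/availability is not ready/up. The function names and the SAMPLE text are hypothetical, and the column layout is assumed from that one output. It reads only the file system's own view, so it cannot prove that the requesting node can physically reach every disk.

```python
# Sample rows copied from the mmlsdisk output in the original message.
SAMPLE = """\
disk driver sector failure holds holds storage
name type size group metadata data status availability disk id pool remarks
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ ---------
nsd_tsmconf001_DSK20 nsd 512 0 Yes Yes ready up 1 system desc
nsd_g4_tsmconf nsd 512 2 No No removing refs down 2 system
nsd_tsmconf001_DSK70 nsd 512 1 Yes Yes ready up 3 system desc
nsd_g4_tsmconf1 nsd 512 2 No No ready up 4 system desc
"""


def parse_mmlsdisk(text):
    """Parse disk rows from mmlsdisk-style text into a list of dicts.

    Assumed layout (taken from the output in this thread): six fixed
    columns, a status that may contain spaces (e.g. "removing refs"),
    then availability, a numeric disk id, storage pool, optional remarks.
    """
    disks = []
    for line in text.splitlines():
        tokens = line.split()
        # Header and separator lines have no numeric sector size in column 3.
        if len(tokens) < 10 or not tokens[2].isdigit():
            continue
        # The disk id is the first purely numeric token after the status.
        id_pos = next(
            (i for i in range(8, len(tokens)) if tokens[i].isdigit()), None
        )
        if id_pos is None:
            continue
        disks.append({
            "name": tokens[0],
            "status": " ".join(tokens[6:id_pos - 1]),
            "availability": tokens[id_pos - 1],
            "disk_id": int(tokens[id_pos]),
        })
    return disks


def disks_not_ready(text):
    """Return the names of all disks that are not both 'ready' and 'up'."""
    return [
        d["name"]
        for d in parse_mmlsdisk(text)
        if (d["status"], d["availability"]) != ("ready", "up")
    ]
```

Calling disks_not_ready() on the captured output of 'mmlsdisk tsmconf -L' before a disk deletion would show which disks need attention first. An empty list only means the file system considers all disks healthy; reachability from the node running the command still has to be verified separately.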
Regards
Renar

Renar Grunenberg
Abteilung Informatik – Betrieb (IT Operations)
HUK-COBURG

________________________________
From: Grunenberg, Renar
Sent: Wednesday, 4 July 2018 07:47
To: '[email protected]' <[email protected]>
Subject: Filesystem Operation error

Hello all,

here is a short story from yesterday on version 5.0.1.1. We had a 3-node cluster (2 nodes for I/O and the third as a quorum buster). An admin mistakenly deleted the third node (a VM). We restored it from a VM snapshot without any problem. The only issue was that we completely lost 7 descOnly disks. We defined new ones and wanted to delete the old disks with mmdeldisk. On 6 file systems this worked without problems, but one file system now has a problem.
We finally deleted this disk with 'mmdeldisk fsname -p'. After a successful mmdelnsd we still see the old disk in the following output.

mmlsdisk tsmconf -L
disk                 driver   sector     failure holds    holds                                    storage
name                 type       size       group metadata data  status        availability disk id pool         remarks
-------------------- -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ ---------
nsd_tsmconf001_DSK20 nsd         512           0 Yes      Yes   ready         up                 1 system       desc
nsd_g4_tsmconf       nsd         512           2 No       No    removing refs down               2 system
nsd_tsmconf001_DSK70 nsd         512           1 Yes      Yes   ready         up                 3 system       desc
nsd_g4_tsmconf1      nsd         512           2 No       No    ready         up                 4 system       desc

After that, every file system command generates a file system operation error like this:

Error=MMFS_SYSTEM_UNMOUNT, ID=0xC954F85D, Tag=3882673: Unrecoverable file system operation error. Status code 65536. Volume tsmconf

Questions:

1. What does 'removing refs' mean? We currently have no way to handle this disk. The disk itself no longer exists, but a reference to it remains in the stripe group.

nsd_g4_tsmconf: uid 0A885085:577BB637, status ReferencesBeingRemoved, availability Unavailable,
 created on node 10.136.80.133, Tue Jul 5 15:29:27 2016
 type 'nsd', sector size 512, failureConfigVersion 424
 quorum weight {0,0}, failure group: id 2, fg index 1
 locality group: id 2, lg index 1
 failureGroupStrP: (2), rackId 2, locationId 0, extLgId 0
 nSectors 528384 (0:81000) (258 MB), inode0Sector 131072
 alloc region: no of bits 0, seg num -1, offset 0, len 72
 suballocator 0x18015B8A7A4 type 0 nBits 32 subSize 0 dataOffset 4 nRows 0 len/off:
 storage pool: 0
 holds nothing
 sectors past efficient device boundary: 0
 isFenced: 1
 start Region No: -1 end Region No:-1
 start AllocMap Record: -1

2. Are there any commands to handle this?

3. Where can I find the meaning of status code 65536?

A PMR is also open. Any hints?

Regards
Renar
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
