Re: [gpfsug-discuss] Analyse steps if disk are down after reboot
Just to follow up on the question of where to learn why an NSD is marked down: you should see a message in the GPFS log, /var/adm/ras/mmfs.log.*

Regards, The Spectrum Scale (GPFS) team

--
If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWorks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=----0479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract, please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: "Grunenberg, Renar"
To: 'gpfsug main discussion list'
Date: 07/12/2018 06:01 AM
Subject: Re: [gpfsug-discuss] Analyse steps if disk are down after reboot
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello Achim, hello Simon,

first, thanks for your answers. I think Achim's answer fits our case best: the NSD servers (only two) for these disks were mistakenly restarted in the same time window.

Renar Grunenberg
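The pointer to /var/adm/ras/mmfs.log.* can be turned into a quick search. The following is a minimal sketch, not an IBM tool: it only looks for lines that mention a given NSD name and makes no assumption about the wording of the GPFS messages. The function name `find_log_mentions` and the NSD name in the usage note are hypothetical.

```python
import glob
from typing import Iterator, Tuple


def find_log_mentions(
    nsd_name: str,
    pattern: str = "/var/adm/ras/mmfs.log.*",
) -> Iterator[Tuple[str, int, str]]:
    """Yield (file, line number, line) for every log line mentioning nsd_name.

    The match is a plain case-insensitive substring test, because the exact
    message format is not assumed here.
    """
    needle = nsd_name.lower()
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "r", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield path, lineno, line.rstrip("\n")
        except OSError:
            # Unreadable file (permissions, rotation in progress): skip it.
            continue
```

Run on each NSD server and on the file system manager node, something like `for hit in find_log_mentions("my_nsd_01"): print(*hit, sep=":")` lists the candidate lines to read around the time of the reboot.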
Re: [gpfsug-discuss] Analyse steps if disk are down after reboot
Hello Achim, hello Simon,

first, thanks for your answers. I think Achim's answer fits our case best: the NSD servers (only two) for these disks were mistakenly restarted in the same time window.

Renar Grunenberg
IT Department, Operations
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Phone: 09561 96-44110
Fax: 09561 96-44104
E-Mail: renar.grunenb...@huk-coburg.de
Internet: www.huk.de

HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Registration court: Coburg, HRB 100; tax no. 9212/101/00021
Registered office: Bahnhofsplatz, 96444 Coburg
Chairman of the Supervisory Board: Prof. Dr. Heinrich R. Schradin.
Executive Board: Klaus-Jürgen Heitmann (Spokesman), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Jörg Rheinländer (deputy), Sarah Rössler, Daniel Thomas.

This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.

From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On behalf of Achim Rehor
Sent: Thursday, 12 July 2018 11:47
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Analyse steps if disk are down after reboot

Hi Renar,

whenever an NSD is accessed, there is a chance that the node cannot reach the disk. If the (only) NSD server is down, there is no way to access the disk, and the disk will be set to down. If you have twin-tailed disks, the second (or possibly further) NSD server will be asked, switching to networked access, and only if that also fails will the disk be set to down.

I am not sure how your setup looks, but if you reboot two NSD servers and some client happened to do I/O to a file served by just these two, then the 'down' state would be explainable. Rebooting an NSD server should never set a disk to down, unless it was the only one serving that NSD.

Kind regards
Achim Rehor
Re: [gpfsug-discuss] Analyse steps if disk are down after reboot
Hi Renar,

whenever an NSD is accessed, there is a chance that the node cannot reach the disk. If the (only) NSD server is down, there is no way to access the disk, and the disk will be set to down. If you have twin-tailed disks, the second (or possibly further) NSD server will be asked, switching to networked access, and only if that also fails will the disk be set to down.

I am not sure how your setup looks, but if you reboot two NSD servers and some client happened to do I/O to a file served by just these two, then the 'down' state would be explainable. Rebooting an NSD server should never set a disk to down, unless it was the only one serving that NSD.

Kind regards
Achim Rehor

Software Technical Support Specialist AIX / EMEA HPC Support
IBM Certified Advanced Technical Expert - Power Systems with AIX
TSCC Software Service, Dept. 7922
Global Technology Services
Phone: +49-7034-274-7862
E-Mail: achim.re...@de.ibm.com
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Germany

IBM Deutschland GmbH / Chairman of the Supervisory Board: Martin Jetter
Management: Martin Hartmann (Chairman), Norbert Janzen, Stefan Lutz, Nicole Reimer, Dr. Klaus Seifert, Wolfgang Wendt
Registered office: Ehningen / Registration court: Amtsgericht Stuttgart, HRB 14562
WEEE-Reg.-Nr. DE 99369940

From: "Grunenberg, Renar"
To: "'gpfsug-discuss@spectrumscale.org'"
Date: 12/07/2018 10:17
Subject: [gpfsug-discuss] Analyse steps if disk are down after reboot
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hello all,

after a reboot of two NSD servers, we see that some disks in different file systems are down, and we don't see why. The logs (messages, dmesg, kern, ...) say nothing. We are on RHEL 7.4 and Spectrum Scale 5.0.1.1.

The question now: are there any logs or structures in the GPFS daemon that record this situation? What was the reason the daemon had no access to the disks during that startup phase?

Any hints are appreciated.

Renar Grunenberg

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
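Achim's description of when a disk ends up down can be sketched as a tiny decision function. This is a simplified model of the rule as stated in his message, not GPFS code; the function name and the server names `nsd01`/`nsd02` are made up for illustration.

```python
from typing import Iterable, Set


def disk_availability_after_io(
    nsd_servers: Iterable[str],
    reachable_servers: Set[str],
) -> str:
    """Model the outcome of one I/O attempt against an NSD.

    nsd_servers:       servers configured for the NSD, in the order they are tried.
    reachable_servers: servers that can currently serve the disk.
    The disk stays "up" as soon as one configured server answers; it is
    set to "down" only when every configured server fails.
    """
    for server in nsd_servers:
        if server in reachable_servers:
            return "up"
    return "down"


# One of two servers rebooting: the other one takes over.
print(disk_availability_after_io(["nsd01", "nsd02"], {"nsd02"}))
# Both servers rebooting in the same window while a client does I/O.
print(disk_availability_after_io(["nsd01", "nsd02"], set()))
```

The second call corresponds to the situation in this thread: with both servers of a twin-tailed disk unavailable at the moment of an I/O, no fallback is left.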
Re: [gpfsug-discuss] Analyse steps if disk are down after reboot
How are the disks attached? We have some IB/SRP storage that is sometimes a little slow to appear in multipath, and we have seen this in the past. (We have since set autoload=off and always check multipath before restarting GPFS on the node.)

Simon

From: on behalf of "renar.grunenb...@huk-coburg.de"
Reply-To: "gpfsug-discuss@spectrumscale.org"
Date: Thursday, 12 July 2018 at 09:17
To: "gpfsug-discuss@spectrumscale.org"
Subject: [gpfsug-discuss] Analyse steps if disk are down after reboot

Hello all,

after a reboot of two NSD servers, we see that some disks in different file systems are down, and we don't see why. The logs (messages, dmesg, kern, ...) say nothing. We are on RHEL 7.4 and Spectrum Scale 5.0.1.1.

The question now: are there any logs or structures in the GPFS daemon that record this situation? What was the reason the daemon had no access to the disks during that startup phase?

Any hints are appreciated.

Renar Grunenberg
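Simon's habit of checking multipath before restarting GPFS could be scripted roughly as follows. This is only a sketch under assumptions: it treats "the device is there" as "the device path exists", and the paths in the usage note are hypothetical placeholders for whatever the multipath devices are called on a given server.

```python
import os
import time
from typing import Iterable, List


def missing_devices(expected_paths: Iterable[str]) -> List[str]:
    """Return the expected device paths that do not exist yet."""
    return [path for path in expected_paths if not os.path.exists(path)]


def wait_for_devices(
    expected_paths: Iterable[str],
    timeout_s: float = 60.0,
    poll_s: float = 2.0,
) -> List[str]:
    """Poll until all paths exist or the timeout expires.

    Returns the list of paths still missing (empty means all present).
    """
    expected = list(expected_paths)
    deadline = time.monotonic() + timeout_s
    while True:
        missing = missing_devices(expected)
        if not missing or time.monotonic() >= deadline:
            return missing
        time.sleep(poll_s)
```

With automatic startup disabled, an operator or boot script would call something like `wait_for_devices(["/dev/mapper/mpatha", "/dev/mapper/mpathb"])` and start GPFS on the node only if the returned list is empty.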
[gpfsug-discuss] Analyse steps if disk are down after reboot
Hello all,

after a reboot of two NSD servers, we see that some disks in different file systems are down, and we don't see why. The logs (messages, dmesg, kern, ...) say nothing. We are on RHEL 7.4 and Spectrum Scale 5.0.1.1.

The question now: are there any logs or structures in the GPFS daemon that record this situation? What was the reason the daemon had no access to the disks during that startup phase?

Any hints are appreciated.

Renar Grunenberg
IT Department, Operations
HUK-COBURG