Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Thanks! I’m looking forward to upgrading our CES nodes and resuming work on the project.

~jonathon

On 3/23/17, 8:24 AM, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Olaf Weiser" <olaf.wei...@de.ibm.com> wrote:

the issue is fixed, an APAR will be released soon - IV93100
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
the issue is fixed, an APAR will be released soon - IV93100

From: Olaf Weiser/Germany/IBM@IBMDE
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: 01/31/2017 11:47 PM

Yeah... depending on the number of nodes, you're affected or not. So if your remote CES cluster is small enough in terms of the number of nodes, you'll never hit this issue.

Sent from IBM Verse

From: "Simon Thompson (Research Computing - IT Services)" <s.j.thomp...@bham.ac.uk>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: Tue. 31.01.2017 21:07

We use multicluster for our environment: the storage systems are in a separate cluster from the HPC nodes, which are in a separate cluster from the protocol nodes.

According to the docs, this isn't supported, but we haven't seen any issues. Note unsupported as opposed to broken.

Simon

From: gpfsug-discuss-boun...@spectrumscale.org on behalf of Jonathon A Anderson [jonathon.ander...@colorado.edu]
To: gpfsug main discussion list
Sent: 31 January 2017 17:47

Yeah, I searched around for places where `tsctl shownodes up` appears in the GPFS code I have access to (i.e., the ksh and python stuff); but it’s only in CES. I suspect there just haven’t been that many people exporting CES out of an HPC cluster environment.

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Tuesday, January 31, 2017 at 10:45 AM

I'll open a PMR here for my env... the issue may hurt you in a CES env. only... but needs to be fixed in core gpfs.base, I think.

Sent from IBM Verse

From: "Jonathon A Anderson" <jonathon.ander...@colorado.edu>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: Tue. 31.01.2017 17:32

No, I’m having trouble getting this through DDN support because, while we have a GPFS server license and GRIDScaler support, apparently we don’t have “protocol node” support, so they’ve pushed back on supporting this as an overall CES-rooted effort.

I do have a DDN case open, though: 78804. If you are (as I suspect) a GPFS developer, do you mind if I cite your info from here in my DDN case to get them to open a PMR?

Thanks.

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Tuesday, January 31, 2017 at 8:42 AM

OK, so it seems that we have several issues. The 3983 characters is obviously a defect. Have you already raised a PMR? If so, can you send me the number?

From: Jonathon A Anderson <jonathon.ander...@colorado.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 01/31/2017 04:14 PM

The tail isn’t the issue; that’s my addition, so that I didn’t have to paste the hundred or so line nodelist into the thread. The actual command is

tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile

But you can see in my tailed output that the last hostname listed is cut off halfway through the hostname. Less obvious in the example, but true, is the fact that it’s only showing the first 120 hosts, when we have 403 nodes in our gpfs cluster.

[root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | wc -l
120
[root@sgate2 ~]# mmlscluster | grep '\-opa' | wc -l
403

Perhaps more explicitly, it looks like `tsctl shownodes up` can only transmit 3983 characters.

[root@sgate2 ~]# tsctl shownodes up | wc -c
3983

Again, I’m convinced this is a bug not only because the command doesn’t actually produce a list of all of the up nodes in our cluster, but because the last name listed is incomplete.

[root@sgate2 ~]# tsctl shownodes up |
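For anyone who wants to check whether they are hitting the same truncation, the two symptoms described above (fewer hosts listed than expected, and a final hostname that is cut off) can be tested for mechanically. The following is only a minimal sketch in Python: it does not call `tsctl` or `mmlscluster` itself, it just inspects text that is assumed to be a comma-separated host list like the one piped through `tr ',' '\n'` above. The function name `check_node_list`, the `example.com` hostnames, and the 100-character cut-off are hypothetical and chosen purely for illustration.

```python
def check_node_list(raw_output, known_hosts):
    """Inspect a comma-separated host list for signs of truncation.

    raw_output  -- text assumed to look like the output of `tsctl shownodes up`
    known_hosts -- set of hostnames expected in the cluster (for example,
                   collected separately from `mmlscluster`); an assumption here
    """
    names = [n.strip() for n in raw_output.split(",") if n.strip()]
    last = names[-1] if names else ""
    return {
        "chars": len(raw_output),
        "listed": len(names),
        "expected": len(known_hosts),
        # A final entry that is not a known hostname suggests the output
        # was cut off in the middle of a name.
        "last_entry_incomplete": bool(names) and last not in known_hosts,
        # Hosts that are simply down will also show up here, so a non-empty
        # list is a hint rather than proof of truncation.
        "missing": sorted(known_hosts - set(names)),
    }


# Hypothetical data: ten made-up hosts, with the list cut at 100 characters
# to imitate output that stops partway through a hostname.
hosts = [f"node{i:03d}-opa.example.com" for i in range(1, 11)]
full = ",".join(hosts)
truncated = full[:100]

report = check_node_list(truncated, set(hosts))
print(report["listed"], "of", report["expected"], "hosts listed;",
      "last entry incomplete:", report["last_entry_incomplete"])
```

The incomplete last entry is the stronger signal of the two, since a short list on its own could just mean that some nodes are down.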
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
I was thinking that whether CES knows your nodes are up depends on how recently they were added to the cluster; but I’m starting to wonder if it depends on the order in which nodes are brought up.

Presumably you are running your CES nodes in a GPFS cluster with a large number of nodes? What happens if you bring your CES nodes up earlier (e.g., before your compute nodes)?

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of "mark.b...@siriuscom.com" <mark.b...@siriuscom.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Thursday, February 9, 2017 at 7:40 AM

Has any headway been made on this issue? I just ran into it as well. The CES IP addresses just disappeared from my two protocol nodes (4.2.2.0).
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
An APAR number / fix was created at the end of last week. So for your environment, simply open a PMR, so that you'll get the fix for your installed level immediately once it is released.

From: "mark.b...@siriuscom.com" <mark.b...@siriuscom.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 02/09/2017 03:40 PM

Has any headway been made on this issue? I just ran into it as well. The CES IP addresses just disappeared from my two protocol nodes (4.2.2.0).
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Has any headway been made on this issue? I just ran into it as well. The CES IP addresses just disappeared from my two protocol nodes (4.2.2.0).

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Thursday, February 2, 2017 at 12:02 PM

pls contact me directly olaf.wei...@de.ibm.com
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Please contact me directly: olaf.wei...@de.ibm.com

Kind regards

Olaf Weiser
EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
---
IBM Deutschland
IBM Allee 1
71139 Ehningen
Phone: +49-170-579-44-66
E-Mail: olaf.wei...@de.ibm.com
---
IBM Deutschland GmbH / Chairman of the Supervisory Board: Martin Jetter
Management: Martina Koederitz (Chair), Susanne Peter, Norbert Janzen, Dr. Christian Keller, Ivo Koerner, Markus Koerner
Registered office: Ehningen / Court of registration: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940

From: Jonathon A Anderson <jonathon.ander...@colorado.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 02/02/2017 06:45 PM

Any chance I can get that PMR# also, so I can reference it in my DDN case?

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Wednesday, February 1, 2017 at 2:28 AM

PMR opened... send the # directly to you

Sent from IBM Verse
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
>I'll open a PMR here for my env... the issue may hurt you in a CES env.
>only... but needs to be fixed in core gpfs.base, I think

Thanks for opening the PMR.

The problem is inside the GPFS base code and we are working on a fix right now. Until the fix is available, we will use the PMR to propose and discuss potential workarounds.

Kind regards

Mathias Dietz
Spectrum Scale - Release Lead Architect (4.2.X Release)
System Health and Problem Determination Architect
IBM Certified Software Engineer
--
IBM Deutschland
Hechtsheimer Str. 2
55131 Mainz
Phone: +49-6131-84-2027
Mobile: +49-15152801035
E-Mail: mdi...@de.ibm.com
--
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Koederitz, Management: Dirk Wittkopp
Registered office: Böblingen / Court of registration: Amtsgericht Stuttgart, HRB 243294
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Simon,

This is what I’d usually do, and I’m pretty sure it’d fix the problem; but we only have two protocol nodes, so no good way to do quorum in a separate cluster of just those two. Plus, I’d just like to see the bug fixed.

I suppose we could move the compute nodes to a separate cluster, and keep the protocol nodes together with the NSD servers; but then I’m back to the age-old question of “do I technically violate the GPFS license in order to do the right thing architecturally?” (Since you have to nominate GPFS servers in the client-only cluster to manage quorum, for nodes that only have client licenses.) So far, we’re 100% legit, and it’d be better to stay that way.

~jonathon

On 1/31/17, 1:07 PM, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Simon Thompson (Research Computing - IT Services)" <s.j.thomp...@bham.ac.uk> wrote:

We use multicluster for our environment, storage systems in a separate cluster to hpc nodes on a separate cluster from protocol nodes. According to the docs, this isn't supported, but we haven't seen any issues. Note unsupported as opposed to broken.

Simon
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
We use multicluster for our environment: the storage systems are in a separate cluster from the HPC nodes, which are in turn in a separate cluster from the protocol nodes. According to the docs, this isn't supported, but we haven't seen any issues. Note: unsupported as opposed to broken.

Simon

From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Jonathon A Anderson [jonathon.ander...@colorado.edu]
Sent: 31 January 2017 17:47
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

Yeah, I searched around for places where `tsctl shownodes up` appears in the GPFS code I have access to (i.e., the ksh and python stuff); but it’s only in CES. I suspect there just haven’t been that many people exporting CES out of an HPC cluster environment.

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Tuesday, January 31, 2017 at 10:45 AM
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

I'll open a PMR here for my env. The issue may hurt you in a CES env only, but it needs to be fixed in core gpfs.base, I think.

Sent from IBM Verse

Jonathon A Anderson --- Re: [gpfsug-discuss] CES doesn't assign addresses to nodes ---

From: "Jonathon A Anderson" <jonathon.ander...@colorado.edu>
To: "gpfsug main discussion list" <gpfsug-discuss@spectrumscale.org>
Date: Tue. 31.01.2017 17:32
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

No, I’m having trouble getting this through DDN support because, while we have a GPFS server license and GRIDScaler support, apparently we don’t have “protocol node” support, so they’ve pushed back on supporting this as an overall CES-rooted effort. I do have a DDN case open, though: 78804.
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
No, I’m having trouble getting this through DDN support because, while we have a GPFS server license and GRIDScaler support, apparently we don’t have “protocol node” support, so they’ve pushed back on supporting this as an overall CES-rooted effort.

I do have a DDN case open, though: 78804.

If you are (as I suspect) a GPFS developer, do you mind if I cite your info from here in my DDN case to get them to open a PMR?

Thanks.

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Tuesday, January 31, 2017 at 8:42 AM
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

OK, so it seems that we have several issues. The 3983 characters is obviously a defect. Have you already raised a PMR? If so, can you send me the number?

From: Jonathon A Anderson <jonathon.ander...@colorado.edu>
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: 01/31/2017 04:14 PM
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

The tail isn’t the issue; that’s my addition, so that I didn’t have to paste the hundred-or-so-line nodelist into the thread. The actual command is

    tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile

But you can see in my tailed output that the last hostname listed is cut off halfway through the hostname. Less obvious in the example, but true, is the fact that it’s only showing the first 120 hosts, when we have 403 nodes in our GPFS cluster.

    [root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | wc -l
    120
    [root@sgate2 ~]# mmlscluster | grep '\-opa' | wc -l
    403

Perhaps more explicitly, it looks like `tsctl shownodes up` can only transmit 3983 characters.

    [root@sgate2 ~]# tsctl shownodes up | wc -c
    3983

Again, I’m convinced this is a bug not only because the command doesn’t actually produce a list of all of the up nodes in our cluster, but because the last name listed is incomplete.

    [root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | tail -n 1
    shas0260-opa.rc.int.col[root@sgate2 ~]#

I’d continue my investigation within tsctl itself but, alas, it’s a binary with no source code available to me. :)

I’m trying to get this opened as a bug / PMR; but I’m still working through the DDN support infrastructure. Thanks for reporting it, though.

For the record:

    [root@sgate2 ~]# rpm -qa | grep -i gpfs
    gpfs.base-4.2.1-2.x86_64
    gpfs.msg.en_US-4.2.1-2.noarch
    gpfs.gplbin-3.10.0-327.el7.x86_64-4.2.1-0.x86_64
    gpfs.gskit-8.0.50-57.x86_64
    gpfs.gpl-4.2.1-2.noarch
    nfs-ganesha-gpfs-2.3.2-0.ibm24.el7.x86_64
    gpfs.ext-4.2.1-2.x86_64
    gpfs.gplbin-3.10.0-327.36.3.el7.x86_64-4.2.1-2.x86_64
    gpfs.docs-4.2.1-2.noarch

~jonathon

From: <gpfsug-discuss-boun...@spectrumscale.org> on behalf of Olaf Weiser <olaf.wei...@de.ibm.com>
Reply-To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Date: Tuesday, January 31, 2017 at 1:30 AM
To: gpfsug main discussion list <gpfsug-discuss@spectrumscale.org>
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes

Hi ... same thing here: everything after 10 nodes will be truncated, though I don't have an issue with it. I'll open a PMR, and I recommend you do the same thing. ;-)

The reason seems simple: it is the "| tail" at the end of the command, which truncates the output to the last 10 items. It should be easy to fix.
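The truncation symptom reported in this message (a node list that stops at 3983 bytes, with the final hostname cut short) can be checked mechanically. What follows is only a minimal sketch and not part of GPFS: the function name `check_nodelist` and the domain-suffix heuristic are assumptions made for illustration. It reads a comma-separated node list on stdin and flags it when the entry count falls short of an expected total or when the last entry does not end in the expected suffix.

```shell
#!/bin/sh
# Hypothetical helper (not part of GPFS): sanity-check a comma-separated
# node list such as the output of `tsctl shownodes up`.
# Usage: check_nodelist EXPECTED_COUNT EXPECTED_SUFFIX < nodelist
check_nodelist() {
    expected_count=$1
    expected_suffix=$2
    status=0

    list=$(cat)    # whole list from stdin
    # One name per line; count the non-empty ones and keep the last.
    count=$(printf '%s\n' "$list" | tr ',' '\n' | grep -c .)
    last=$(printf '%s\n' "$list" | tr ',' '\n' | grep . | tail -n 1)

    if [ "$count" -lt "$expected_count" ]; then
        echo "short list: $count of $expected_count nodes"
        status=1
    fi
    case $last in
        *"$expected_suffix") ;;    # last name looks complete
        *) echo "last entry looks truncated: $last"; status=1 ;;
    esac
    return $status
}
```

With the numbers from this thread, an invocation might look like `tsctl shownodes up | check_nodelist 403 .rc.int.colorado.edu`. A short count alone can simply mean that some nodes are down, so the incomplete last hostname is the stronger signal.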
Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Hi ... same thing here: everything after 10 nodes will be truncated, though I don't have an issue with it. I'll open a PMR, and I recommend you do the same thing. ;-)

The reason seems simple: it is the "| tail" at the end of the command, which truncates the output to the last 10 items. It should be easy to fix.

cheers
olaf

From: Jonathon A Anderson
To: "gpfsug-discuss@spectrumscale.org"
Date: 01/30/2017 11:11 PM
Subject: Re: [gpfsug-discuss] CES doesn't assign addresses to nodes
Sent by: gpfsug-discuss-boun...@spectrumscale.org

In trying to figure this out on my own, I’m relatively certain I’ve found a bug in GPFS related to the truncation of output from `tsctl shownodes up`. Any chance someone in development can confirm?

Here are the details of my investigation:

## GPFS is up on sgate2
[root@sgate2 ~]# mmgetstate

 Node number  Node name   GPFS state
--
     414      sgate2-opa  active

## but if I tell ces to explicitly put one of our ces addresses on that node, it says that GPFS is down
[root@sgate2 ~]# mmces address move --ces-ip 10.225.71.102 --ces-node sgate2-opa
mmces address move: GPFS is down on this node.
mmces address move: Command failed. Examine previous error messages to determine cause.

## the “GPFS is down on this node” message is defined as code 109 in mmglobfuncs
[root@sgate2 ~]# grep --before-context=1 "GPFS is down on this node." /usr/lpp/mmfs/bin/mmglobfuncs
    109 ) msgTxt=\
"%s: GPFS is down on this node."

## and is generated by printErrorMsg in mmcesnetmvaddress when it detects that the current node is identified as “down” by getDownCesNodeList
[root@sgate2 ~]# grep --before-context=5 'printErrorMsg 109' /usr/lpp/mmfs/bin/mmcesnetmvaddress
  downNodeList=$(getDownCesNodeList)
  for downNode in $downNodeList
  do
    if [[ $toNodeName == $downNode ]]
    then
      printErrorMsg 109 "$mmcmd"

## getDownCesNodeList returns the ces nodes that are not among the GPFS cluster nodes listed in `tsctl shownodes up`
[root@sgate2 ~]# grep --after-context=16 '^function getDownCesNodeList' /usr/lpp/mmfs/bin/mmcesfuncs
function getDownCesNodeList
{
  typeset sourceFile="mmcesfuncs.sh"
  [[ -n $DEBUG || -n $DEBUGgetDownCesNodeList ]] && set -x
  $mmTRACE_ENTER "$*"

  typeset upnodefile=${cmdTmpDir}upnodefile
  typeset downNodeList

  # get all CES nodes
  $sort -o $nodefile $mmfsCesNodes.dae

  $tsctl shownodes up | $tr ',' '\n' | $sort -o $upnodefile

  downNodeList=$($comm -23 $nodefile $upnodefile)
  print -- $downNodeList
}  #- end of function getDownCesNodeList

## but not only are the sgate nodes not listed by `tsctl shownodes up`; its output is obviously and erroneously truncated
[root@sgate2 ~]# tsctl shownodes up | tr ',' '\n' | tail
shas0251-opa.rc.int.colorado.edu
shas0252-opa.rc.int.colorado.edu
shas0253-opa.rc.int.colorado.edu
shas0254-opa.rc.int.colorado.edu
shas0255-opa.rc.int.colorado.edu
shas0256-opa.rc.int.colorado.edu
shas0257-opa.rc.int.colorado.edu
shas0258-opa.rc.int.colorado.edu
shas0259-opa.rc.int.colorado.edu
shas0260-opa.rc.int.col[root@sgate2 ~]#

## I expect that this is a bug in GPFS, likely related to a maximum output buffer for `tsctl shownodes up`.

On 1/24/17, 12:48 PM, "Jonathon A Anderson" wrote:

I think I'm having the same issue described here:

http://www.spectrumscale.org/pipermail/gpfsug-discuss/2016-October/002288.html

Any advice or further troubleshooting steps would be much appreciated. Full disclosure: I also have a DDN case open.
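To see why a truncated `tsctl shownodes up` list makes `getDownCesNodeList` report a healthy node as down, the `comm -23` step can be reproduced in isolation. This is a sketch only: the function name `down_ces_nodes`, the temporary file handling, and the sample node lists are made up for illustration, and only the `sort` / `comm -23` logic mirrors the function quoted in this message.

```shell
#!/bin/sh
# Hypothetical stand-in for the comm -23 step of getDownCesNodeList:
# print the CES nodes (arg 1, one per line) that are absent from the
# comma-separated "up" list (arg 2).
down_ces_nodes() {
    ces_nodes=$1
    up_list=$2
    tmpdir=$(mktemp -d) || return 1
    nodefile=$tmpdir/nodefile
    upnodefile=$tmpdir/upnodefile

    # Both inputs must be sorted for comm to compare them correctly.
    printf '%s\n' "$ces_nodes" | sort -o "$nodefile"
    printf '%s\n' "$up_list" | tr ',' '\n' | sort -o "$upnodefile"

    # comm -23 keeps lines found only in the first file,
    # i.e. CES nodes that were not reported as up.
    comm -23 "$nodefile" "$upnodefile"
    rm -rf "$tmpdir"
}

ces=$(printf 'sgate1-opa\nsgate2-opa')

# Complete list: every CES node is present, so nothing is printed.
down_ces_nodes "$ces" 'sgate1-opa,sgate2-opa,shas0001-opa'

# Truncated list: the sgate entries never arrived, so both are printed as down.
down_ces_nodes "$ces" 'shas0001-opa,shas0002-o'
```

Since the result is a set difference, any CES node whose name falls past the point where the up-list is cut off is classified as down, even though GPFS is active on it.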
[gpfsug-discuss] CES doesn't assign addresses to nodes
I think I'm having the same issue described here:

http://www.spectrumscale.org/pipermail/gpfsug-discuss/2016-October/002288.html

Any advice or further troubleshooting steps would be much appreciated. Full disclosure: I also have a DDN case open. (78804)

We've got a four-node (snsd{1..4}) DDN gridscaler system. I'm trying to add two CES protocol nodes (sgate{1,2}) to serve NFS.

Here's the steps I took:

---
mmcrnodeclass protocol -N sgate1-opa,sgate2-opa
mmcrnodeclass nfs -N sgate1-opa,sgate2-opa
mmchconfig cesSharedRoot=/gpfs/summit/ces
mmchcluster --ccr-enable
mmchnode --ces-enable -N protocol
mmces service enable NFS
mmces service start NFS -N nfs
mmces address add --ces-ip 10.225.71.104,10.225.71.105
mmces address policy even-coverage
mmces address move --rebalance
---

This worked the very first time I ran it, but the CES addresses weren't re-distributed after restarting GPFS or a node reboot.

Things I've tried:

* disabling ces on the sgate nodes and re-running the above procedure
* moving the cluster and filesystem managers to different snsd nodes
* deleting and re-creating the cesSharedRoot directory

Meanwhile, the following log entry appears in mmfs.log.latest every ~30s:

---
Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found unassigned address 10.225.71.104
Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Found unassigned address 10.225.71.105
Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: handleNetworkProblem with lock held: assignIP 10.225.71.104_0-_+,10.225.71.105_0-_+ 1
Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: Assigning addresses: 10.225.71.104_0-_+,10.225.71.105_0-_+
Mon Jan 23 20:31:20 MST 2017: mmcesnetworkmonitor: moveCesIPs: 10.225.71.104_0-_+,10.225.71.105_0-_+
---

Also notable, whenever I add or remove addresses now, I see this in mmsysmonitor.log (among a lot of other entries):

---
2017-01-23T20:40:56.363 sgate1 D ET_cesnetwork Entity state without requireUnique: ces_network_ips_down WARNING No CES relevant NICs detected - Service.calculateAndUpdateState:275
2017-01-23T20:40:11.364 sgate1 D ET_cesnetwork Update multiple entities at once {'p2p2': 1, 'bond0': 1, 'p2p1': 1} - Service.setLocalState:333
---

For the record, here's the interface I expect to get the address on sgate1:

---
11: bond0: mtu 9000 qdisc noqueue state UP
    link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff
    inet 10.225.71.107/20 brd 10.225.79.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::3efd:feff:fe08:a7c0/64 scope link
       valid_lft forever preferred_lft forever
---

which is a bond of p2p1 and p2p2.

---
6: p2p1: mtu 9000 qdisc mq master bond0 state UP qlen 1000
    link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff
7: p2p2: mtu 9000 qdisc mq master bond0 state UP qlen 1000
    link/ether 3c:fd:fe:08:a7:c0 brd ff:ff:ff:ff:ff:ff
---

A similar bond0 exists on sgate2.

I crawled around in /usr/lpp/mmfs/lib/mmsysmon/CESNetworkService.py for a while trying to figure it out, but have been unsuccessful so far.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
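For anyone tracking the same symptom, the repeating "Found unassigned address" entries can be tallied per address to confirm which CES IPs never get placed. This is a rough sketch under stated assumptions: the function name `count_unassigned` is hypothetical, and it relies only on the log line format shown in the excerpt above, where the address is the last field.

```shell
#!/bin/sh
# Hypothetical helper: count "Found unassigned address" messages per address.
# Reads log text on stdin and prints "ADDRESS COUNT", one address per line.
count_unassigned() {
    grep 'mmcesnetworkmonitor: Found unassigned address' |
        awk '{ print $NF }' |        # the address is the last field
        sort | uniq -c |             # occurrences per address
        awk '{ print $2, $1 }'       # reorder to "address count"
}
```

It could be fed the log named in this message, for example `count_unassigned < mmfs.log.latest` run from wherever that file lives on the protocol node. Counts that keep growing for the same addresses would match the roughly 30-second retry loop described above.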