Re: [gpfsug-discuss] tsgskkm stuck
On 8/28/20 11:43 AM, Philipp Helo Rehs wrote:
> root 38212 100 0.0 35544 5752 ? R 11:32 9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off

Judging from the command line tsgskkm will generate a certificate which normally involves a random number generator. If such a process hangs it might be due to a lack of entropy. So I suggest trying to generate some I/O on the node. Or run something like haveged (https://wiki.archlinux.org/index.php/Haveged).

Uli
--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
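A quick way to look into the entropy theory from the reply above is to read the kernel's own estimate. The sketch below is only an illustration: it assumes a Linux node that exposes /proc/sys/kernel/random/entropy_avail, and the threshold of 200 bits is a made-up value for the example, not something stated in this thread. How the number should be read also differs between kernel versions, so treat it as a rough indicator only.

```python
from pathlib import Path

# Linux kernel interface reporting the estimated entropy (in bits) of the pool.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")

# Hypothetical threshold, chosen for illustration only (not from the thread).
LOW_ENTROPY_BITS = 200


def parse_entropy(text: str) -> int:
    """Parse the content of entropy_avail into an integer number of bits."""
    return int(text.strip())


def is_entropy_low(bits: int, threshold: int = LOW_ENTROPY_BITS) -> bool:
    """Return True when the available entropy is below the threshold."""
    return bits < threshold


def read_entropy(path: Path = ENTROPY_PATH):
    """Return the available entropy in bits, or None if the file is unreadable."""
    try:
        return parse_entropy(path.read_text())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    bits = read_entropy()
    if bits is None:
        print("entropy_avail is not readable on this system")
    else:
        state = "low" if is_entropy_low(bits) else "ok"
        print(f"entropy_avail = {bits} bits ({state})")
```

Running this on the node while the mmaddnode call hangs shows whether the pool looks starved at that moment.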
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
Of course, you might also be interested in our upcoming webinar on 22nd September (which I haven't advertised yet):

https://www.spectrumscaleug.org/event/ssugdigital-deep-dive-in-spectrum-scale-core/

... This presentation will discuss selected improvements in Spectrum V5, focusing on improvements for inode management, VCPU scaling and considerations for NUMA.

Simon

On 04/09/2020, 08:56, "gpfsug-discuss-boun...@spectrumscale.org on behalf of Jonathan Buzzard" wrote:

> On 02/09/2020 23:28, Andrew Beattie wrote:
>> Giovanni, I have clients in Australia that are running AMD ROME
>> processors in their Visualisation nodes connected to scale 5.0.4
>> clusters with no issues. Spectrum Scale doesn't differentiate between
>> x86 processor technologies -- it only looks at x86_64 (OS support
>> more than anything else)
>
> While true, bear in mind there are limits on the number of cores that might be quite easy to exceed on a high-end multi-CPU AMD machine :-)
>
> See question 5.3
>
> https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.pdf
>
> 192 is the largest tested limit for the number of cores and there is a hard limit at 1536 cores. From memory these limits are lower in older versions of GPFS. I think the "tested" limit in 4.2 is 64 cores (or was at the time of release), but it works just fine on 80 cores as far as I can tell.
>
> JAB.
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
On 02/09/2020 23:28, Andrew Beattie wrote:
> Giovanni, I have clients in Australia that are running AMD ROME processors in their Visualisation nodes connected to scale 5.0.4 clusters with no issues. Spectrum Scale doesn't differentiate between x86 processor technologies -- it only looks at x86_64 (OS support more than anything else)

While true, bear in mind there are limits on the number of cores that might be quite easy to exceed on a high-end multi-CPU AMD machine :-)

See question 5.3

https://www.ibm.com/support/knowledgecenter/STXKQY/gpfsclustersfaq.pdf

192 is the largest tested limit for the number of cores and there is a hard limit at 1536 cores. From memory these limits are lower in older versions of GPFS. I think the "tested" limit in 4.2 is 64 cores (or was at the time of release), but it works just fine on 80 cores as far as I can tell.

JAB.

--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
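To see where a given node stands against the figures quoted above, something like the following sketch could be used. The two limits are taken from the message (192 tested, 1536 hard). Two things are assumptions of the example: that the logical CPU count reported by the operating system is a fair stand-in for "cores" as the FAQ counts them (with SMT enabled it may be double the physical core count), and that the hard limit is inclusive.

```python
import os

# Figures quoted in the message above (FAQ question 5.3 as cited there).
TESTED_CORE_LIMIT = 192
HARD_CORE_LIMIT = 1536


def classify_core_count(cores: int) -> str:
    """Classify a core count against the tested and hard limits."""
    if cores > HARD_CORE_LIMIT:
        return "above hard limit"
    if cores > TESTED_CORE_LIMIT:
        return "above tested limit"
    return "within tested limit"


if __name__ == "__main__":
    # os.cpu_count() returns logical CPUs visible to the OS, or None.
    cores = os.cpu_count() or 0
    print(f"{cores} logical CPUs: {classify_core_count(cores)}")
```

A dual-socket machine with 64-core CPUs and SMT enabled would report 256 logical CPUs here, which is the kind of case the message warns about.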
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
I don't currently have any x86-based servers to do that kind of performance testing, but the PCIe Gen4 advantages alone mean that the AMD server options have significant benefits over current Intel processor platforms. There are, however, only a limited number of storage controllers and network adapters that can make full use of PCIe Gen4.

In terms of NSD architecture there are many variables that you also have to take into consideration.

Are you looking at storage-rich servers?
Are you looking at SAN-attached flash?
Are you looking at a Scale ECE type deployment?

Speaking as an IBM employee and someone familiar with the ESS 5000 and the differences and benefits of the 5K architecture: unless you're planning on building a Scale ECE type cluster with AMD processors, storage class memory and NVMe flash modules, I would seriously consider the ESS 5K over an x86-based NL-SAS storage topology, including AMD.

> On 3 Sep 2020, at 17:44, Giovanni Bracco wrote:
>
> OK from the client side, but I would like to know whether the same also holds for NSD servers with AMD EPYC: do they operate with good performance compared to Intel CPUs?
>
> Giovanni
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
OK from the client side, but I would like to know whether the same also holds for NSD servers with AMD EPYC: do they operate with good performance compared to Intel CPUs?

Giovanni

On 03/09/20 00:28, Andrew Beattie wrote:
> Giovanni,
> I have clients in Australia that are running AMD ROME processors in their Visualisation nodes connected to scale 5.0.4 clusters with no issues.
> Spectrum Scale doesn't differentiate between x86 processor technologies -- it only looks at x86_64 (OS support more than anything else)
>
> Andrew Beattie
> File and Object Storage Technical Specialist - A/NZ
> IBM Systems - Storage
> Phone: 614-2133-7927
> E-mail: abeat...@au1.ibm.com

--
Giovanni Bracco
phone +39 351 8804788
E-mail giovanni.bra...@enea.it
WWW http://www.afs.enea.it/bracco
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
Giovanni,

I have clients in Australia that are running AMD ROME processors in their Visualisation nodes connected to scale 5.0.4 clusters with no issues. Spectrum Scale doesn't differentiate between x86 processor technologies -- it only looks at x86_64 (OS support more than anything else)

Andrew Beattie
File and Object Storage Technical Specialist - A/NZ
IBM Systems - Storage
Phone: 614-2133-7927
E-mail: abeat...@au1.ibm.com

- Original message -
From: Giovanni Bracco
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list, Frederick Stock
Cc:
Subject: [EXTERNAL] Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
Date: Thu, Sep 3, 2020 7:29 AM

> I am curious to know about AMD epyc support by GPFS: what is the status?
>
> Giovanni Bracco
Re: [gpfsug-discuss] tsgskkm stuck---> what about AMD epyc support in GPFS?
I am curious to know about AMD epyc support by GPFS: what is the status?

Giovanni Bracco

On 28/08/20 14:25, Frederick Stock wrote:
> Not sure that Spectrum Scale has stated it supports the AMD epyc (Rome?) processors. You may want to open a help case to determine the cause of this problem.
> Note that Spectrum Scale 4.2.x goes out of service on September 30, 2020 so you may want to consider upgrading your cluster. And should Scale officially support the AMD epyc processor it would not be on Scale 4.2.x.
>
> Fred
> __
> Fred Stock | IBM Pittsburgh Lab | 720-430-8821
> sto...@us.ibm.com

--
Giovanni Bracco
phone +39 351 8804788
E-mail giovanni.bra...@enea.it
WWW http://www.afs.enea.it/bracco
Re: [gpfsug-discuss] tsgskkm stuck
Hello Philipp,

it seems your nodes cannot communicate cleanly. Can you check that gpfs.gskit is at the same level on the nodes? If not, please update to the same level.

I've seen similar behavior when reverse lookup of host names or wrong entries in /etc/hosts break the setup.

If DNS and gskit are correct, please open a PMR.

- Original message -
From: Philipp Helo Rehs
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] tsgskkm stuck
Date: Fri, Aug 28, 2020 11:52 AM

> Hello,
>
> we have a GPFS v4 cluster running with 4 NSDs and I am trying to add some clients:
>
> mmaddnode -N hpc-storage-1-ib:client:hpc-storage-1
>
> This command hangs and does not finish.
>
> When I look into the server, I can see the following processes, which never finish:
>
> root 38138 0.0 0.0 123048 10376 ? Ss 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote checkNewClusterNode3 lc/setupClient %%%%:00_VERSION_LINE::1709:3:1::lc:gpfs3.hilbert.hpc.uni-duesseldorf.de::0:/bin/ssh:/bin/scp:5362040003754711198:lc2:1597757602::HPCStorage.hilbert.hpc.uni-duesseldorf.de:2:1:1:2:A:::central:0.0: %%home%%:20_MEMBER_NODE::5:20:hpc-storage-1
> root 38169 0.0 0.0 123564 10892 ? S 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote ccrctl setupClient 2 21479 1=gpfs3-ib.hilbert.hpc.uni-duesseldorf.de:1191,2=gpfs4-ib.hilbert.hpc.uni-duesseldorf.de:1191,4=gpfs6-ib.hilbert.hpc.uni-duesseldorf.de:1191,3=gpfs5-ib.hilbert.hpc.uni-duesseldorf.de:1191 0 1191
> root 38212 100 0.0 35544 5752 ? R 11:32 9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off
>
> The node is an AMD EPYC.
>
> Any idea what could cause the issue?
>
> SSH is possible in both directions and the firewall is disabled.
>
> Kind regards
>
> Philipp Rehs
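The forward and reverse lookup check suggested in this reply can be sketched roughly as follows. This is only an illustration using Python's standard socket module and is not part of any GPFS tooling. The function name check_name_resolution and the lenient short-name matching are choices made for this example; the resolver functions are passed in as parameters so the logic can be exercised without touching real DNS.

```python
import socket
import sys


def check_name_resolution(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an address and back; report whether the names agree."""
    try:
        address = forward(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return {"hostname": hostname, "ok": False,
                "error": f"forward lookup failed: {exc}"}
    try:
        primary, aliases, _addresses = reverse(address)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return {"hostname": hostname, "address": address, "ok": False,
                "error": f"reverse lookup failed: {exc}"}

    # Compare case-insensitively and accept a match on the short host name,
    # so "gpfs3" and "gpfs3.example.org" count as the same node.
    names = {primary.lower(), *(alias.lower() for alias in aliases)}
    short_names = {name.split(".")[0] for name in names}
    wanted = hostname.lower()
    ok = wanted in names or wanted.split(".")[0] in short_names
    return {"hostname": hostname, "address": address,
            "reverse_names": sorted(names), "ok": ok}


if __name__ == "__main__":
    # Usage: pass the node names to check as command-line arguments.
    for name in sys.argv[1:]:
        print(check_name_resolution(name))
```

Run it on both the new client and one of the existing servers with each other's node names as arguments; a result with "ok" set to False points at the DNS or /etc/hosts entries mentioned above.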
Re: [gpfsug-discuss] tsgskkm stuck
Not sure that Spectrum Scale has stated it supports the AMD epyc (Rome?) processors. You may want to open a help case to determine the cause of this problem.

Note that Spectrum Scale 4.2.x goes out of service on September 30, 2020 so you may want to consider upgrading your cluster. And should Scale officially support the AMD epyc processor it would not be on Scale 4.2.x.

Fred
__
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com

- Original message -
From: Philipp Helo Rehs
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug main discussion list
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] tsgskkm stuck
Date: Fri, Aug 28, 2020 5:52 AM

> Hello,
>
> we have a GPFS v4 cluster running with 4 NSDs and I am trying to add some clients:
>
> mmaddnode -N hpc-storage-1-ib:client:hpc-storage-1
>
> This command hangs and does not finish.
>
> When I look into the server, I can see the following processes, which never finish:
>
> root 38138 0.0 0.0 123048 10376 ? Ss 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote checkNewClusterNode3 lc/setupClient %%%%:00_VERSION_LINE::1709:3:1::lc:gpfs3.hilbert.hpc.uni-duesseldorf.de::0:/bin/ssh:/bin/scp:5362040003754711198:lc2:1597757602::HPCStorage.hilbert.hpc.uni-duesseldorf.de:2:1:1:2:A:::central:0.0: %%home%%:20_MEMBER_NODE::5:20:hpc-storage-1
> root 38169 0.0 0.0 123564 10892 ? S 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote ccrctl setupClient 2 21479 1=gpfs3-ib.hilbert.hpc.uni-duesseldorf.de:1191,2=gpfs4-ib.hilbert.hpc.uni-duesseldorf.de:1191,4=gpfs6-ib.hilbert.hpc.uni-duesseldorf.de:1191,3=gpfs5-ib.hilbert.hpc.uni-duesseldorf.de:1191 0 1191
> root 38212 100 0.0 35544 5752 ? R 11:32 9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off
>
> The node is an AMD EPYC.
>
> Any idea what could cause the issue?
>
> SSH is possible in both directions and the firewall is disabled.
>
> Kind regards
>
> Philipp Rehs
[gpfsug-discuss] tsgskkm stuck
Hello,

we have a GPFS v4 cluster running with 4 NSDs and I am trying to add some clients:

mmaddnode -N hpc-storage-1-ib:client:hpc-storage-1

This command hangs and does not finish.

When I look into the server, I can see the following processes, which never finish:

root 38138 0.0 0.0 123048 10376 ? Ss 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote checkNewClusterNode3 lc/setupClient %%%%:00_VERSION_LINE::1709:3:1::lc:gpfs3.hilbert.hpc.uni-duesseldorf.de::0:/bin/ssh:/bin/scp:5362040003754711198:lc2:1597757602::HPCStorage.hilbert.hpc.uni-duesseldorf.de:2:1:1:2:A:::central:0.0: %%home%%:20_MEMBER_NODE::5:20:hpc-storage-1
root 38169 0.0 0.0 123564 10892 ? S 11:32 0:00 /usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmremote ccrctl setupClient 2 21479 1=gpfs3-ib.hilbert.hpc.uni-duesseldorf.de:1191,2=gpfs4-ib.hilbert.hpc.uni-duesseldorf.de:1191,4=gpfs6-ib.hilbert.hpc.uni-duesseldorf.de:1191,3=gpfs5-ib.hilbert.hpc.uni-duesseldorf.de:1191 0 1191
root 38212 100 0.0 35544 5752 ? R 11:32 9:40 /usr/lpp/mmfs/bin/tsgskkm store --cert /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.cert --priv /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.priv --out /var/mmfs/ssl/stage/tmpKeyData.mmremote.38169.keystore --fips off

The node is an AMD EPYC.

Any idea what could cause the issue?

SSH is possible in both directions and the firewall is disabled.

Kind regards

Philipp Rehs