And there you have it:

[ems1-fdr,compute,gss_ppc64]
verbsRdmaSend yes

Try turning this off.

-jf
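For reference, a minimal sketch of what turning it off would look like; this is an assumption on my part rather than a tested recipe for this cluster, and verbsRdmaSend may need a daemon restart (mmshutdown/mmstartup) to take effect, so coordinate it with the PMR first:

  # disable verbsRdmaSend cluster-wide (add -N <node,nodeclass,...> to scope it)
  mmchconfig verbsRdmaSend=no

  # confirm the resulting setting
  mmlsconfig verbsRdmaSend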
On Wed, 11 Jan 2017 at 18:54, Damir Krstic <[email protected]> wrote:

> Thanks for all the suggestions. Here is our mmlsconfig file. We just
> purchased another GL6. During the installation of the new GL6, IBM will
> upgrade our existing GL6 to the latest code levels. This will happen
> during the week of 23 January.
>
> I am skeptical that the upgrade is going to fix the issue.
>
> On our IO servers we are running in connected mode (please note that the
> IB interfaces are bonded):
>
> [root@gssio1 ~]# cat /sys/class/net/ib0/mode
> connected
> [root@gssio1 ~]# cat /sys/class/net/ib1/mode
> connected
> [root@gssio1 ~]# cat /sys/class/net/ib2/mode
> connected
> [root@gssio1 ~]# cat /sys/class/net/ib3/mode
> connected
>
> [root@gssio2 ~]# cat /sys/class/net/ib0/mode
> connected
> [root@gssio2 ~]# cat /sys/class/net/ib1/mode
> connected
> [root@gssio2 ~]# cat /sys/class/net/ib2/mode
> connected
> [root@gssio2 ~]# cat /sys/class/net/ib3/mode
> connected
>
> Our login nodes are also running in connected mode.
>
> However, all of our compute nodes are running in datagram mode:
>
> [root@mgt ~]# psh compute cat /sys/class/net/ib0/mode
> qnode0758: datagram
> qnode0763: datagram
> qnode0760: datagram
> qnode0772: datagram
> qnode0773: datagram
> ...etc.
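A side note on the connected/datagram split above: if the compute fleet were to be moved to connected mode to match the servers, the runtime toggle is the same sysfs file, and persistence on RHEL-family nodes is an ifcfg setting. This is a sketch only, assuming the same psh fan-out shown above; connected mode also changes the IPoIB MTU, so it should be coordinated fabric-wide rather than done casually:

  # runtime switch (lost on reboot); the quotes make the redirect run remotely
  psh compute 'echo connected > /sys/class/net/ib0/mode'

  # persist it on RHEL/CentOS by setting CONNECTED_MODE=yes in ifcfg-ib0
  psh compute 'echo CONNECTED_MODE=yes >> /etc/sysconfig/network-scripts/ifcfg-ib0'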
> Here is our mmlsconfig:
>
> [root@gssio1 ~]# mmlsconfig
> Configuration data for cluster ess-qstorage.it.northwestern.edu:
> ----------------------------------------------------------------
> clusterName ess-qstorage.it.northwestern.edu
> clusterId 17746506346828356609
> dmapiFileHandleSize 32
> minReleaseLevel 4.2.0.1
> ccrEnabled yes
> cipherList AUTHONLY
> [gss_ppc64]
> nsdRAIDBufferPoolSizePct 80
> maxBufferDescs 2m
> prefetchPct 5
> nsdRAIDTracks 128k
> nsdRAIDSmallBufferSize 256k
> nsdMaxWorkerThreads 3k
> nsdMinWorkerThreads 3k
> nsdRAIDSmallThreadRatio 2
> nsdRAIDThreadsPerQueue 16
> nsdRAIDEventLogToConsole all
> nsdRAIDFastWriteFSDataLimit 256k
> nsdRAIDFastWriteFSMetadataLimit 1M
> nsdRAIDReconstructAggressiveness 1
> nsdRAIDFlusherBuffersLowWatermarkPct 20
> nsdRAIDFlusherBuffersLimitPct 80
> nsdRAIDFlusherTracksLowWatermarkPct 20
> nsdRAIDFlusherTracksLimitPct 80
> nsdRAIDFlusherFWLogHighWatermarkMB 1000
> nsdRAIDFlusherFWLogLimitMB 5000
> nsdRAIDFlusherThreadsLowWatermark 1
> nsdRAIDFlusherThreadsHighWatermark 512
> nsdRAIDBlockDeviceMaxSectorsKB 8192
> nsdRAIDBlockDeviceNrRequests 32
> nsdRAIDBlockDeviceQueueDepth 16
> nsdRAIDBlockDeviceScheduler deadline
> nsdRAIDMaxTransientStale2FT 1
> nsdRAIDMaxTransientStale3FT 1
> nsdMultiQueue 512
> syncWorkerThreads 256
> nsdInlineWriteMax 32k
> maxGeneralThreads 1280
> maxReceiverThreads 128
> nspdQueues 64
> [common]
> maxblocksize 16m
> [ems1-fdr,compute,gss_ppc64]
> numaMemoryInterleave yes
> [gss_ppc64]
> maxFilesToCache 12k
> [ems1-fdr,compute]
> maxFilesToCache 128k
> [ems1-fdr,compute,gss_ppc64]
> flushedDataTarget 1024
> flushedInodeTarget 1024
> maxFileCleaners 1024
> maxBufferCleaners 1024
> logBufferCount 20
> logWrapAmountPct 2
> logWrapThreads 128
> maxAllocRegionsPerNode 32
> maxBackgroundDeletionThreads 16
> maxInodeDeallocPrefetch 128
> [gss_ppc64]
> maxMBpS 16000
> [ems1-fdr,compute]
> maxMBpS 10000
> [ems1-fdr,compute,gss_ppc64]
> worker1Threads 1024
> worker3Threads 32
> [gss_ppc64]
> ioHistorySize 64k
> [ems1-fdr,compute]
> ioHistorySize 4k
> [gss_ppc64]
> verbsRdmaMinBytes 16k
> [ems1-fdr,compute]
> verbsRdmaMinBytes 32k
> [ems1-fdr,compute,gss_ppc64]
> verbsRdmaSend yes
> [gss_ppc64]
> verbsRdmasPerConnection 16
> [ems1-fdr,compute]
> verbsRdmasPerConnection 256
> [gss_ppc64]
> verbsRdmasPerNode 3200
> [ems1-fdr,compute]
> verbsRdmasPerNode 1024
> [ems1-fdr,compute,gss_ppc64]
> verbsSendBufferMemoryMB 1024
> verbsRdmasPerNodeOptimize yes
> verbsRdmaUseMultiCqThreads yes
> [ems1-fdr,compute]
> ignorePrefetchLUNCount yes
> [gss_ppc64]
> scatterBufferSize 256K
> [ems1-fdr,compute]
> scatterBufferSize 256k
> syncIntervalStrict yes
> [ems1-fdr,compute,gss_ppc64]
> nsdClientCksumTypeLocal ck64
> nsdClientCksumTypeRemote ck64
> [gss_ppc64]
> pagepool 72856M
> [ems1-fdr]
> pagepool 17544M
> [compute]
> pagepool 4g
> [ems1-fdr,qsched03-ib0,quser10-fdr,compute,gss_ppc64]
> verbsRdma enable
> [gss_ppc64]
> verbsPorts mlx5_0/1 mlx5_0/2 mlx5_1/1 mlx5_1/2
> [ems1-fdr]
> verbsPorts mlx5_0/1 mlx5_0/2
> [qsched03-ib0,quser10-fdr,compute]
> verbsPorts mlx4_0/1
> [common]
> autoload no
> [ems1-fdr,compute,gss_ppc64]
> maxStatCache 0
> [common]
> envVar MLX4_USE_MUTEX=1 MLX5_SHUT_UP_BF=1 MLX5_USE_MUTEX=1
> deadlockOverloadThreshold 0
> deadlockDetectionThreshold 0
> adminMode central
>
> File systems in cluster ess-qstorage.it.northwestern.edu:
> ---------------------------------------------------------
> /dev/home
> /dev/hpc
> /dev/projects
> /dev/tthome
>
> On Wed, Jan 11, 2017 at 9:16 AM Luis Bolinches <[email protected]> wrote:
>
> In addition to what Olaf has said:
>
> ESS upgrades include Mellanox module upgrades on the ESS nodes. In fact,
> on those nodes you should not update those modules on their own (unless
> support says so in your PMR), so if that has been the recommendation, I
> suggest you look at it.
>
> Changelog on ESS 4.0.4 (no idea what ESS level you are running):
>
> c) Support of MLNX_OFED_LINUX-3.2-2.0.0.1
>    - Updated from MLNX_OFED_LINUX-3.1-1.0.6.1 (ESS 4.0, 4.0.1, 4.0.2)
>    - Updated from MLNX_OFED_LINUX-3.1-1.0.0.2 (ESS 3.5.x)
>    - Updated from MLNX_OFED_LINUX-2.4-1.0.2 (ESS 3.0.x)
>    - Support for PCIe3 LP 2-port 100 Gb EDR InfiniBand adapter x16 (FC EC3E)
>      - Requires System FW level FW840.20 (SV840_104)
>    - No changes from ESS 4.0.3
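As an aside, if you are unsure which MLNX_OFED level a node is actually running, a quick check is the following (assuming the Mellanox OFED userspace tools are installed on the node):

  ofed_info -s                           # installed MLNX_OFED release string
  ibv_devinfo | grep -E 'hca_id|fw_ver'  # adapter and firmware level per HCA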
> --
> Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations
>
> Luis Bolinches
> Lab Services
> http://www-03.ibm.com/systems/services/labservices/
>
> IBM, Laajalahdentie 23 (main entrance), 00330 Helsinki, Finland
> Phone: +358 503112585
>
> "If you continually give you will continually have." Anonymous
>
> ----- Original message -----
> From: "Olaf Weiser" <[email protected]>
> Sent by: [email protected]
> To: gpfsug main discussion list <[email protected]>
> Subject: Re: [gpfsug-discuss] nodes being ejected out of the cluster
> Date: Wed, Jan 11, 2017 5:03 PM
>
> Most likely there is something wrong with your IB fabric... You say you
> run ~700 nodes? Are you running with verbsRdmaSend enabled? If so, please
> consider disabling it, and discuss this within the PMR. Another thing you
> may check: are you running IPoIB in connected mode or datagram? But as I
> said, please discuss this within the PMR; there are too many dependencies
> to discuss it here.
>
> Cheers
>
> Mit freundlichen Grüßen / Kind regards
>
> Olaf Weiser
> EMEA Storage Competence Center Mainz, Germany / IBM Systems, Storage Platform
> IBM Deutschland, IBM Allee 1, 71139 Ehningen
> Phone: +49-170-579-44-66
> E-Mail: [email protected]
>
> From: Damir Krstic <[email protected]>
> To: gpfsug main discussion list <[email protected]>
> Date: 01/11/2017 03:39 PM
> Subject: [gpfsug-discuss] nodes being ejected out of the cluster
> Sent by: [email protected]
>
> We are running GPFS 4.2 on our cluster (around 700 compute nodes). Our
> storage (ESS GL6) is also running GPFS 4.2. Compute nodes and storage are
> connected via InfiniBand (FDR14). At the time of the ESS implementation
> we were instructed to enable RDMA in addition to IPoIB. Previously we ran
> only IPoIB on our GPFS 3.5 cluster.
>
> Ever since the implementation (sometime back in July 2016) we have seen a
> lot of compute nodes being ejected. What usually precedes the ejection
> are messages like the following:
>
> Jan 11 02:03:15 quser13 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 vendor_err 135
> Jan 11 02:03:15 quser13 mmfs: [E] VERBS RDMA closed connection to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 2
> Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 vendor_err 135
> Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA closed connection to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error IBV_WC_WR_FLUSH_ERR index 1
> Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 vendor_err 135
> Jan 11 02:03:26 quser13 mmfs: [E] VERBS RDMA closed connection to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 2
> Jan 11 02:06:38 quser11 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 vendor_err 135
> Jan 11 02:06:38 quser11 mmfs: [E] VERBS RDMA closed connection to 172.41.2.5 (gssio2-fdr) on mlx4_0 port 1 fabnum 0 due to send error IBV_WC_WR_FLUSH_ERR index 400
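A note on the error itself: IBV_WC_RNR_RETRY_EXC_ERR means the remote side repeatedly had no receive buffer posted (receiver not ready) and the sender exhausted its RNR retries, so it can point at an overloaded peer rather than a bad link, though physical-layer trouble is still worth ruling out. A sketch with the standard infiniband-diags tools, run from any fabric-attached host:

  ibqueryerrors  # report non-zero port error counters across the fabric
  iblinkinfo     # per-port link state, width, and speed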
> Even our ESS IO server sometimes ends up being ejected (case in point:
> yesterday morning):
>
> Jan 10 11:23:42 gssio2 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 1 fabnum 0 vendor_err 135
> Jan 10 11:23:42 gssio2 mmfs: [E] VERBS RDMA closed connection to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 1 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 3001
> Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 2 fabnum 0 vendor_err 135
> Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA closed connection to 172.41.2.1 (gssio1-fdr) on mlx5_1 port 2 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 2671
> Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 2 fabnum 0 vendor_err 135
> Jan 10 11:23:43 gssio2 mmfs: [E] VERBS RDMA closed connection to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 2 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 2495
> Jan 10 11:23:44 gssio2 mmfs: [E] VERBS RDMA rdma send error IBV_WC_RNR_RETRY_EXC_ERR to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 1 fabnum 0 vendor_err 135
> Jan 10 11:23:44 gssio2 mmfs: [E] VERBS RDMA closed connection to 172.41.2.1 (gssio1-fdr) on mlx5_0 port 1 fabnum 0 due to send error IBV_WC_RNR_RETRY_EXC_ERR index 3077
> Jan 10 11:24:11 gssio2 mmfs: [N] Node 172.41.2.1 (gssio1-fdr) lease renewal is overdue. Pinging to check if it is alive
>
> I've had multiple PMRs open for this issue, and I am told that our ESS
> needs code-level upgrades in order to fix it. Looking at the errors, I
> think the issue is InfiniBand-related, and I am wondering if anyone on
> this list has seen similar issues?
>
> Thanks for your help in advance.
>
> Damir
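When expels like this fire, it can also help to look at the connection state from GPFS's own side, on the cluster manager (the node that makes the expel decision). A sketch, assuming the standard Spectrum Scale 4.2 admin commands:

  mmlsmgr           # identify the cluster manager node
  mmdiag --network  # on that node: per-peer connection state and pending messages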
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
