[gpfsug-discuss] SSUG @ SC19 Update: Scheduling and Sponsorship Opportunities

2019-09-16 Thread Oesterlin, Robert
Two months until SC19 and the schedule is starting to come together, with a 
great mix of technical updates and user talks. I would like to highlight a few 
items for you to be aware of:

- Morning session: We’re currently trying to put together a morning “new users” 
session for those new to Spectrum Scale. These talks would focus on 
fundamentals and give attendees an opportunity to ask questions. We’re tentatively 
thinking about starting around 9:30-10 AM on Sunday, November 17th. Watch the 
mailing list and the http://spectrumscale.org site for updates.
- Sponsorships: We’re looking for sponsors. If your company is an IBM partner 
or uses or incorporates Spectrum Scale, please contact me or Kristy 
Kallback-Rose. We are looking for sponsors to help with lunch (YES, we’d like 
to serve lunch this year!) and WiFi access during the user group meeting.

Looking forward to seeing you all at SC19. The registration link is coming soon; watch 
here: 
https://www.spectrumscaleug.org/event/spectrum-scale-user-group-meeting-sc19/

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?

2019-09-16 Thread Christopher Black
On our recent ESS systems we do not see /etc/tuned/scale/tuned.conf (or 
script.sh) owned by any package (rpm -qif …).
I’ve attached what we have on our ESS 5.3.3 systems.
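
The ownership check mentioned above can be scripted. This is a minimal sketch that assumes `rpm` is on the PATH and that `rpm -qf` exits non-zero when no package owns the file. `owning_package` is a hypothetical helper name, not part of any Spectrum Scale or ESS tooling.

```python
import subprocess

def owning_package(path, run=subprocess.run):
    """Return the package(s) reported by `rpm -qf path`, or None if rpm reports no owner."""
    result = run(["rpm", "-qf", path], capture_output=True, text=True)
    if result.returncode != 0:
        # Covers both "not owned by any package" and "no such file".
        return None
    return result.stdout.strip()

if __name__ == "__main__":
    for path in ("/etc/tuned/scale/tuned.conf", "/etc/tuned/scale/script.sh"):
        print(path, "->", owning_package(path))
```

The `run` parameter exists only so the function can be exercised without a real rpm database.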

Best,
Chris

From:  on behalf of "Wahl, Edward" 

Reply-To: gpfsug main discussion list 
Date: Monday, September 16, 2019 at 10:50 AM
To: gpfsug main discussion list 
Subject: Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be 
expected?

What package provides this /usr/lib/tuned/  file?

Ed



From: gpfsug-discuss-boun...@spectrumscale.org 
 on behalf of Olaf Weiser 

Sent: Monday, September 16, 2019 3:12 AM
To: gpfsug main discussion list 
Subject: Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be 
expected?

Hello Heiner,
Spectrum Scale usually comes with a tuned profile (named scale):

[root@nsd01 ~]# tuned-adm active
Current active profile: scale

That profile contains:
[root@nsd01 ~]# cat /etc/tuned/scale/tuned.conf | tail -3
# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
[root@nsd01 ~]#

Depending on what you need to achieve, you might have to change that; 
e.g. for RoCE, IPv6 needs to be active.
For all other scenarios with Spectrum Scale (at least those I'm aware of 
right now), IPv6 can be disabled.
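
To check what a profile like this would set, the sysctl-style lines can be read programmatically. This is a minimal sketch that assumes plain key=value lines as in the excerpt above. `parse_sysctl_lines` and `ipv6_disabled_by` are hypothetical helper names, and the parser deliberately ignores which section of tuned.conf a line belongs to.

```python
def parse_sysctl_lines(text):
    """Return {key: value} for key=value lines, skipping comments and [section] headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

def ipv6_disabled_by(settings):
    """True if both the 'all' and 'default' disable_ipv6 keys are set to 1."""
    keys = ("net.ipv6.conf.all.disable_ipv6",
            "net.ipv6.conf.default.disable_ipv6")
    return all(settings.get(key) == "1" for key in keys)

# The three lines shown by `tail -3` in the message above.
SAMPLE = """# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
"""

if __name__ == "__main__":
    print(ipv6_disabled_by(parse_sysctl_lines(SAMPLE)))
```

The value the running kernel actually uses can be read from /proc/sys/net/ipv6/conf/all/disable_ipv6, which may differ from the profile if something else changed it later.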







From: "Billich  Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 09/13/2019 05:02 PM
Subject: [EXTERNAL] [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?
Sent by: gpfsug-discuss-boun...@spectrumscale.org





Hello,

I just noticed that our ganesha daemons offer IPv6 sockets only; IPv4 traffic 
gets encapsulated.  But all traffic to samba is IPv4, and smbd offers both IPv4 and 
IPv6 sockets.
Is this to be expected? The protocols support IPv4 only, so 
why does ganesha run on IPv6 sockets only? Did we configure something wrong, 
and should we completely disable IPv6 at the kernel level?

Any comment is welcome

Cheers,
Heiner
--
===
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.bill...@id.ethz.ch



I checked with

 ss -l -t -4
 ss -l -t -6

Add -p to get the process name, too.

Do you get the same results on your CES nodes?


[root@nas22ces04-i config_samples]#   ss -l -t   -4
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       8192    *:gpfs               *:*
LISTEN  0       50      *:netbios-ssn        *:*
LISTEN  0       128     *:5355               *:*
LISTEN  0       128     *:sunrpc             *:*
LISTEN  0       128     *:ssh                *:*
LISTEN  0       100     127.0.0.1:smtp       *:*
LISTEN  0       10      10.250.135.24:4379   *:*
LISTEN  0       128     *:32765              *:*
LISTEN  0       50      *:microsoft-ds       *:*
[root@nas22ces04-i config_samples]#   ss -l -t   -6
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       128     :::32767
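
One possible explanation for the IPv6-only listeners, not confirmed anywhere in this thread: on Linux, a single IPv6 listening socket bound to the wildcard address also accepts IPv4 clients when the IPV6_V6ONLY socket option is off, and reports them as IPv4-mapped addresses (::ffff:a.b.c.d). Whether ganesha is configured this way here is an assumption. The sketch below shows the mechanism with Python's standard library. The function names are hypothetical, and `dual_stack_demo` needs IPv6 enabled on the host.

```python
import ipaddress
import socket

def ipv4_behind(addr):
    """Return the IPv4 address embedded in an IPv4-mapped IPv6 address, else None."""
    mapped = ipaddress.IPv6Address(addr).ipv4_mapped
    return str(mapped) if mapped is not None else None

def dual_stack_demo():
    """Listen on an IPv6 wildcard socket, connect over IPv4, return the peer address seen."""
    with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as server:
        # 0 means: also accept IPv4 peers on this IPv6 socket (dual-stack).
        server.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        server.bind(("::", 0))      # port 0 lets the kernel pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port)):
            conn, peer = server.accept()
            conn.close()
            return peer[0]

if __name__ == "__main__":
    seen = dual_stack_demo()
    print(seen, "->", ipv4_behind(seen))
```

With such a socket, `ss -l -t -6` would list the listener while `ss -l -t -4` would not, even though IPv4 clients can connect.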
  

Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?

2019-09-16 Thread Wahl, Edward
What package provides this /usr/lib/tuned/  file?

Ed




[gpfsug-discuss] Can 5-minutes frequent lsscsi command disrupt GPFS I/O on a Lenovo system ?

2019-09-16 Thread Dorigo Alvise (PSI)
Hello folks,
I recently observed that running the command "lsscsi -g" every 5 minutes on a 
Lenovo I/O node (an X3650 M5 connected to D3284 enclosures, part of a DSS-G220 
system) can seriously degrade GPFS I/O performance.

(The motivation for running lsscsi every 5 minutes is a bit off topic, but I 
can explain on request.)

We observed several GPFS waiters reporting that flushing caches to physical 
disk was not possible and that they had to wait (possibly timing out).

Is this expected, and/or has anyone else in this community observed it?

Thanks
Regards,

   Alvise Dorigo


Re: [gpfsug-discuss] VerbsReconnectThread waiters

2019-09-16 Thread IBM Spectrum Scale

Damir, Joseph,

> Is this something to pay attention to, and what does this waiter mean?
This waiter means that GPFS is failing to reconnect a broken verbs connection,
which can cause performance degradation.

> I have seen these on our cluster after the IB network goes down (GPFS
> still runs over ethernet) and then comes back up.  They will retry forever
> it seems, even after the IB is healthy again.
> Restarting GPFS on the nodes with waiters has fixed the issue for me, I
> don't know if IBM has any other tricks to fix this without a restart.

This is a code bug, fixed under internal defect 1090669. The fix will
be backported to service releases after verification.
There is a work-around that fixes this problem without a restart:
-   On nodes that list this waiter, run the command
    'mmfsadm test breakconn all 744'
    744 is E_RECONNECT, which triggers a TCP reconnect and does not cause a
    node leave/rejoin. As a side effect, it clears the RDMA connections and
    their incorrect status.
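
To find the nodes that need the work-around, collected waiter output can be filtered for VerbsReconnectThread. This is a minimal sketch that assumes lines in the "node: Waiting ... sec" format shown in Damir's message below. The function name is hypothetical and the gss02 line is made-up filler. The script only prints the command from this message; it does not run it.

```python
import re

# A waiter record starts with "<node>: Waiting <seconds> sec"; mail wrapping
# may push the rest of the record onto the following lines.
RECORD_START = re.compile(r"^\s*(?P<node>[^\s:]+):\s+Waiting\s+[\d.]+\s+sec")

def nodes_with_verbs_reconnect_waiters(text):
    """Return node names (first-seen order) that show a VerbsReconnectThread waiter."""
    records = []
    for line in text.splitlines():
        if RECORD_START.match(line):
            records.append(line.strip())
        elif records and line.strip():
            records[-1] += " " + line.strip()  # continuation of a wrapped record
    nodes = []
    for record in records:
        node = RECORD_START.match(record).group("node")
        if "VerbsReconnectThread" in record and node not in nodes:
            nodes.append(node)
    return nodes

# First record is taken from the thread; the gss02 record is hypothetical.
SAMPLE = """\
gss01: Waiting 16.8543 sec since 09:07:02, ignored, thread 46230
  VerbsReconnectThread: delaying for 43.145624000 more seconds, reason:
  delaying for next reconnect attempt
gss02: Waiting 0.0021 sec since 09:07:18, monitored, thread 1234 NSDThread: for I/O completion
"""

if __name__ == "__main__":
    for node in nodes_with_verbs_reconnect_waiters(SAMPLE):
        print(f"{node}: mmfsadm test breakconn all 744")
```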

Regards, The Spectrum Scale (GPFS) team

--

If you feel that your question can benefit other users of Spectrum Scale
(GPFS), then please post it to the public IBM developerWorks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=----0479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract, please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.



From:    Joseph Mendoza
To:      gpfsug-discuss@spectrumscale.org
Date:    2019/09/14 12:08 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] VerbsReconnectThread waiters
Sent by: gpfsug-discuss-boun...@spectrumscale.org



I have seen these on our cluster after the IB network goes down (GPFS still
runs over ethernet) and then comes back up.  They will retry forever it
seems, even after the IB is healthy again.  The effect they seem to have is
that verbs connections between some nodes break and GPFS uses
ethernet/ipoib instead.  You may see messages in your mmfs.log.latest about
verbs being disabled "due to too many errors".  You can also see fewer
verbs connections between nodes in "mmfsadm test verbs conn" output.


Restarting GPFS on the nodes with waiters has fixed the issue for me, I
don't know if IBM has any other tricks to fix this without a restart.


--Joey





On 9/12/19 8:16 AM, Damir Krstic wrote:
  On my cluster I have seen a couple of long waiters such as this:

  gss01: Waiting 16.8543 sec since 09:07:02, ignored, thread 46230
  VerbsReconnectThread: delaying for 43.145624000 more seconds, reason:
  delaying for next reconnect attempt

  I tried searching the GPFS wiki for this type of waiter, but was
  unable to find anything of value.

  Is this something to pay attention to, and what does this waiter
  mean?

  Thank you.
  Damir



Re: [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?

2019-09-16 Thread Olaf Weiser

Hello Heiner,
Spectrum Scale usually comes with a tuned profile (named scale):

[root@nsd01 ~]# tuned-adm active
Current active profile: scale

That profile contains:
[root@nsd01 ~]# cat /etc/tuned/scale/tuned.conf | tail -3
# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
[root@nsd01 ~]#

Depending on what you need to achieve, you might have to change that;
e.g. for RoCE, IPv6 needs to be active.
For all other scenarios with Spectrum Scale (at least those I'm aware of
right now), IPv6 can be disabled.


From: "Billich  Heinrich Rainer (ID SD)"
To: gpfsug main discussion list
Date: 09/13/2019 05:02 PM
Subject: [EXTERNAL] [gpfsug-discuss] Ganesha all IPv6 sockets - ist this to be expected?
Sent by: gpfsug-discuss-boun...@spectrumscale.org


Hello,

I just noticed that our ganesha daemons offer IPv6 sockets only; IPv4 traffic
gets encapsulated.  But all traffic to samba is IPv4, and smbd offers both
IPv4 and IPv6 sockets.
Is this to be expected? The protocols support IPv4 only, so why does ganesha
run on IPv6 sockets only? Did we configure something wrong, and should we
completely disable IPv6 at the kernel level?

Any comment is welcome

Cheers,
Heiner
--
===
Heinrich Billich
ETH Zürich
Informatikdienste
Tel.: +41 44 632 72 56
heinrich.bill...@id.ethz.ch


I checked with

 ss -l -t -4
 ss -l -t -6

Add -p to get the process name, too.

Do you get the same results on your CES nodes?


[root@nas22ces04-i config_samples]#   ss -l -t   -4
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       8192    *:gpfs               *:*
LISTEN  0       50      *:netbios-ssn        *:*
LISTEN  0       128     *:5355               *:*
LISTEN  0       128     *:sunrpc             *:*
LISTEN  0       128     *:ssh                *:*
LISTEN  0       100     127.0.0.1:smtp       *:*
LISTEN  0       10      10.250.135.24:4379   *:*
LISTEN  0       128     *:32765              *:*
LISTEN  0       50      *:microsoft-ds       *:*
[root@nas22ces04-i config_samples]#   ss -l -t   -6
State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
LISTEN  0       128     :::32767             :::*
LISTEN  0       128     :::32768             :::*
LISTEN  0       128     :::32769             :::*
LISTEN  0       128     :::2049              :::*
LISTEN  0       128     :::5355              :::*
LISTEN