Re: [gpfsug-discuss] cpu shielding
Vaguely related: we used to see the out-of-memory killer regularly go for mmfsd, which would then kill user processes and pbs_mom, which ran from GPFS. We modified the GPFS init script to set the OOM score for mmfsd to help prevent this. (We also modified it to wait for IB to come up; I need to revisit this now, I guess, as there is systemd support in 4.2.0.1, so we should be able to set a .wants there.)

Simon

From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Bryan Banister [bbanis...@jumptrading.com]
Sent: 02 March 2016 20:17
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] cpu shielding

I would agree with Vic that in most cases the issues are with the underlying network communication. We are mainly using the cgroups to protect against runaway processes that attempt to consume all memory on the system.
-Bryan

-----Original Message-----
From: gpfsug-discuss-boun...@spectrumscale.org [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of viccorn...@gmail.com
Sent: Wednesday, March 02, 2016 2:15 PM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] cpu shielding

Hi,

How sure are you that it is CPU scheduling that is your problem? Are you using IB or Ethernet? I have seen problems that look like yours in the past with single-network Ethernet setups.

Regards,
Vic

Sent from my iPhone

> On 2 Mar 2016, at 20:54, Matt Weil wrote:
>
> Can you share anything more?
> We are trying all system-related items on cpu0; GPFS is on cpu1 and the
> rest are used for the LSF scheduler. With that setup we still see
> evictions.
>
> Thanks
> Matt
>
>> On 3/2/16 1:49 PM, Bryan Banister wrote:
>> We do use cgroups to isolate user applications into a separate cgroup, which
>> provides some headroom of CPU and memory resources for the rest of the
>> system services, including GPFS and its required components such as SSH, etc.
>> -B
>>
>> -----Original Message-----
>> From: gpfsug-discuss-boun...@spectrumscale.org
>> [mailto:gpfsug-discuss-boun...@spectrumscale.org] On Behalf Of Matt Weil
>> Sent: Wednesday, March 02, 2016 1:47 PM
>> To: gpfsug main discussion list
>> Subject: [gpfsug-discuss] cpu shielding
>>
>> All,
>>
>> We are seeing issues on our GPFS clients where mmfsd is not able to respond
>> in time to renew its lease. Once that happens the file system is unmounted.
>> We are experimenting with cgroups to tie mmfsd and others to specified
>> CPUs. Any recommendations out there on how to shield GPFS from other
>> processes?
>>
>> Our system design has all PCI going through the first socket, and there seems
>> to be some contention there, as the RAID controller with the SSDs and the NICs
>> are on that same bus.
>>
>> Thanks
>> Matt

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
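Simon's init-script OOM tweak can be sketched roughly as follows. This is a minimal sketch, assuming a Linux node with the standard /proc OOM interface; the mmfsd process name comes from the thread, while the helper name and the score value are illustrative (an init-script hook would do the same thing at daemon start).

```shell
#!/bin/sh
# Make a process a much less attractive OOM-killer victim by lowering its
# oom_score_adj. -1000 exempts it from OOM-killing entirely; lowering the
# score requires root.
protect_from_oom() {
    pid="$1"
    echo -1000 > "/proc/${pid}/oom_score_adj"
}

# Apply to every running mmfsd (a no-op if GPFS is not running).
for pid in $(pgrep -x mmfsd); do
    protect_from_oom "$pid"
done
```

An unprivileged process may always raise (but never lower) its own score, which is a convenient way to check that the /proc interface behaves as expected before wiring this into a boot script.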
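The cgroup shielding Bryan and Matt describe might look something like the sketch below: one cpuset cgroup for GPFS on dedicated cores, another for user workloads. This assumes cgroup v1 (current for RHEL 6/7 at the time); the core ranges, memory node, and mount point are illustrative only, and the commands are wrapped in a function that only runs as root on a node that actually has a v1 cpuset hierarchy, so nothing is changed on an ordinary machine.

```shell
#!/bin/sh
# Hypothetical cpuset layout: cores 2-3 reserved for mmfsd, the rest for
# user workloads. Adjust the ranges to your socket/PCI topology.
CG=/sys/fs/cgroup/cpuset

shield_gpfs() {
    mkdir -p "$CG/gpfs" "$CG/user"

    echo 2-3  > "$CG/gpfs/cpuset.cpus"   # dedicated cores for GPFS
    echo 0    > "$CG/gpfs/cpuset.mems"   # memory node 0
    echo 4-15 > "$CG/user/cpuset.cpus"   # everything else for user jobs
    echo 0    > "$CG/user/cpuset.mems"

    # Move every mmfsd process into the shielded group.
    for pid in $(pgrep -x mmfsd); do
        echo "$pid" > "$CG/gpfs/cgroup.procs"
    done
}

# Only attempt this as root on a node that mounts the v1 cpuset controller.
if [ "$(id -u)" -eq 0 ] && [ -d "$CG" ]; then
    shield_gpfs
fi
```

A batch system would then be pointed at the "user" cpuset (LSF and PBS both grew cgroup integration for exactly this), rather than moving job processes by hand.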
Re: [gpfsug-discuss] AFM over NFS vs GPFS
Hi Luke,

Assuming the network between your clusters is reliable, using GPFS with SW mode (also assuming you aren't ever modifying the data on the home cluster) should work well for you, I think. New files can continue to be created in the cache even in the unmounted state.

Dean
IBM Almaden Research Center

From: Luke Raimbach
To: gpfsug main discussion list
Date: 03/01/2016 04:44 AM
Subject: [gpfsug-discuss] AFM over NFS vs GPFS
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Hi All,

We have two clusters and are using AFM between them to compartmentalise performance. We have the opportunity to run AFM over the GPFS protocol (over IB verbs), which I would imagine gives much greater performance than trying to push it over NFS over Ethernet.

We will have a whole raft of instrument ingest filesets in one storage cluster which are single-writer caches of the final destination in the analytics cluster. My slight concern with running this relationship over native GPFS is that if the analytics cluster goes offline (e.g. for maintenance, etc.), there is an entry in the manual which says:

"In the case of caches based on native GPFS™ protocol, unavailability of the home file system on the cache cluster puts the caches into unmounted state. These caches never enter the disconnected state. For AFM filesets that use GPFS protocol to connect to the home cluster, if the remote mount becomes unresponsive due to issues at the home cluster not related to disconnection (such as a deadlock), operations that require remote mount access such as revalidation or reading un-cached contents also hang until remote mount becomes available again. One way to continue accessing all cached contents without disruption is to temporarily disable all the revalidation intervals until the home mount is accessible again."

What I'm unsure of is whether this applies to single-writer caches, as they (presumably) never do revalidation. We don't want instrument data capture to be interrupted on our ingest storage cluster if the analytics cluster goes away.

Is anyone able to clear this up, please?

Cheers,
Luke.

Luke Raimbach
Senior HPC Data and Storage Systems Engineer, The Francis Crick Institute,
Gibbs Building, 215 Euston Road, London NW1 2BE.

E: luke.raimb...@crick.ac.uk
W: www.crick.ac.uk

The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE.

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
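For what it's worth, the "temporarily disable all the revalidation intervals" escape hatch in the quoted manual text can be scripted along these lines. This is a hedged sketch only: the filesystem name "fs0" and fileset name "ingest" are placeholders, and the four afm*RefreshInterval parameter names are taken from the AFM documentation of this era, so verify them (and the "disable" value) against your release before feeding the printed commands to a shell on a live cluster.

```shell
#!/bin/sh
# Print (rather than run) the mmchfileset commands that would disable AFM
# revalidation for one cached fileset while the home cluster is unreachable.
FS=fs0          # placeholder filesystem name
FILESET=ingest  # placeholder AFM cache fileset

for param in afmFileLookupRefreshInterval afmFileOpenRefreshInterval \
             afmDirLookupRefreshInterval  afmDirOpenRefreshInterval; do
    echo "mmchfileset $FS $FILESET -p ${param}=disable"
done
# Review the output, then rerun the loop without 'echo' (or pipe it to sh),
# and restore the original interval values once home is reachable again.
```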
Re: [gpfsug-discuss] AFM over NFS vs GPFS
Anybody know the answer?

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
Re: [gpfsug-discuss] IBM-Sandisk Announcement
It's direct SAS attached.

--
Sven Oehme
Scalable Storage Research
email: oeh...@us.ibm.com
Phone: +1 (408) 824-8904
IBM Almaden Research Lab
--

From: "Simon Thompson (Research Computing - IT Services)"
To: gpfsug main discussion list
Date: 03/02/2016 08:27 AM
Subject: Re: [gpfsug-discuss] IBM-Sandisk Announcement
Sent by: gpfsug-discuss-boun...@spectrumscale.org

There's a bit more at:
http://www.theregister.co.uk/2016/03/02/ibm_adds_sandisk_flash_colour_to_its_storage_spectrum/

When I looked at InfiniFlash briefly, it appeared to be IP-presented, so I guess there is something like a Linux-based system in the "controller", and I guess they have installed GPFS in there as part of the appliance. It doesn't appear to be available as block storage/FC attached, from what I could see.

Simon

From: gpfsug-discuss-boun...@spectrumscale.org [gpfsug-discuss-boun...@spectrumscale.org] on behalf of Oesterlin, Robert [robert.oester...@nuance.com]
Sent: 02 March 2016 16:22
To: gpfsug main discussion list
Subject: [gpfsug-discuss] IBM-Sandisk Announcement

Anyone from the IBM side that can comment on this in more detail? (OK if you email me directly.) The article is thin on exactly what's being announced.

"SanDisk Corporation, a global leader in flash storage solutions, and IBM today announced a collaboration to bring out a unique class of next-generation, software-defined, all-flash storage solutions for the data center. At the core of this collaboration are SanDisk's InfiniFlash System—a high-capacity and extreme-performance flash-based software defined storage system featuring IBM Spectrum Scale filesystem from IBM."

https://www.sandisk.com/about/media-center/press-releases/2016/sandisk-and-ibm-collaborate-to-deliver-software-defined-all-flash-storage-solutions

Bob Oesterlin
Sr Storage Engineer, Nuance HPC Grid

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
[gpfsug-discuss] GPFS vs Spectrum Scale
I had a slightly strange discussion with IBM this morning... We typically buy OEM GPFS without tin. The discussion went along the lines that Spectrum Scale is somehow different from GPFS bought via the OEM route. Is this just a marketing thing? A red herring? Or is there something more to this?

Thanks
Simon

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss