Do you have a repeatable test that runs on just a small number of nodes? We were working (still are) on some mmap enhancements last year, but didn't get all of them ready for the DCUT of Scale 5.0, so we integrated only the subset that was ready into Scale 5.0 and left the enhancement turned off by default (you can toggle it on/off via mmchconfig). If you are interested in testing this, send me a direct mail and I will provide instructions on how to turn it on. It helps mmap read workloads significantly (think factors) in testing with synthetic benchmarks in my lab, but I would be very interested in real application results, and also in getting a trace with the new code to see where we need to make further improvements.
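In rough terms, a test run with the toggle and a trace would look something like the lines below. The parameter name is only a placeholder here (the real name comes with the instructions mentioned above), and the exact trace settings worth collecting would be agreed on beforehand.

  # placeholder name only -- the real parameter is provided by direct mail
  mmchconfig <mmapEnhancementParameter>=yes -i

  # wrap the repeatable test in a trace so the new code path can be analysed
  mmtracectl --start
  <run the repeatable mmap read test>
  mmtracectl --stop

The setting can be turned off again the same way once the test is done.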
sven

On Mon, Jan 15, 2018 at 4:26 AM Ray Coetzee <[email protected]> wrote:

> Hi Sven
> Yes, it's primarily reads.
>
> Kind regards
>
> Ray Coetzee
> Mob: +44 759 704 7060
> Skype: ray.coetzee
> Email: [email protected]
>
> On Fri, Jan 12, 2018 at 8:57 PM, Sven Oehme <[email protected]> wrote:
>
>> Is this primarily read or write?
>>
>> On Fri, Jan 12, 2018, 12:51 PM Ray Coetzee <[email protected]> wrote:
>>
>>> Hey Sven, the latest client I've tested with is 4.2.3-6 on RHEL 7.2
>>> (without the Meltdown patch).
>>>
>>> Hey Bryan, I remember that quote from Yuri; that's why I hoped some
>>> "magic" may have been done since then.
>>>
>>> Other attempts to improve performance I've tried include:
>>>
>>> - Using LROC to have a larger chance of a cache hit (unfortunately
>>>   the entire dataset is multiple TB).
>>> - Building an NVMe-based scratch filesystem (18x 1.8TB NVMe) just for
>>>   this purpose (job runs halved in duration, but nowhere near what NFS
>>>   can give).
>>> - Making changes to prefetchPct, prefetchAggressiveness, disableDIO,
>>>   and some others, with little improvement.
>>>
>>> For those interested, as a performance comparison: the same job takes
>>> 1m30s when run on an aging Isilon, while GPFS takes ~38min on the all-NVMe
>>> scratch filesystem and over 60min on the spindle-based filesystem.
>>>
>>> Kind regards
>>>
>>> Ray Coetzee
>>> Email: [email protected]
>>>
>>> On Fri, Jan 12, 2018 at 4:12 PM, Bryan Banister <[email protected]> wrote:
>>>
>>>> You could put all of your data onto SSDs in a RAID1 configuration so
>>>> that you don't have insane read-modify-write penalties on writes (RAID1)
>>>> and avoid the horrible seek thrashing that spinning rust requires (SSDs
>>>> being a random-access medium) for your 4K I/O operations.
>>>>
>>>> One of my favorite Yuri quotes: "The mmap code is like asbestos... best
>>>> not to touch it". He gave many reasons why mmap operations on a
>>>> distributed file system are incredibly hard and not recommended.
>>>>
>>>> -Bryan
>>>>
>>>> From: [email protected] [mailto:[email protected]] On Behalf Of Sven Oehme
>>>> Sent: Friday, January 12, 2018 8:45 AM
>>>> To: [email protected]; gpfsug main discussion list <[email protected]>
>>>> Subject: Re: [gpfsug-discuss] mmap performance against Spectrum Scale
>>>>
>>>> What version of Scale are you using right now?
>>>>
>>>> On Fri, Jan 12, 2018 at 2:29 AM Ray Coetzee <[email protected]> wrote:
>>>>
>>>> I'd like to ask the group about their experiences in improving the
>>>> performance of applications that use mmap calls against files on
>>>> Spectrum Scale.
>>>>
>>>> Besides using an NFS export from CES instead of a native GPFS mount, or
>>>> precaching the dataset into the pagepool, what other approaches are there
>>>> to offset the performance hit of the 4K IO size?
>>>>
>>>> Kind regards
>>>>
>>>> Ray Coetzee
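To make concrete what the 4K I/O discussion above refers to, here is a minimal, generic sketch in C of the access pattern in question; it is not taken from Ray's application and is nothing Scale-specific. The file is mapped once and then read by touching memory, and every first touch of a page faults in a single page (4 KiB on x86_64), which is the granularity the filesystem ends up servicing unless it prefetches around those faults.

/* Generic mmap read sketch: map a file, then touch one byte per page. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Each first touch of a page triggers a page fault, which the kernel
     * turns into a page-sized read request to the filesystem. */
    long pagesize = sysconf(_SC_PAGESIZE);   /* typically 4096 */
    unsigned long sum = 0;
    for (off_t off = 0; off < st.st_size; off += pagesize)
        sum += (unsigned char)p[off];

    printf("touched %lld bytes, checksum %lu\n", (long long)st.st_size, sum);
    munmap(p, st.st_size);
    close(fd);
    return 0;
}

How quickly the filesystem can service those page-sized faults, and whether it reads ahead around them, is what the tuning attempts and the new mmap code discussed above aim to improve.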
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
