Hi Vasily, Sven,

And is there any advantage in moving the system.log pool to faster storage (like NVDIMM), or in increasing its default size, when HAWC is not used (i.e. write-cache-threshold kept at 0)? (I remember the (very creative) logtip placement on the GSS boxes ;)
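(For context, this is roughly what I'd look at on a test file system -- a sketch only, with a made-up file system name 'fs01'; the exact options and when a new log size takes effect may differ per release, so please correct me if any of this is wrong:)

    # show the current file system attributes, including the internal log
    # size and the HAWC setting (write-cache-threshold)
    mmlsfs fs01

    # enlarge the recovery logs (the value is only an example); check the
    # mmchfs documentation for when the new size takes effect
    mmchfs fs01 -L 128M

    # enable HAWC by raising write-cache-threshold (0 disables it,
    # 64K is the documented maximum)
    mmchfs fs01 --write-cache-threshold 64K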
Thanks a lot for the detailed answer,

Stijn

On 09/04/2018 05:57 PM, Vasily Tarasov wrote:
> Let me add just one more item to Sven's detailed reply: HAWC is especially
> helpful to decrease the latencies of small synchronous I/Os that come in
> *bursts*. If your workload contains a sustained high rate of writes, the
> recovery log will get full very quickly, and HAWC won't help much (or can
> even decrease performance). Making the recovery log larger allows it to
> absorb longer I/O bursts. The specific amount of improvement depends on
> the workload (e.g., how long/high the bursts are) and the hardware.
>
> Best,
> Vasily
> --
> Vasily Tarasov,
> Research Staff Member,
> Storage Systems Research,
> IBM Research - Almaden
>
> ----- Original message -----
> From: Sven Oehme <[email protected]>
> To: gpfsug main discussion list <[email protected]>
> Cc: Vasily Tarasov <[email protected]>
> Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC
> Date: Mon, Sep 3, 2018 8:32 AM
>
> Hi Ken,
>
> What the document is saying (or tries to) is that the behavior of
> data-in-inode or metadata operations is not changed if HAWC is enabled,
> meaning that if the data fits into the inode it will be placed there
> directly instead of writing the data I/O into a recovery log record
> (which is what HAWC uses) and later destaging it to wherever the data
> blocks of a given file will eventually be written. That also means that
> if all your application does is create small files that fit into the
> inode, HAWC will not be able to improve performance.
>
> It's unfortunately not so simple to say if HAWC will help or not, but
> here are a couple of thoughts on where HAWC will not help and where it
> will.
>
> Where it won't help:
> 1. if your storage devices have a very large, or even better a
>    log-structured, write cache
> 2. if the majority of your files are very small
> 3. if your files will almost always be accessed sequentially
> 4. if your storage is primarily flash based
>
> Where it most likely will help:
> 1. the majority of your storage is direct-attached HDD (e.g. FPO) with a
>    small SSD pool for metadata and HAWC
> 2. your ratio of clients to storage devices is very high (think hundreds
>    of clients and only 1 storage array)
> 3. your workload is primarily virtual machines or databases
>
> As always there are lots of exceptions and corner cases, but this is the
> best list I could come up with.
>
> On how to find out if HAWC could help, there are 2 ways of doing this.
> First, look at mmfsadm dump iocounters: you see the average size of I/Os
> and can check whether a lot of small write operations are being done.
> A more involved but more accurate way would be to take a trace with
> trace level trace=io. That will generate a very lightweight trace of
> only the most relevant I/O layers of GPFS, which you could then
> post-process for operation performance. The data is not the simplest to
> understand for somebody with little filesystem background, but if you
> stare at it for a while it might make some sense to you.
>
> Sven
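(A rough sketch of the two checks Sven describes, against a made-up node name; the mmtracectl invocation is my assumption and should be verified against the man page:)

    # 1. quick look: dump the I/O counters on a client and check whether
    #    a lot of small write operations are being issued
    mmfsadm dump iocounters

    # 2. more accurate: take a lightweight I/O-level trace on one node,
    #    run the workload for a while, then stop and post-process it
    mmtracectl --start --trace=io -N client01
    mmtracectl --stop -N client01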
> On Mon, Sep 3, 2018 at 4:06 PM Kenneth Waegeman <[email protected]> wrote:
> >
> > Thank you Vasily and Simon for the clarification!
> >
> > I was looking further into it, and I got stuck with more questions :)
> >
> > - In
> >   https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_tuning.htm
> >   I read:
> >     HAWC does not change the following behaviors:
> >       write behavior of small files when the data is placed in the inode itself
> >       write behavior of directory blocks or other metadata
> >   I wondered why? Is the metadata not logged in the (same) recovery logs?
> >   (Reading
> >   https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_logfile.htm
> >   it seemed it does.)
> >
> > - Would there be a way to estimate how much of the write requests on a
> >   running cluster would benefit from enabling HAWC?
> >
> > Thanks again!
> >
> > Kenneth
> >
> > On 31/08/18 19:49, Vasily Tarasov wrote:
> >> That is correct. The blocks of each recovery log are striped across
> >> the devices in the system.log pool (if it is defined). As a result,
> >> even when all clients have a local device in the system.log pool,
> >> many writes to the recovery log will go to remote devices. For a
> >> client that lacks a local device in the system.log pool, log writes
> >> will always be remote.
> >> Note that typically in such a setup you would enable log replication
> >> for HA. Otherwise, if a single client fails (and its recovery log is
> >> lost), the whole cluster fails, as there is no log to recover the
> >> file system to a consistent state. Therefore, at least one remote
> >> write is essential.
> >>
> >> HTH,
> >> --
> >> Vasily Tarasov,
> >> Research Staff Member,
> >> Storage Systems Research,
> >> IBM Research - Almaden
> >>
> >> ----- Original message -----
> >> From: Kenneth Waegeman <[email protected]>
> >> Sent by: [email protected]
> >> To: gpfsug main discussion list <[email protected]>
> >> Cc:
> >> Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
> >> Date: Tue, Aug 28, 2018 5:31 AM
> >>
> >> Hi all,
> >>
> >> I was looking into HAWC, using the 'distributed fast storage in
> >> client nodes' method (
> >> https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm
> >> ).
> >>
> >> This is achieved by putting a local device on the clients in the
> >> system.log pool. Reading another article
> >> (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm
> >> ), this would now be used for ALL file system recovery logs.
> >>
> >> Does this mean that if you have a (small) subset of clients with fast
> >> local devices added in the system.log pool, all other clients will
> >> use these too instead of the central system pool?
> >>
> >> Thank you!
> >>
> >> Kenneth
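(Relating Vasily's point on log replication to Kenneth's original question, this is roughly how I picture the client-side setup -- the stanza values, file system name, and the --log-replicas option are assumptions taken from the HAWC documentation, not a tested recipe:)

    # hypothetical stanza (hawc.stanza) for a fast local device on a
    # client, assigned to the system.log pool; usage/failureGroup
    # attributes would need to follow the HAWC setup documentation
    %nsd:
      device=/dev/nvme0n1
      nsd=hawc_client01
      servers=client01
      pool=system.log

    # create the NSD and add it to the file system
    mmcrnsd -F hawc.stanza
    mmadddisk fs01 -F hawc.stanza

    # keep two replicas of each recovery log, so losing a single
    # client-local device does not leave the cluster without a log
    mmchfs fs01 --log-replicas 2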
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
