Let me add just one more item to Sven's detailed reply: HAWC is especially helpful for decreasing the latency of small synchronous I/Os that come in *bursts*. If your workload sustains a high rate of writes, the recovery log fills up very quickly, and HAWC won't help much (it can even decrease performance). Making the recovery log larger allows it to absorb longer I/O bursts. The specific improvement depends on the workload (e.g., how long and how intense the bursts are) and on the hardware.
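For anyone who wants to experiment with this, a minimal sketch of checking and growing the log (the file system name and size are illustrative, and mmchfs -L carries release-specific restrictions on when the new size takes effect, so consult the mmlsfs/mmchfs man pages first):

   # show the current internal log file size
   mmlsfs gpfs0 -L

   # grow the log so that longer bursts fit; the new size takes
   # effect per the restrictions documented for your release
   mmchfs gpfs0 -L 128M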
Best,
Vasily
--
Vasily Tarasov,
Research Staff Member,
Storage Systems Research,
IBM Research - Almaden
----- Original message -----
From: Sven Oehme <[email protected]>
To: gpfsug main discussion list <[email protected]>
Cc: Vasily Tarasov <[email protected]>
Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC
Date: Mon, Sep 3, 2018 8:32 AM
Hi Ken,

What the document is saying (or trying to) is that the behavior of data-in-inode and metadata operations is not changed when HAWC is enabled. That means if the data fits into the inode, it will be placed there directly instead of writing the data I/O into a data recovery log record (which is what HAWC uses) and later destaging it to wherever the data blocks of the file will eventually be written. It also means that if all your application does is create small files that fit into the inode, HAWC will not be able to improve performance.

It's unfortunately not so simple to say whether HAWC will help or not, but here are a couple of thoughts on where it will and won't help.

Where it won't help:

1. you have storage devices with a very large, or even better a log-structured, write cache
2. the majority of your files are very small
3. your files are almost always accessed sequentially
4. your storage is primarily flash based

Where it most likely will help:

1. the majority of your storage is direct-attached HDD (e.g. FPO) with a small SSD pool for metadata and HAWC
2. your ratio of clients to storage devices is very high (think hundreds of clients and only 1 storage array)
3. your workload is primarily virtual machines or databases

As always there are lots of exceptions and corner cases, but this is the best list I could come up with.

On how to find out whether HAWC could help, there are two ways of doing this. First, look at mmfsadm dump iocounters: you see the average size of I/Os and can check whether a lot of small write operations are being done. A more involved but more accurate way would be to take a trace with trace level trace=io, which generates a very lightweight trace of only the most relevant I/O layers of GPFS. You can then post-process the operation timings; the data is not the simplest to understand for somebody with little filesystem background, but if you stare at it for a while it might start to make sense.

Sven

On Mon, Sep 3, 2018 at 4:06 PM Kenneth Waegeman <[email protected]> wrote:

Thank you Vasily and Simon for the clarification!
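(As an aside, a minimal sketch of the two checks Sven describes above; mmfsadm is an unsupported diagnostic command, and the mmtracectl invocations here should be verified against the man page for your release:

   # quick check: look for a high rate of small writes
   mmfsadm dump iocounters

   # more accurate: lightweight trace of the GPFS I/O layers
   mmtracectl --set --trace=io
   mmtracectl --start
   # ...run the workload of interest...
   mmtracectl --stop

)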
I was looking further into it, and I got stuck with more questions :)
- In https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_tuning.htm I read:

HAWC does not change the following behaviors:
- write behavior of small files when the data is placed in the inode itself
- write behavior of directory blocks or other metadata

I wondered why: is the metadata not logged in the (same) recovery logs? (Reading https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_logfile.htm it seemed that it is.)
- Would there be a way to estimate how many of the write requests on a running cluster would benefit from enabling HAWC?
Thanks again!
Kenneth

On 31/08/18 19:49, Vasily Tarasov wrote:

That is correct. The blocks of each recovery log are striped across the devices in the system.log pool (if it is defined). As a result, even when all clients have a local device in the system.log pool, many writes to the recovery log will go to remote devices. For a client that lacks a local device in the system.log pool, log writes will always be remote.

Notice that typically in such a setup you would enable log replication for HA. Otherwise, if a single client fails (and its recovery log is lost), the whole cluster fails, as there is no log to recover the FS to a consistent state. Therefore, at least one remote write is essential.

HTH,
--
Vasily Tarasov,
Research Staff Member,
Storage Systems Research,
IBM Research - Almaden

----- Original message -----
From: Kenneth Waegeman <[email protected]>
Sent by: [email protected]
To: gpfsug main discussion list <[email protected]>
Cc:
Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
Date: Tue, Aug 28, 2018 5:31 AM
Hi all,
I was looking into HAWC, using the 'distributed fast storage in client
nodes' method
(https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm).
This is achieved by putting a local device on the clients in the
system.log pool. Reading another article
(https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm),
this would now be used for ALL file system recovery logs.
Does this mean that if you have a (small) subset of clients with fast
local devices added in the system.log pool, all other clients will use
these too instead of the central system pool?
Thank you!
Kenneth
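For reference, a minimal sketch of the setup being discussed, covering both the system.log pool placement Kenneth asks about and the log replication Vasily recommends above (NSD names, device paths, and values are illustrative; the stanza fields and the mmchfs options should be checked against the man pages for your release):

   # clients.stanza: one local fast device per client, placed in the
   # system.log pool (failure groups etc. follow your own layout)
   %nsd: device=/dev/nvme0n1 nsd=hawc_client01 servers=client01 pool=system.log
   %nsd: device=/dev/nvme0n1 nsd=hawc_client02 servers=client02 pool=system.log

   mmadddisk gpfs0 -F clients.stanza

   # replicate the recovery logs so losing one client's device is survivable
   mmchfs gpfs0 --log-replicas 2

   # enable HAWC by setting a nonzero write cache threshold
   mmchfs gpfs0 --write-cache-threshold 64K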
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
