hi vasily, sven,

and is there any advantage in moving the system.log pool to faster
storage (like NVDIMM), or in increasing its default size, when HAWC is
not used (i.e. write-cache-threshold kept at 0)? (i remember the (very
creative) logtip placement on the GSS boxes ;)
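
(for context, the threshold i mean is the per-filesystem HAWC setting; a
minimal sketch, with fs0 as a made-up filesystem name:

    mmchfs fs0 --write-cache-threshold 0     # 0 disables HAWC (the default)
    mmchfs fs0 --write-cache-threshold 64K   # writes <= 64K go through the log
)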

thanks a lot for the detailed answer

stijn

On 09/04/2018 05:57 PM, Vasily Tarasov wrote:
> Let me add just one more item to Sven's detailed reply: HAWC is especially
> helpful for decreasing the latency of small synchronous I/Os that come in
> *bursts*. If your workload contains a sustained high rate of writes, the
> recovery log will fill up very quickly, and HAWC won't help much (it can
> even decrease performance). Making the recovery log larger allows it to
> absorb longer I/O bursts. The specific amount of improvement depends on the
> workload (e.g., how long and how intense the bursts are) and on the hardware.
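> A minimal sketch of that tuning, with fs0 as a placeholder file system name
> and a size chosen only for illustration:
> 
>     mmlsfs fs0 -L       # show the current recovery log size
>     mmchfs fs0 -L 128M  # enlarge it; the new size applies when the logs
>                         # are re-created (e.g., after a remount)
> 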
> Best,
> Vasily
> --
> Vasily Tarasov,
> Research Staff Member,
> Storage Systems Research,
> IBM Research - Almaden
> 
>     ----- Original message -----
>     From: Sven Oehme <[email protected]>
>     To: gpfsug main discussion list <[email protected]>
>     Cc: Vasily Tarasov <[email protected]>
>     Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC
>     Date: Mon, Sep 3, 2018 8:32 AM
>     Hi Ken,
>     what the document is saying (or trying to) is that the behavior of
>     data-in-inode or metadata operations is not changed if HAWC is enabled,
>     meaning that if the data fits into the inode it will be placed there
>     directly, instead of the data i/o being written into a data recovery log
>     record (which is what HAWC uses) and later destaged to wherever the data
>     blocks of a given file will eventually be written. that also means that
>     if all your application does is create small files that fit into the
>     inode, HAWC will not be able to improve performance.
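>     a quick way to check whether that applies to you (fs0 is a placeholder
>     for your file system name):
>
>         mmlsfs fs0 -i    # reports the inode size in bytes
>
>     with a 4K inode, files up to roughly 3.8K of data fit in the inode itself
>     and therefore bypass HAWC entirely.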
>     it's unfortunately not so simple to say whether HAWC will help or not,
>     but here are a couple of thoughts on where HAWC will and won't help:
>     where it won't help:
>     1. your storage devices have a very large, or even better a
>     log-structured, write cache
>     2. the majority of your files are very small
>     3. your files will almost always be accessed sequentially
>     4. your storage is primarily flash based
>     where it most likely will help:
>     1. the majority of your storage is direct-attached HDD (e.g. FPO) with a
>     small SSD pool for metadata and HAWC
>     2. your ratio of clients to storage devices is very high (think hundreds
>     of clients and only 1 storage array)
>     3. your workload is primarily virtual machines or databases
>     as always there are lots of exceptions and corner cases, but this is the
>     best list i could come up with.
>     on how to find out if HAWC could help, there are two ways of doing this.
>     first, look at mmfsadm dump iocounters: you see the average size of i/os,
>     and you can check whether a lot of small write operations are being done.
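>     for example (the exact counter names vary by release, so treat the grep
>     pattern as a guess):
>
>         mmfsadm dump iocounters | grep -i write
>
>     what you are looking for is the average write size and whether small
>     synchronous writes dominate.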
>     a more involved but more accurate way would be to take a trace with trace
>     level trace=io. that will generate a very lightweight trace of only the
>     most relevant i/o layers of GPFS; you can then post-process the operation
>     performance. the data is not the simplest to understand for somebody with
>     little knowledge of filesystems, but if you stare at it for a while it
>     might make some sense to you.
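>     a minimal sketch of such a trace run (the node name and file size are
>     only examples):
>
>         mmtracectl --start --trace=io --trace-file-size=256M -N client01
>         # ... run the workload for a representative interval ...
>         mmtracectl --stop -N client01
>
>     the formatted trace then shows the individual i/os with their sizes and
>     latencies, from which you can judge how much would land in the HAWC log.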
>     Sven
>     On Mon, Sep 3, 2018 at 4:06 PM Kenneth Waegeman <[email protected]> wrote:
> 
>         Thank you Vasily and Simon for the clarification!
> 
>         I was looking further into it, and I got stuck with more questions :)
> 
> 
>         - In
>         https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_tuning.htm
>         I read:
>             HAWC does not change the following behaviors:
>                 write behavior of small files when the data is placed in
>                 the inode itself
>                 write behavior of directory blocks or other metadata
> 
>         I wondered why. Is the metadata not logged in the (same) recovery
>         logs? (Reading
>         https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_logfile.htm
>         it seems it does.)
> 
> 
>         - Would there be a way to estimate how many of the write requests on
>         a running cluster would benefit from enabling HAWC?
> 
> 
>         Thanks again!
> 
> 
>         Kenneth
>         On 31/08/18 19:49, Vasily Tarasov wrote:
>>         That is correct. The blocks of each recovery log are striped across
>>         the devices in the system.log pool (if it is defined). As a result,
>>         even when all clients have a local device in the system.log pool,
>>         many writes to the recovery log will go to remote devices. For a
>>         client that lacks a local device in the system.log pool, log writes
>>         will always be remote.
>>         Notice that typically in such a setup you would enable log
>>         replication for HA. Otherwise, if a single client fails (and its
>>         recovery log is lost), the whole cluster fails, as there is no log
>>         to recover the FS to a consistent state. Therefore, at least one
>>         remote write is essential.
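>>         (A sketch of how that would be set, with fs0 as a placeholder file
>>         system name; I believe the option is accepted by mmcrfs and mmchfs:
>>
>>             mmchfs fs0 --log-replicas 2   # keep two copies of each log
>>
>>         so losing any single device in the system.log pool still leaves a
>>         usable copy.)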
>>         HTH,
>>         --
>>         Vasily Tarasov,
>>         Research Staff Member,
>>         Storage Systems Research,
>>         IBM Research - Almaden
>>
>>             ----- Original message -----
>>             From: Kenneth Waegeman <[email protected]>
>>             Sent by: [email protected]
>>             To: gpfsug main discussion list <[email protected]>
>>             Cc:
>>             Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
>>             Date: Tue, Aug 28, 2018 5:31 AM
>>             Hi all,
>>
>>             I was looking into HAWC, using the 'distributed fast storage in
>>             client nodes' method (
>>             https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm
>>             )
>>
>>             This is achieved by putting a local device on the clients in the
>>             system.log pool. Reading another article
>>             (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm
>>             ), this pool would now be used for ALL file system recovery logs.
>>
>>             Does this mean that if you have a (small) subset of clients with
>>             fast local devices added to the system.log pool, all other
>>             clients will use these too, instead of the central system pool?
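>>
>>             (For context, a minimal sketch of how a client device ends up in
>>             that pool; the device, NSD, and node names are made up, and
>>             further stanza fields may be needed:
>>
>>                 %nsd: device=/dev/nvme0n1 nsd=client01_log servers=client01 pool=system.log
>>
>>             fed to mmcrnsd and then mmadddisk.)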
>>
>>             Thank you!
>>
>>             Kenneth
>>
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
