I'm going to add a note of caution about HAWC as well... Firstly, this was based on when it was first released, so things might have changed...
HAWC replication uses the same failure group policy for placing replicas, so you need to use different failure groups for different client nodes. But do this carefully, thinking about your failure domains. For example, we initially gave each node in the cluster its own failure group, which might seem like a good idea until you shut the rack down (or even just a few select nodes might do it). You then lose your whole storage cluster by accident. (Or maybe you have HPC nodes with no UPS protection: if they hold HAWC data and there is no protected replica, you lose the file system.) Maybe this is obvious to everyone, but it bit us in various ways in our early testing. So if you plan to implement it, do test how your storage reacts when a client node fails.

Simon

________________________________________
From: [email protected] [[email protected]] on behalf of [email protected] [[email protected]]
Sent: 31 August 2018 18:49
To: [email protected]
Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC

That is correct. The blocks of each recovery log are striped across the devices in the system.log pool (if it is defined). As a result, even when all clients have a local device in the system.log pool, many writes to the recovery log will go to remote devices. For a client that lacks a local device in the system.log pool, log writes will always be remote.

Notice that typically in such a setup you would enable log replication for HA. Otherwise, if a single client fails (and its recovery log is lost), the whole cluster fails, as there is no log to recover the file system to a consistent state. Therefore, at least one remote write is essential.
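To illustrate Simon's failure-group caveat above: the layout can be expressed in standard NSD stanzas (as used by mmcrnsd/mmadddisk). This is only a sketch — the node names, device paths, NSD names, and the `usage` value are assumptions, not from the thread; check the stanza reference for your release. The point is that failure groups should follow real failure domains (here, racks), not individual nodes, so a replica always survives the loss of one whole rack:

```
# Hypothetical NSD stanzas: clients in rack A share failureGroup 10,
# clients in rack B share failureGroup 20. With log replication enabled,
# replicas land in different failure groups, so shutting down one rack
# still leaves a surviving copy of each recovery log.
%nsd: device=/dev/nvme0n1
  nsd=hawc_client01
  servers=client01
  usage=metadataOnly
  failureGroup=10
  pool=system.log

%nsd: device=/dev/nvme0n1
  nsd=hawc_client02
  servers=client02
  usage=metadataOnly
  failureGroup=20
  pool=system.log
```

Had we set failureGroup=1 and failureGroup=2 per node instead of per rack, both replicas of a log could sit in the same rack — exactly the accident described above.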
HTH,
--
Vasily Tarasov,
Research Staff Member,
Storage Systems Research,
IBM Research - Almaden

----- Original message -----
From: Kenneth Waegeman <[email protected]>
Sent by: [email protected]
To: gpfsug main discussion list <[email protected]>
Cc:
Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
Date: Tue, Aug 28, 2018 5:31 AM

Hi all,

I was looking into HAWC, using the 'distributed fast storage in client nodes' method (https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm). This is achieved by putting a local device on the clients in the system.log pool.

Reading another article (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm), this would now be used for ALL file system recovery logs. Does this mean that if you have a (small) subset of clients with fast local devices added in the system.log pool, all other clients will use these too, instead of the central system pool?

Thank you!

Kenneth

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
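For reference, the knobs discussed in this thread map to mmchfs options in the Spectrum Scale 5.x documentation. A hedged sketch — 'gpfs0' is a hypothetical file system name, and flag spellings should be verified against your release's command reference:

```
# Enable HAWC: writes up to the threshold are hardened in the
# recovery log first, then destaged to the data pool.
mmchfs gpfs0 --write-cache-threshold 64K

# Replicate recovery logs (applies when a system.log pool is defined),
# addressing Vasily's point: losing one client's local device must not
# lose the only copy of its recovery log.
mmchfs gpfs0 --log-replicas 2

# Inspect the file system attributes to confirm the settings.
mmlsfs gpfs0
```

Setting --log-replicas to at least 2 is what makes the "at least one remote write" in the reply above possible: one replica may be local, but another always lands on a different failure group.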
