I'm going to add a note of caution about HAWC as well... Firstly, this was based on when it was first released, so things might have changed...
HAWC replication uses the same failure group policy for placing replicas, so you need to use different failure groups for different client nodes. But do this carefully, thinking about your failure domains. For example, we initially gave each node in the cluster its own failure group, which might seem like a good idea until you shut the rack down (or even just a few select nodes might do it). You then lose your whole storage cluster by accident. (Or maybe you have HPC nodes with no UPS protection: if they hold HAWC data and there is no protected replica, you lose the file system.) Maybe this is obvious to everyone, but it bit us in various ways in our early testing. So if you plan to implement it, do test how your storage reacts when a client node fails.

Simon

________________________________________
From: [email protected] [[email protected]] on behalf of [email protected] [[email protected]]
Sent: 31 August 2018 18:49
To: [email protected]
Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC

That is correct. The blocks of each recovery log are striped across the devices in the system.log pool (if it is defined). As a result, even when all clients have a local device in the system.log pool, many writes to the recovery log will go to remote devices. For a client that lacks a local device in the system.log pool, log writes will always be remote.

Notice that typically in such a setup you would enable log replication for HA. Otherwise, if a single client fails (and its recovery log is lost), the whole cluster fails, as there is no log to recover the file system to a consistent state. Therefore, at least one remote write is essential.
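To illustrate Simon's failure-group caveat above: the layout can be expressed in standard NSD stanzas (as used by mmcrnsd/mmadddisk). This is only a sketch — the node names, device paths, NSD names, and the `usage` value are assumptions, not from the thread; check the stanza reference for your release. The point is that failure groups should follow real failure domains (here, racks), not individual nodes, so a replica always survives the loss of one whole rack:

```
# Hypothetical NSD stanzas: clients in rack A share failureGroup 10,
# clients in rack B share failureGroup 20. With log replication enabled,
# replicas land in different failure groups, so shutting down one rack
# still leaves a surviving copy of each recovery log.
%nsd: device=/dev/nvme0n1
  nsd=hawc_client01
  servers=client01
  usage=metadataOnly
  failureGroup=10
  pool=system.log

%nsd: device=/dev/nvme0n1
  nsd=hawc_client02
  servers=client02
  usage=metadataOnly
  failureGroup=20
  pool=system.log
```

Had we set failureGroup=1 and failureGroup=2 per node instead of per rack, both replicas of a log could sit in the same rack — exactly the accident described above.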
HTH,
--
Vasily Tarasov,
Research Staff Member,
Storage Systems Research,
IBM Research - Almaden

----- Original message -----
From: Kenneth Waegeman <[email protected]>
Sent by: [email protected]
To: gpfsug main discussion list <[email protected]>
Cc:
Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
Date: Tue, Aug 28, 2018 5:31 AM

Hi all,

I was looking into HAWC, using the 'distributed fast storage in client nodes' method (https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm). This is achieved by putting a local device on the clients in the system.log pool.

Reading another article (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm), this would now be used for ALL file system recovery logs. Does this mean that if you have a (small) subset of clients with fast local devices added in the system.log pool, all other clients will use these too, instead of the central system pool?

Thank you!

Kenneth

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
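For reference, the knobs discussed in this thread map to mmchfs options in the Spectrum Scale 5.x documentation. A hedged sketch — 'gpfs0' is a hypothetical file system name, and flag spellings should be verified against your release's command reference:

```
# Enable HAWC: writes up to the threshold are hardened in the
# recovery log first, then destaged to the data pool.
mmchfs gpfs0 --write-cache-threshold 64K

# Replicate recovery logs (applies when a system.log pool is defined),
# addressing Vasily's point: losing one client's local device must not
# lose the only copy of its recovery log.
mmchfs gpfs0 --log-replicas 2

# Inspect the file system attributes to confirm the settings.
mmlsfs gpfs0
```

Setting --log-replicas to at least 2 is what makes the "at least one remote write" in the reply above possible: one replica may be local, but another always lands on a different failure group.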
