Re: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread Jaap Jan Ouwehand
Hi, I also used a custom script (database driven) via cron which creates many fileset snapshots during the day via the "default helper nodes". Because of the iops, the oldest snapshots are deleted at night. Perhaps it's a good idea to take one global filesystem snapshot and make it available

Re: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread mark . bergman
Big vote for cron jobs. Our snapshots are created by a script, installed on each GPFS node. The script handles naming, removing old snapshots, checking that sufficient disk space exists before creating a snapshot, etc. We do snapshots every 15 minutes, keeping them with lower frequency over

Re: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread Hannappel, Juergen
Hi, I use a Python script via a cron job; it checks how many snapshots exist and removes those that exceed a configurable limit, then creates a new one. Deployed via Puppet, it's much less hassle than clicking around in a GUI.
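The rotation logic described in this message can be sketched as below. This is a minimal illustration, not the poster's script: the function name `plan_rotation` is hypothetical, snapshot names are assumed to sort chronologically, and the limit is assumed to count the new snapshot as well. The sketch only decides what to delete and create; it does not call any GPFS command.

```python
from typing import List, Tuple


def plan_rotation(existing: List[str], limit: int, new_name: str) -> Tuple[List[str], str]:
    """Return (snapshots_to_delete, snapshot_to_create).

    Assumption: snapshot names sort chronologically (e.g. timestamped names),
    and at most `limit` snapshots should exist once the new one is created.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    ordered = sorted(existing)  # oldest first under the naming assumption
    # Keep the newest (limit - 1) existing snapshots to leave room for the new one.
    excess = max(0, len(ordered) - (limit - 1))
    return ordered[:excess], new_name


if __name__ == "__main__":
    to_delete, to_create = plan_rotation(["s1", "s2", "s3"], limit=3, new_name="s4")
    print("delete:", to_delete, "create:", to_create)
```

Keeping the decision separate from the commands that act on it makes the retention rule easy to test without touching a file system.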

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Jordi Caubet Serrabou
Maybe some colleagues at IBM devel can correct me, but pagepool size should not make much difference. Afaik, it is mostly read cache data. Another thing could be use of the HAWC function; I am not sure in that case. Anyhow, looking at your node name, your system seems to be a DSS from Lenovo, so you

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
That's true, although I would not expect the memory to be flushed just for snapshot deletion. But it could well be a problem at snapshot creation time. Anyway, for changing the pagepool we should contact the vendor, since this is configured by their installation scripts, so we had better have them

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
OK, that sounds like a good candidate for an improvement. Thanks. Indeed, we didn't want to do a full filesystem snapshot because of the space consumption. But we may consider it, keeping an eye on the space. Cheers, Ivano

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
Sure, that makes a lot of sense, and we were already doing it that way. Cheers, Ivano

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
Hi Jordi, thanks for the explanation, I can now see better why something like that would happen. Indeed the cluster has a lot of clients, coming via different clusters and even some NFS/SMB via protocol nodes. So I think opening a case makes a lot of sense to track it down. Not sure how we

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Alec
Might it be a case of being overbuilt? In the old days you could really mess up an Oracle DW by giving it too much RAM... It would spend all day reading data in and out of RAM that it didn't really need, because it had the SGA available to load the whole table. Perhaps the pagepool is so

Re: [gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread Kidger, Daniel
Simon, Thanks - that is a good insight. The HA 'feature' of the snapshot automation is perhaps a key feature, as Linux still lacks a decent 'cluster cron'. Also, if it is "HA", do we know where the state is centrally kept? On the point of snapshots being left undeleted, do you ever use

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Olaf Weiser
Keep in mind: creating many snapshots means ;-) you'll have to delete many snapshots. At a certain level, which depends on #files, #directories, ~workload, #nodes, #networks, etc., we've seen cases where generating just full snapshots (whole file system) is the better approach instead

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Jan-Frode Myklebust
Also, if snapshotting multiple filesets, it's important to group these into a single mmcrsnapshot command. Then you get a single quiesce, instead of one per fileset. i.e. do: snapname=$(date --utc +@GMT-%Y.%m.%d-%H.%M.%S) mmcrsnapshot gpfs0
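The grouping advice above can be sketched as a small helper that builds one `mmcrsnapshot` invocation for several filesets. This is a sketch under stated assumptions, not the command from the truncated message: it assumes the comma-separated `Fileset:Snapshot` argument form of `mmcrsnapshot` (check the man page for your Scale release), the fileset names `home` and `projects` are hypothetical, and the helper `build_mmcrsnapshot_cmd` is an illustrative name. It only builds the argument list and does not run anything.

```python
from datetime import datetime, timezone
from typing import List, Optional


def build_mmcrsnapshot_cmd(device: str, filesets: List[str],
                           now: Optional[datetime] = None) -> List[str]:
    """Build a single mmcrsnapshot command covering all given filesets.

    Assumption: mmcrsnapshot accepts a comma-separated list of
    Fileset:Snapshot pairs, so one call (and one quiesce) covers them all.
    """
    if not filesets:
        raise ValueError("at least one fileset is required")
    now = now or datetime.now(timezone.utc)
    # Same naming pattern as the date format quoted in the message.
    snapname = now.strftime("@GMT-%Y.%m.%d-%H.%M.%S")
    spec = ",".join(f"{fileset}:{snapname}" for fileset in filesets)
    return ["mmcrsnapshot", device, spec]


if __name__ == "__main__":
    # Hypothetical fileset names, for illustration only.
    print(" ".join(build_mmcrsnapshot_cmd("gpfs0", ["home", "projects"])))
```

Returning an argument list rather than a shell string means the result can be handed to `subprocess.run` without quoting concerns.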

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Jordi Caubet Serrabou
Ivano, if it happens frequently, I would recommend opening a support case. The creation or deletion of a snapshot requires a quiesce of the nodes to obtain a consistent point-in-time image of the file system and/or update some internal structures afaik. Quiesce is required for nodes at the

Re: [gpfsug-discuss] [External] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread Simon Thompson2
I always used the GUI for automating snapshots that were tagged with the YYMMDD format so that they were accessible via the previous versions tab from CES access. This requires no locking if you have multiple GUI servers running, so in theory the snapshot creation is "HA". BUT if you shut down

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
Hello Andrew, Thanks for your questions. We're not experiencing any other issue/slowness during normal activity. The storage is a Lenovo DSS appliance with a dedicated SSD enclosure/pool for metadata only. The two NSD servers have 750GB of RAM and 618 are configured as pagepool. The

[gpfsug-discuss] Automating Snapshots : cron jobs or use the GUI ?

2022-02-02 Thread Kidger, Daniel
Hi all, Since the subject of snapshots has come up, I also have a question ... Snapshots can be created from the command line with mmcrsnapshot, and hence can be automated via cron jobs etc. Snapshots can also be created from the Scale GUI. The GUI also provides its own automation for the

Re: [gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Andrew Beattie
Ivano, How big is the filesystem in terms of number of files? How big is the filesystem in terms of capacity? Is the metadata on flash or spinning disk? Do you see issues when users do an ls of the filesystem, or only when you are doing snapshots? How much memory do the NSD servers have? How

[gpfsug-discuss] snapshots causing filesystem quiesce

2022-02-02 Thread Talamo Ivano Giuseppe (PSI)
Dear all, For a while now we have been experiencing an issue when dealing with snapshots. Basically, what happens is that when deleting a fileset snapshot (and maybe also when creating new ones) the filesystem becomes inaccessible on the clients for the duration of the operation (can take a few