Quoting Yaron Daniel <[email protected]>:

Hi

Did you use mmbackup with TSM?

https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_mmbackup.htm

I have used mmbackup in test mode a few times before, under GPFS 3.2 and 3.3, but not yet under 3.5 or the 4.x series (not installed in our facility yet).

Under both 3.2 and 3.3, mmbackup would always lock up our cluster when using snapshots. The lock-up was intermittent on the small test cluster we carved out, and I never understood the behavior without snapshots either, so I never felt confident enough to deploy it over the larger 4,000+ client cluster.

Another issue was that the version of mmbackup at the time would not let me choose the client environment associated with a particular GPFS file system, fileset, or path, nor the equivalent storage pool and/or policy on the TSM side.

With the native TSM client we can do this by configuring the dsmenv file, and even the NODENAME/ASNODENAME, etc., with which to access TSM, so we can keep the backups segregated on different pools/tapes if necessary (by user, by group, by project, etc.).
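
For reference, this is the kind of dsm.sys server stanza I mean (the server name, node names, and address below are just placeholders):

   * hypothetical stanza for one project's backups
   SERVERNAME        TSM_PROJECTS
      COMMMETHOD        TCPIP
      TCPSERVERADDRESS  tsm.example.org
      TCPPORT           1500
      NODENAME          gpfs-projects
      ASNODENAME        projects-proxy
      PASSWORDACCESS    GENERATE

Each project then gets its own dsmenv pointing DSM_CONFIG at a dsm.opt that selects the matching stanza, and on the TSM server that node can be bound to its own storage pool and policy domain.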

The problem we all agree on is that the TSM client traversal is VERY SLOW and cannot be parallelized. I always understood that mmbackup was supposed to replace the TSM client for the traversal, then pass the "necessary parameters" and file lists to the native TSM client, so it could take over the remainder of the workflow.
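
(For what it's worth, mmbackup does have a -N option to spread the scan and the dsmc work across several nodes, along the lines of the hypothetical invocation below; the node names are placeholders, and I have not tried this at scale:

   mmbackup gpfs_projects -t incremental -N backup01,backup02 --tsm-servers TSM_PROJECTS
)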

Therefore, the remaining problems are as follows:
* I never understood the snapshot-induced lock-up, or how to fix it. Was it due to the size of our cluster or the version of GPFS? Has it been addressed in the 3.5 or 4.x series? Without a snapshot, how would mmbackup know what has already been backed up since the previous incremental? Does it check each file against what is already on TSM to build the candidate list? What is the experience out there?

* In the v4r2 version of the manual for the mmbackup utility we still don't seem to be able to specify which TSM BA client dsmenv to use as a parameter. All we can do is choose the --tsm-servers TSMServer[,TSMServer...] option. I can only conclude that, unless something else is done, the contents of any backup on the GPFS side will always end up in a default storage pool under the standard TSM policy. I'm now wondering if it would be OK to simply 'source dsmenv' from a shell for each instance of mmbackup we fire up, in addition to setting MMBACKUP_DSMC_MISC, MMBACKUP_DSMC_BACKUP, ..., etc., as described in the man page (a sketch of what I mean follows this list).

* What about the restore side of things? Most mm* commands can only be executed by root. Would we still have to rely on the TSM BA client (dsmc|dsmj) for unprivileged users to restore their own files? (See the dsmc example after the sketch below.)
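
To make the dsmenv question concrete, here is the rough and untested wrapper I have in mind; it also backs up from a snapshot for consistency. The file system name, snapshot name, dsmenv path, proxy node, and TSM server stanza are all placeholders:

   #!/bin/bash
   FS=gpfs_projects                        # GPFS file system to back up
   SNAP=mmbackup_$(date +%Y%m%d)           # snapshot for a self-consistent view
   DSMENV=/usr/local/tsm/projects/dsmenv   # per-project TSM client environment

   # pick up DSM_DIR, DSM_CONFIG, DSM_LOG for this client environment
   . "$DSMENV"

   # extra options mmbackup hands to the underlying dsmc invocations
   export MMBACKUP_DSMC_MISC="-asnodename=projects-proxy"
   export MMBACKUP_DSMC_BACKUP="-quiet"

   # back up from a snapshot so the namespace doesn't change mid-scan
   mmcrsnapshot "$FS" "$SNAP" || exit 1
   mmbackup "$FS" -t incremental -S "$SNAP" --tsm-servers TSM_PROJECTS
   rc=$?
   mmdelsnapshot "$FS" "$SNAP"
   exit $rc

As for restores, unprivileged users can already drive the BA client directly, something like:

   dsmc restore "/gpfs/projects/alice/*" -subdir=yes

provided the node and file permissions allow it, which is exactly the part I'd like to confirm.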

I guess I'll have to conduct more experiments.




Please also review this:

http://files.gpfsug.org/presentations/2015/SBENDER-GPFS_UG_UK_2015-05-20.pdf


This is pretty good as a high-level overview. Much better than a few others I've seen since the release of the Spectrum Suite, since it focuses entirely on GPFS/TSM backup (and HSM). It would be nice to have some typical implementation examples.



Thanks a lot for the references, Yaron, and again thanks for any further comments.
Jaime


Regards,

Yaron Daniel
Server, Storage and Data Services - Team Leader
Global Technology Services
IBM Israel
94 Em Ha'Moshavot Rd, Petach Tiqva, 49527, Israel
Phone:  +972-3-916-5672
Fax:    +972-3-916-5672
Mobile: +972-52-8395593
e-mail: [email protected]

[email protected] wrote on 03/09/2016 09:56:13 PM:

From: Jaime Pinto <[email protected]>
To: gpfsug main discussion list <[email protected]>
Date: 03/09/2016 09:56 PM
Subject: [gpfsug-discuss] GPFS(snapshot, backup) vs. GPFS(backup
scripts) vs. TSM(backup)
Sent by: [email protected]

Here is another area where I've been reading material from several
sources for years, and in fact trying one solution over another from
time to time in a test environment. However, to date I have not been
able to find a single document where all these different IBM
alternatives for backup are discussed at length, with the pros and cons
well explained, along with the how-tos.

I'm currently using TSM (built-in backup client), and over the years I
developed a set of tricks to rely on disk-based volumes as an
intermediate cache, and on multiple backup client nodes, to split the
load and substantially improve backup performance compared to when I
first deployed this solution. However, I suspect it could still be
improved further if I were to apply tools from the GPFS side of the
equation.

I would appreciate any comments/pointers.

Thanks
Jaime

         ************************************
          TELL US ABOUT YOUR SUCCESS STORIES
         http://www.scinethpc.ca/testimonials
         ************************************
---
Jaime Pinto
SciNet HPC Consortium  - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.org
University of Toronto
256 McCaul Street, Room 235
Toronto, ON, M5T1W5
P: 416-978-2755
C: 416-505-1477

----------------------------------------------------------------
This message was sent using IMP at SciNet Consortium, University of Toronto.


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
