Re: Looking for suggestions to deal with large backups not completing in 24-hours

2018-07-18 Thread Lars Henningsen
@All

Possibly the biggest issue when backing up massive file systems in parallel
with multiple dsmc processes is expiration. Once you back up a directory with
“subdir no”, a no-longer-existing directory object on that level is expired
properly and becomes inactive. However, everything underneath it remains
active and never expires unless you run a “full” incremental on the level
above (with “subdir yes”) - and that kind of defeats the purpose of
parallelisation. Other pitfalls include:

- avoiding swapping
- keeping log files consistent (dsmc is not thread-aware when logging - it
  assumes it is running alone)
- handling the local dedup cache
- updating backup timestamps for a file space on the server
- distributing load evenly across multiple nodes on a scale-out filer
- backing up from snapshots
- chunking file systems into even parts automatically, so you don’t end up
  with lots of small jobs and one big one
- dynamically distributing load across multiple “proxies” if one isn’t enough
- handling exceptions
- handling directories with characters you can’t pass to dsmc via the
  command line
- consolidating results in a single, comprehensible overview similar to the
  summary of a regular incremental
- being able to do it all in reverse for a massively parallel restore…

The list is quite long.
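To make the expiration point concrete, here is a minimal sketch of the kind of
wrapper such tools build around dsmc. This is not MAGS; it assumes a Unix
client with dsmc on the PATH, an already configured node, and the file system
root given as the only argument. It runs one partial incremental per top-level
directory in parallel, plus a shallow run on the root so that directory
objects deleted on that level at least become inactive.

#!/usr/bin/env python3
# Sketch only: parallel partial incrementals with dsmc (assumptions above).
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_PARALLEL = 4  # number of concurrent dsmc processes (tuning assumption)

def run_dsmc(args):
    # Run one dsmc command and return its exit code.
    print("running:", " ".join(args))
    return subprocess.call(args)

def parallel_incremental(root):
    top_dirs = sorted(p for p in root.iterdir() if p.is_dir())

    # One partial incremental per top-level directory, descending into it.
    jobs = [["dsmc", "incremental", f"{d}/", "-subdir=yes"] for d in top_dirs]

    # A shallow run on the root itself so directory objects deleted on this
    # level become inactive. Their former contents, however, only expire
    # after a full incremental with -subdir=yes on the root.
    jobs.append(["dsmc", "incremental", f"{root}/", "-subdir=no"])

    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(run_dsmc, jobs))

if __name__ == "__main__":
    parallel_incremental(Path(sys.argv[1]))

Even with such a wrapper, an occasional full incremental with “subdir yes” on
the root is still needed so that the contents of deleted directories actually
expire.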

We developed MAGS (as mentioned by Del) to cope with all that - and more. I can 
only recommend trying it out for free.

Regards

Lars Henningsen
General Storage


Re: Looking for suggestions to deal with large backups not completing in 24-hours: the GWDG solution briefly explained

2018-07-18 Thread Bjørn Nachtwey

Hi Skylar,

Skylar Thompson wrote:

One thing to be aware of with partial incremental backups is the danger of
backing up data multiple times if the mount points are nested. For
instance,

/mnt/backup/some-dir
/mnt/backup/some-dir/another-dir

Under normal operation, a node with DOMAIN set to "/mnt/backup/some-dir
/mnt/backup/some-dir/another-dir" will back up the contents of
/mnt/backup/some-dir/another-dir as a separate filespace, *and also* will
back up another-dir as a subdirectory of the /mnt/backup/some-dir filespace.
We reported this as a bug, and IBM pointed us at a flag that can be passed
as a scheduler option to prevent this:

-TESTFLAG=VMPUNDERNFSENABLED


good point,
even if my script works a little bit differently:
currently the starting folder is not read from the "dsm.opt" file but given
in the configuration file for my script, "dsmci.cfg". So one run can work
for one node starting on a subfolder (done this way because Windows has no
VIRTUALMOUNTPOINT option).
Within this config file several starting folders can be declared, and in a
first step my script creates a global list of all folders to be backed up
"partially incremental".


=> well, I'm not sure whether I check for multiple entries in the list
(a possible check is sketched below)
=> and if the nesting happens on a deeper level than the one the list is
created from, I think I won't be aware of such a set-up
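For illustration, a minimal sketch of such a check. The config format is an
assumption (one absolute start folder per line, '#' for comments); the real
dsmci.cfg may look different. It reports duplicate entries as well as start
folders that are nested inside another start folder and would therefore be
backed up twice.

#!/usr/bin/env python3
# Sketch only: detect duplicate and nested start folders in a config file.
import os
import sys

def load_start_folders(cfg_path):
    # Assumed format: one absolute path per line, '#' starts a comment line.
    folders = []
    with open(cfg_path) as cfg:
        for line in cfg:
            line = line.strip()
            if line and not line.startswith("#"):
                folders.append(os.path.normpath(line))
    return folders

def find_nested(folders):
    # Yield (parent, child) pairs where one start folder lies inside another.
    for parent in folders:
        for child in folders:
            if child != parent and os.path.commonpath([parent, child]) == parent:
                yield parent, child

if __name__ == "__main__":
    folders = load_start_folders(sys.argv[1])
    for f in sorted({f for f in folders if folders.count(f) > 1}):
        print("duplicate entry:", f)
    for parent, child in find_nested(folders):
        print("nested entry:", child, "lies inside", parent,
              "(risk of backing it up twice)")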


I will check this -- thanks for the advice!

best
Bjørn


On Tue, Jul 17, 2018 at 04:12:17PM +0200, Bjørn Nachtwey wrote:

Hi Zoltan,

OK, I will translate my text, as there are some more approaches discussed :-)

Breaking up the filesystems across several nodes will work as long as the
nodes are of sufficient size.

I'm not sure a PROXY node will solve the problem, because each "member
node" will back up the whole mountpoint. You will need to do partial
incremental backups. I expect you will do this based on folders, won't you?
So, some questions:
1) how will you distribute the folders to the nodes?
2) how will you ensure new folders are processed by one of your "member
nodes"? On our filers many folders are created and deleted, sometimes a
whole bunch every day, so maintaining the option file manually was not an
option for me. The approach from my script / "MAGS" does this somehow
"automatically".
3) what happens if the folders do not grow evenly and all the big ones end
up on one of your nodes? (OK, you can change the distribution or even add
another node; a simple size-based distribution is sketched below.)
4) Are you going to map each backup node to different nodes of the Isilon
cluster to distribute the traffic / workload across the Isilon nodes?
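Regarding question 3, a minimal sketch of a size-based distribution: a greedy
"largest folder onto the least-loaded node" split. Folder sizes are assumed to
come from an earlier scan (e.g. du), and the node names are placeholders.

#!/usr/bin/env python3
# Sketch only: distribute folders across backup nodes by size.
import heapq

def distribute(folder_sizes, node_names):
    # folder_sizes: dict folder -> size in bytes; returns dict node -> [folders].
    heap = [(0, name, []) for name in node_names]  # (assigned bytes, node, folders)
    heapq.heapify(heap)
    # Place the largest remaining folder on the node with the least data so far.
    for folder, size in sorted(folder_sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, node, assigned = heapq.heappop(heap)
        assigned.append(folder)
        heapq.heappush(heap, (load + size, node, assigned))
    return {node: assigned for _, node, assigned in heap}

if __name__ == "__main__":
    example = {"/mnt/backup/projA": 12_000, "/mnt/backup/projB": 9_000,
               "/mnt/backup/projC": 4_000, "/mnt/backup/projD": 1_000}
    for node, assigned in distribute(example, ["NODE1", "NODE2"]).items():
        print(node, assigned)

Re-running the distribution from time to time with fresh sizes would keep the
nodes roughly balanced as folders grow.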

best
Bjørn




--
--
Bjørn Nachtwey

Working Group "IT-Infrastruktur"
Tel.: +49 551 201-2181, E-Mail: bjoern.nacht...@gwdg.de
--
Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen (GWDG)
Am Faßberg 11, 37077 Göttingen, URL: http://www.gwdg.de
Tel.: +49 551 201-1510, Fax: +49 551 201-2150, E-Mail: g...@gwdg.de
Service-Hotline: Tel.: +49 551 201-1523, E-Mail: supp...@gwdg.de
Managing Director: Prof. Dr. Ramin Yahyapour
Chairman of the Supervisory Board: Prof. Dr. Norbert Lossau
Registered office: Göttingen
Register court: Göttingen, commercial register no. B 598
--
Certified according to ISO 9001
--