Re: best backup method for millions of small files?

2009-06-16 Thread John Monahan
 -Original Message-
 From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf
 Of Steven Harris
 Sent: Thursday, April 30, 2009 6:23 PM
 To: ADSM-L@VM.MARIST.EDU
 Subject: Re: best backup method for millions of small files?
 
 Hi Norman
 
 Your post worries me, as I'm just implementing an email archive
 solution that will depend on Windows journalling to back up some huge
 repositories.  The particular product fills up containers that, once
 filled, never change, so the change rate will be low there, but there
 are also index files that will change often.

I'm way behind on my emails, but thought I would respond to this anyway.


I have worked with some imaging/archiving solutions in the past with
millions of small files.  For the ones that fill up the
containers/directories until full and then never change them again, I
normally implement a combination of archives and backups.  It can be a
somewhat manual process, but may be preferable to other options if the
number of files is extremely large.  In all the cases where I have done
this, the acceptable recovery time for archived images has been weeks,
so restore speed hasn't been a priority.

1.  Archive the containers that are full to TSM once, with unlimited
retention.
2.  Once a container is archived, exclude it from the backup process.
3.  Schedule incremental backups every night; they should only
backup/scan containers that are new since the last archive process.
Most data should be excluded and not scanned.
4.  Rerun the archive process once per month, or at whatever interval
makes sense given the rate at which containers become full.  Only
archive full containers that haven't yet been archived, and make sure
to add them to the backup excludes once they are successfully archived.
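
A rough sketch of steps 1 and 2 with the BA client (the management
class name and container path are hypothetical examples, not John's
actual setup):

   rem one-time archive of a filled container, kept forever via a
   rem hypothetical ARCH_FOREVER management class
   dsmc archive "D:\images\container0042\*" -subdir=yes -archmc=ARCH_FOREVER

   * dsm.opt / include-exclude list: drop the container from backups
   exclude.dir D:\images\container0042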


__

John Monahan
Infrastructure Services Consultant
Logicalis, Inc.
5500 Wayzata Blvd Suite 315
Golden Valley, MN 55416
Office: 763-226-2088
Mobile: 952-221-6938
Fax:  763-226-2081
john.mona...@us.logicalis.com
http://www.us.logicalis.com


Re: best backup method for millions of small files?

2009-05-01 Thread Howard Coles
On the problem of running out of memory: break up the backups into
chunks.  In other words, don't try to back up the whole box with one
scheduled action.  If you have these files across multiple volumes,
back up one or two at a time.
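
For example (drive letters hypothetical), two smaller scheduled
actions instead of one covering the whole machine:

   dsmc incremental E: F:
   dsmc incremental G: H: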

In the case of the volumes that fill up and never change: are they also
on separate drives, or just in separate directories?  The strategy
would differ depending on that.

See Ya'
Howard


 -Original Message-
 From: ADSM: Dist Stor Manager [mailto:ads...@vm.marist.edu] On Behalf
 Of Steven Harris
 Sent: Thursday, April 30, 2009 6:23 PM
 To: ADSM-L@VM.MARIST.EDU
 Subject: Re: [ADSM-L] best backup method for millions of small files?
 
 Hi Norman
 
 Your post worries me, as I'm just implementing an email archive
 solution that will depend on Windows journalling to back up some huge
 repositories.  The particular product fills up containers that, once
 filled, never change, so the change rate will be low there, but there
 are also index files that will change often.
 
 Have you determined whether the memory issue is related to number of
 files
 or number of changes?
 
 Regards
 
 Steve
 
 Steven Harris
 TSM Admin, Sydney Australia
 
 
 
 
 
 From: Gee, Norman <norman@lc.ca.GOV>
 Sent by: ADSM: Dist Stor Manager <ads...@vm.marist.edu>
 To: ADSM-L@VM.MARIST.EDU
 Subject: Re: [ADSM-L] best backup method for millions of small files?
 Date: 01/05/2009 07:12 AM
 
 What options are there when journaling runs out of memory on a 32-bit
 Windows server?  I have about 10 million files on one server where the
 journal engine runs out of memory.  With the memory-efficient disk
 cache method and resource utilization 5, it runs out of memory;
 resource utilization 4 runs too long.
 

Re: best backup method for millions of small files?

2009-05-01 Thread Huebner,Andy,FORT WORTH,IT
Our 32-bit node backs up large file systems, 9.4 million objects total;
only 2 file systems are over 1 million objects, the biggest is 6.5
million, and I use the disk cache method without any problems.  The
server does have the /3GB switch and 4GB of RAM.  This system does not
use journaling and is known to be running near the limits of a 32-bit
process.  We are running with resource utilization set at 10.
For timing, it runs in 5-11 hours moving a relatively insignificant
amount of data.  The faster runs are when the TSM server is not as busy.

It is definitely a balancing act to get it to run at the edge of the
limits of RAM and still run fast.  We used the information in
TSMSTATS.ini to identify file systems with few files and then excluded
them from memoryefficientbackup.  If there were a way to change the
order in which the file systems are backed up, we could probably make
it run faster by better balancing memory usage across the larger file
systems.
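
If I read that approach right, the per-file-system override in dsm.opt
would look something like this (drive letters and cache path are
hypothetical):

   * big file systems page the scan state out to a disk cache
   include.fs F: memoryefficientbackup=diskcachemethod
   include.fs G: memoryefficientbackup=diskcachemethod
   * small file systems skip the cache and scan in memory
   include.fs E: memoryefficientbackup=no
   diskcachelocation d:\tsmcache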

Andy Huebner

Re: best backup method for millions of small files?

2009-04-30 Thread Francisco Molero
Hi,

the best option for protecting millions of files under NTFS is TSM for
FastBack:

  1. You can run incremental-forever backups.
  2. You have no backup window.
  3. You can integrate TSM for FastBack with TSM.
  4. You can restore service very quickly in case of disaster.





- Original Message 
From: Mehdi Salehi iranian.aix.supp...@gmail.com
To: ADSM-L@VM.MARIST.EDU
Sent: Thursday, 30 April 2009 7:48:47
Subject: Re: best backup method for millions of small files?

Richard,
The total nightly delta size is about 300MB (fewer than 20,000 files of
about 20KB each).  I am trying to test the journal to verify whether it
works with incremental-by-date for image backups or not.  If you have
any other solution, it is welcome.

Mehdi Salehi





Re: best backup method for millions of small files?

2009-04-30 Thread Mehdi Salehi
Hi,
I enabled the journal service for drive F: of a test Windows-based TSM
client.  Here are the interesting test results:
1- total filesystem size: 14GB
2- used space: 370MB (pdf, doc, exe and other ordinary files)
3- I enabled the journal
4- a snapshot image backup was successful
5- the .jbbdb file in the journal directory got populated
6- I copied a directory from C: to F:
7- the size of the .jbbdb file changed (meaning the journal service is
working)
8- I performed an incremental-by-date for the image backup
9- NOTHING was backed up!

Do you have any explanation for this?

Regards,
Mehdi


Re: best backup method for millions of small files?

2009-04-30 Thread Cheung, Richard
When you 'copied' the directory, did the DATE of the directory change?

That's probably why it didn't back anything new up.

And that's why incrbydate shouldn't always be relied upon - you should
always do normal incremental backups as well.




Re: best backup method for millions of small files?

2009-04-30 Thread Mehdi Salehi
I added a new directory with 40MB of files in it.  This directory is a
new one that was not present when the image backup was performed.  I
think the most rudimentary task expected from an incremental backup is
to detect newly added files and directories.


Re: best backup method for millions of small files?

2009-04-30 Thread Francisco Molero
If you want to protect millions of files in the same directory, either
change your application to create a different directory structure, or
use FastBack.







Re: best backup method for millions of small files?

2009-04-30 Thread Mehdi Salehi
Francisco,
Thanks for the hint. What is the mechanism used by FastBack that is helpful
for my case?

Thanks so much,
Mehdi


Re: best backup method for millions of small files?

2009-04-30 Thread Francisco Molero
FastBack uses incremental-forever disk-block backup and saves all the
info in a disk repository, and you can integrate FastBack with TSM.
From FastBack's point of view it is the same whether your directory has
one file or one million files, because the backup is at the disk-block
level.  When you have to restore, you have two possibilities among
several: one is to replace the whole volume with an instant restore,
and you can resume service as soon as the restore begins; the second
is, for example, if your data is located on drive K:, to mount the
backup of K: as drive X: and move files from X: to K:.

Regards,

  Fran







SV: best backup method for millions of small files?

2009-04-30 Thread Christian Svensson
Hi Mehdi,
FastBack doesn't scan each file the way TSM does.
FastBack does incremental backups at the block level, so it doesn't
care whether there are millions of small files or one large file.  It
only backs up the blocks that have changed, and it always backs up the
entire partition.

FastBack is good software, and you can also run a CDP backup, which is
good too.  But the problem with FastBack is that it can only back up
disk-to-disk.  To get the data into TSM you need to back up the
FastBack server with TSM - or, to say it more correctly, you let TSM
mount a FastBack snapshot and back up that snapshot, so you still have
the slow backup to tape.  But that runs offline/off-site from the
server, so no files will change during the backup.

Another thing still missing from FastBack is that you can only restore
a single file from a snapshot, not from the CDP area, unless you first
restore the entire volume to another disk.

One good thing that I think TSM should have: FB first recovers icon
stubs for every file, and as soon as that is done users see their files
without any problem, while FB restores file A, B, C, D and so on in the
background.  If a user wants to open file M, FB prioritizes that file
and restores it before continuing with files E, F and H.

I would like to have this function in TSM in the future.  I will bring
this up at the TSM Symposium in Germany in September. :)  Because you
have DIRMC in TSM today, why not a FILE_STUB_MC as well?

Best Regards
Christian Svensson

Cell: +46-70-325 1577
E-mail: christian.svens...@cristie.se
Skype: cristie.christian.svensson



Re: best backup method for millions of small files?

2009-04-30 Thread Dwight Cook
It isn't the backup that will kill you, it is the restore...
Trust me... if you have over 1.5 million files in a mount point, expect
weeks or months to perform a full restore.

Remember, just because you can put 20 million files in a mount point
doesn't mean it is a good idea... discourage that at all costs.  Also
warn the end user that their data is, for all practical purposes,
unrecoverable.  Let's face it, a business couldn't endure a critical
business function outage of 6 or 8 weeks.

If they insist on that configuration (20M files per mount point),
you'll have to protect it with something such as image backup.  If they
can't endure the outage to do that, you'll need to utilize SAN disk on
the back end to create a duplicate copy of the mount point and then
back that up with image backup.

Period...


Dwight



Re: best backup method for millions of small files?

2009-04-30 Thread Mehdi Salehi
Dwight,
We tried our best to convince the application side to change the big
mount point, but that is another story.  It is a product, and probably
one day they will be obliged to change it.  What I need today is the
best way to protect the volume and keep the service highly available.
We do have protection layers such as a standby server and volumes
mirrored with Hitachi TrueCopy to another disk subsystem.  Although it
is possible to dodge this problem, you can think of my question as a
conceptual one.  At least I have understood that FastBack has a feature
that TSM lacks.  Who knows? Perhaps we will see it in the next version
of TSM, 6.2 :)

Regards,
Mehdi Salehi


Re: best backup method for millions of small files?

2009-04-30 Thread Steven Harris
Mehdi

I do not think you understand.

If you are running a journal service you do not need to use incrbydate.

To do what you are trying to do you need two separate backups.  The first
is a periodic image backup, maybe once per week or once every two weeks.
The second is a normal incremental backup, which you run daily.  Because of
the journal service this should not take very long after the first time.

Now the advantage is that when you come to restore the whole disk, TSM can
restore the most recent image, and then using the incremental backup
restore those files which changed after the image was taken.  This is much
faster than restoring each file individually from the incremental backup.

To run your test, you need to run an incremental backup.  This will set
up the journal database ready for changes.  Then run an image backup.
Next, make some changes, and finally run a second incremental backup.

Format your test disk.  Run the image restore using the GUI and select
the Image plus incremental directories and files option.  See page 107
of the 5.3 Windows BA Client manual for details.  The image will be
restored and the changes applied from the subsequent incremental.
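
In command-line terms the sequence would be roughly as follows (drive
letter hypothetical; -incremental on the restore is the command-line
equivalent of the GUI option above):

   dsmc incremental f:                  (seeds the journal database)
   dsmc backup image f:
   (make some changes on f:)
   dsmc incremental f:
   (format the disk, then)
   dsmc restore image f: -incremental -deletefiles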

Try it and let us know how it works!

Regards

Steve

Steven Harris
TSM Admin, Sydney Australia





 From: Mehdi Salehi <iranian.aix.supp...@gmail.com>
 Sent by: ADSM: Dist Stor Manager <ads...@vm.marist.edu>
 To: ADSM-L@VM.MARIST.EDU
 Subject: Re: [ADSM-L] best backup method for millions of small files?
 Date: 30/04/2009 06:10 PM

I added a new directory with 40MB of files in it.  This directory is a
new one that was not present when the image backup was performed.  I
think the most rudimentary task expected from an incremental backup is
to detect newly added files and directories.


Re: best backup method for millions of small files?

2009-04-30 Thread Mehdi Salehi
Hi,
Neither of the two methods you mean in the user's guide is suitable for
my case.  The image+normal incremental approach that you emphasized in
your post means getting full image backups, for example, every week.
For the incremental part, one file-based full backup is needed, which
is a nightmare for 20 million files.  OK, if I accept the initial
incremental backup time (which might take days), what happens at
restoration?

Naturally, the last image backup should be restored first, and it will
take A minutes.  Provided that image backups are weekly, the
progressive incremental backups for the week total about 6*20MB=120MB.
Now imagine 120MB of 15-20KB files being restored into a filesystem
with an incredibly big file address table, where the system must create
an inode-like entry for each.  If this step takes B minutes, the total
restoration time would be A+B.  The (A+B)/A ratio is important, and I
will try to measure it and share it with the group.

Steven, your solution is excellent for ordinary filesystems with a
limited number of files.  But I think for millions of files, only
backup/restore methods that do not care how many files exist in the
volume are feasible - something like pure image backup (like Acronis
incremental image backup) or the method that FastBack exploits.

Your points are welcome.

Regards,
Mehdi Salehi


Re: best backup method for millions of small files?

2009-04-30 Thread Huebner,Andy,FORT WORTH,IT
You have a disk array copy of the data - is that located close or far?
Have you considered a disk array snapshot as well?
If you perform a journaled file system backup and an image backup, then
you should be able to restore the image and then update the image with
the file system restore.  This might take a long time; I have never
tried it.
What failure are you trying to protect against?  In our case we use the
disk arrays to protect against a data center loss and a corrupt file
system, and a TSM file system backup to protect against the loss of a
file.  Our big ones are in the 10 million file range.  Using a 64-bit
Windows server we can back up the file system in about 6-8 hours
without journaling.  We suspect we could get the time down to around 4
hours if the TSM server was not busy backing up 500 other nodes.

To me the important thing is to figure out what you are protecting
against with each thing you do.  Also be sure to ask what the Recovery
Point Objective (RPO) is.  If it is less than 24 hours, then array-based
solutions may be the best choice.  Over 24 hours, TSM may be the best
choice.

Andy Huebner



Re: best backup method for millions of small files?

2009-04-30 Thread Gee, Norman
What options are there when journaling runs out of memory on a 32-bit
Windows server?  I have about 10 million files on one server where the
journal engine runs out of memory.  With the memory-efficient disk
cache method and resource utilization 5, it runs out of memory;
resource utilization 4 runs too long.
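
A minimal sketch of the dsm.opt settings in question (the cache path
is a hypothetical example):

   memoryefficientbackup diskcachemethod
   diskcachelocation c:\tsmdiskcache
   resourceutilization 5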



Re: best backup method for millions of small files?

2009-04-30 Thread Steven Harris
Hi Norman

Your post worries me, as I'm just implementing an email archive solution
that will depend on Windows journalling to back up some huge repositories.
The particular product fills up containers that, once filled, never
change, so the change rate will be low there, but there are also index
files that will change often.

Have you determined whether the memory issue is related to number of files
or number of changes?

Regards

Steve

Steven Harris
TSM Admin, Sydney Australia







Re: best backup method for millions of small files?

2009-04-30 Thread Allan Mills
Steve

I had this problem last week and found the Microsoft TechNet article
below.  It fixed the problem after a reboot.

http://support.microsoft.com/kb/30401


Regards,
Allan

Allan Mills
Technology Operations
02 8835 8035
0422 208 031




Re: best backup method for millions of small files?

2009-04-30 Thread Allan Mills
Sorry, I cannot type today - the correct link is:

http://support.microsoft.com/kb/304101

Regards,
Allan

Allan Mills
Technology Operations
02 8835 8035
0422 208 031

best backup method for millions of small files?

2009-04-29 Thread Mehdi Salehi
Hi,
There is an NTFS filesystem in Windows with 20 million small files of
less than 20 KB each.  The total size of the filesystem is less than
300GB.
I have not tried incremental backup, because that would be a nightmare.
Again, incremental backup is not a good choice because the long
restoration time is not acceptable to us.
I tested the snapshot image backup and am satisfied with the backup
time.  Now the problem is how to perform the incremental for this image
backup.  Unfortunately, incremental-by-date of the last image scans all
the files to filter which files must be in the backup list, and would
take a tremendous amount of time.  This is not acceptable either.
The ideal behavior that I expect from TSM is journal-based incremental
backup for image, which is apparently not supported by TSM 5.3.
More info:
- the change rate for this filesystem is about 20 thousand files per day
- TSM version is 5.3

What do you recommend for this case?  Please include TSM 6.1 features
if they exist.

Regards,
Mehdi Salehi


Re: best backup method for millions of small files?

2009-04-29 Thread Cheung, Richard
Hi there

Try the following:

- install and configure multiple TSM journal instances for different
parts of the file system
- run multiple concurrent backup jobs
- run incrbydate backups on weeknights, combined with normal
incrementals on weekends (see the sketch below)
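
A sketch of that weeknight/weekend split in command-line terms (drive
letter hypothetical):

   dsmc incremental f: -incrbydate      (weeknights: fast, date-based)
   dsmc incremental f:                  (weekends: full progressive
                                         incremental)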








Re: best backup method for millions of small files?

2009-04-29 Thread Mehdi Salehi
I forgot to say that all files are in a single directory!


Re: best backup method for millions of small files?

2009-04-29 Thread Steven Harris
Mehdi


I think you can use journalling.

My copy of the TSM 5.3 Windows Backup-Archive Clients Installation and
User's Guide has an Appendix C describing how to configure the Journal
service.  You also need to look at Appendix B for the dsmcutil command,
or use the GUI wizard.
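
A minimal sketch of what the setup might look like (service name,
drive letter and section layout are from memory of the 5.x client -
check Appendix C for the authoritative syntax):

   rem install the journal service
   dsmcutil install journal /name:"TSM Journal Service"

   rem tsmjbbd.ini - which file systems the journal daemon watches
   [JournalSettings]
   Errorlog=jbberror.log

   [JournaledFileSystemSettings]
   JournaledFileSystems=F: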


Regards

Steve

Steven Harris
TSM Admin, Sydney Australia



   


Re: best backup method for millions of small files?

2009-04-29 Thread Mehdi Salehi
Thanks Steve,
I will test it.

Regards,
Mehdi Salehi


Re: best backup method for millions of small files?

2009-04-29 Thread Cheung, Richard
What are the nightly deltas on these millions of files?
What do they total in size?
You would still want to run a TSM journal on them in some way...




Re: best backup method for millions of small files?

2009-04-29 Thread Mehdi Salehi
Richard,
The total nightly delta size is about 300MB (fewer than 20,000 files of
about 20KB each).  I am trying to test the journal to verify whether it
works with incremental-by-date for image backups or not.  If you have
any other solution, it is welcome.

Mehdi Salehi