You could also export only inactive versions.

René Lambelet
Nestec S.A. / Informatique du Centre
55, av. Nestlé, CH-1800 Vevey (Switzerland)
Tel +41 21 924 35 43  Fax +41 21 924 28 88  Office K4-117
email [EMAIL PROTECTED]
Visit our site: http://www.nestle.com

This message is intended only for the use of the addressee and may contain information that is privileged and confidential.

> -----Original Message-----
> From: bbullock [SMTP:[EMAIL PROTECTED]]
> Sent: Monday, February 26, 2001 4:32 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Performance Large Files vs. Small Files
>
> EEK. I'm sure this is not the answer, because if I rename the filesystem every day, then I have to do a full backup of the filesystem every day. I don't think there are enough hours in the day to do a full backup, export it, and delete the filespace. Thanks for the suggestion though.
>
> Ben Bullock
> UNIX Systems Manager
> (208) 368-4287
>
> > -----Original Message-----
> > From: Lambelet,Rene,VEVEY,FC-SIL/INF. [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, February 26, 2001 1:13 AM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Performance Large Files vs. Small Files
> >
> > Hello,
> >
> > you might think of renaming the node every day, then doing an export followed by a delete of the filespace (this will free the DB).
> >
> > In case of restore, import the needed node.
> >
> > René Lambelet
> > Nestec S.A. / Informatique du Centre
> > 55, av. Nestlé, CH-1800 Vevey (Switzerland)
> > Tel +41 21 924 35 43  Fax +41 21 924 28 88  Office K4-117
> > email [EMAIL PROTECTED]
> > Visit our site: http://www.nestle.com
> >
> > > -----Original Message-----
> > > From: bbullock [SMTP:[EMAIL PROTECTED]]
> > > Sent: Tuesday, February 20, 2001 11:22 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Performance Large Files vs.
> > > Small Files
> > >
> > > Jeff,
> > > You hit the nail on the head of the biggest problem I face with TSM today. Excuse me for being long-winded, but let me explain the boat I'm in, and how it relates to many small files.
> > >
> > > We have been using TSM for about 5 years at our company and have finally got everyone on our bandwagon and away from the variety of backup solutions and media we had in the past. We now have 8 TSM servers running on AIX hosts (S80s) attached to 4 libraries with a total of 44 3590E tape drives. A nice beefy environment.
> > >
> > > The problem that keeps me awake at night now is that we now have manufacturing machines wanting to use TSM for their backups. In the past they have used small DLT libraries locally attached to the host, but that's labor intensive and they want to take advantage of our "enterprise backup solution". A great coup for my job security and TSM, as they now see the benefit of TSM.
> > >
> > > The problem with these hosts is that they generate many, many small files every day. Without going into any detail, each file is a test on a part that they may need to look at if the part ever fails. Each part gets many tests done to it through the manufacturing process, so many files are generated for each part.
> > >
> > > How many files? Well, I have one Solaris-based host that generates 500,000 new files a day in a deeply nested directory structure (about 10 levels deep, with only about 5 files per directory). Before I am asked: "no, they are not able to change the directory or file structure on the host. It runs proprietary applications that can't be altered". They are currently keeping these files on the host for about 30 days and then deleting them.
> > >
> > > I have no problem moving the files to TSM on a nightly basis; we have a nice big network pipe and the files are small. The problem is with the TSM database growth, and the number of files per filesystem (stored in TSM). Unfortunately, the directories are not shown when you do a 'q occ' on a node, so there is actually a "hidden" number of database entries that are taking up space in my TSM database and are not readily apparent when looking at the output of 'q node'.
> > >
> > > One of my TSM databases is growing by about 1.5 GB a week, with no end in sight. We currently are keeping those files for 180 days, but they are now requesting that they be kept for 5 years (in case a part gets returned by a customer).
> > >
> > > This one nightmare host now has over 20 million files (and an unknown number of directories) across 10 filesystems. We have found from experience that any more than about 500,000 files in any filesystem means a full filesystem restore would take many hours. Just to restore the directory structure seems to take a few hours at least. I have told the admins of this host that it is very much unrecoverable in its current state, and would take on the order of days to restore the whole box.
> > >
> > > They are disappointed that an "enterprise backup solution" can't handle this number of files any better. They are willing to work with us to get a solution that will both cover the daily "disaster recovery" backup need for the host and the long-term retention they desire.
> > >
> > > I am pushing back and telling them that their desire to keep it all for 5 years is unreasonable, but thought I'd bounce it off you folks to see if there was some TSM solution that I was overlooking.
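[Editor's note: since 'q occ' hides the directory objects, a rough client-side estimate of the real object count (files plus directories) can be made with standard tools. A sketch only; `count_tsm_objects` is a hypothetical helper, not a TSM command:]

```shell
#!/bin/sh
# Rough client-side estimate of how many database objects TSM tracks
# for a filesystem tree: 'q occ' reports files, but every directory
# also becomes an object in the server database.
count_tsm_objects() {
    files=$(find "$1" -type f | wc -l)
    dirs=$(find "$1" -type d | wc -l)
    echo $((files + dirs))
}
```

With roughly 5 files per directory, as on the Solaris host described above, the hidden directory objects add about 20% on top of the file count 'q occ' shows.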
> > >
> > > There are 2 ways to control database growth: reduce the number of database entries, or reduce the retention time. Here is what I've looked into so far.
> > >
> > > 1. Cut the incremental backup retention down to 30 days and then generate a backup set every 30 days for long-term retention. On paper it looks good: you don't have to move the data over the net again, and there is only 1 database entry. Well, I'm not sure how many of you have tried this on a filesystem with many files, but I tried it twice on a filesystem with only 20,000 files and it took over 1 hour to complete. Doing the math, it would take over 100 hours to do each of these 2-million-file filesystems. Doesn't seem really feasible.
> > >
> > > 2. Cut the incremental backup retention down to 30 days and run an archive every 30 days to the 5-year management class. This would cut down the number of files we are tracking with the incrementals, so a full filesystem restore from the latest backup would have less garbage to sort through and hopefully run quicker. Yet with the archives, we would have to move the 600 GB over the net every 30 days and would still end up tracking the millions of individual files for the next 5 years.
> > >
> > > 3. Use TSM as a disaster recovery solution with a short 30-day retention, and use some other solution (like a local CD/DVD burner) to get the 5-year retention they desire. Still looking into this one, but they don't like it because it once again becomes a manual process to swap out CDs.
> > >
> > > 4. Use TSM as a disaster recovery solution (with a short 30-day retention) and have a process tar up all the 30-day-old files into one large file, then have TSM archive and delete the .tar file.
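[Editor's note: option 4 could be scripted roughly as below. A sketch under stated assumptions: GNU find/tar, made-up paths, and a hypothetical FIVEYEAR management class; dsmc's -archmc and -deletefiles options are real client options, but the invocation is illustrative, not the poster's actual setup:]

```shell
#!/bin/sh
# Bundle files under $1 older than $2 days into the tar file $3,
# printing how many files were bundled.
bundle_old_files() {
    list=$(mktemp)
    find "$1" -type f -mtime +"$2" > "$list"
    tar -cf "$3" -T "$list"        # -T: read the file list (GNU tar)
    wc -l < "$list"
    rm -f "$list"
}

# Nightly cron sketch: archive the single tar file under a 5-year
# management class; -deletefiles removes the tar once it is archived.
#   bundle_old_files /data/tests 30 /var/tmp/tests-$(date +%Y%m%d).tar
#   dsmc archive /var/tmp/tests-*.tar -archmc=FIVEYEAR -deletefiles
```

The bundled originals would still need removing separately once the tar is verified on the server.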
> > > This would mean we only track 1 large tar file for every day of the 5-year period (about 1800 files). This is the option we are currently pursuing.
> > >
> > > Any other options or suggestions from the group? Any other backup solutions you have in place for tracking many files over longer periods of time?
> > >
> > > If you made it this far through this long e-mail, thanks for letting me drone on.
> > >
> > > Thanks,
> > > Ben Bullock
> > > UNIX Systems Manager
> > > Micron Technology
> > >
> > > > -----Original Message-----
> > > > From: Jeff Connor [mailto:[EMAIL PROTECTED]]
> > > > Sent: Thursday, February 15, 2001 12:01 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: Performance Large Files vs. Small Files
> > > >
> > > > Diana,
> > > >
> > > > Sorry to chime in late on this, but you've hit a subject I've been struggling with for quite some time.
> > > >
> > > > We have some pretty large Windows NT file and print servers using MSCS. Each server has lots of small files (1.5 to 2.5 million) and total disk space (the D: drive) between 150GB and 200GB, on a Compaq server with two 400MHz Xeons and 400MB RAM. We have been running TSM on the mainframe since ADSM version 1 and are currently at 3.7 of the TSM server, with 3.7.2.01 and 4.1.2 on the NT clients.
> > > >
> > > > Our Windows NT admins have had a concern for quite some time regarding TSM restore performance and how long it would take to restore that big old D: drive. They don't see the value in TSM as a whole as compared to the competition; they just want to know how fast you can recover that entire D: drive.
> > > > They decided they wanted to perform weekly full backups to direct-attached DLT drives using Arcserve, and would use the TSM incrementals to forward-recover during a full volume restore. We finally had to recover one of those big D: drives this past September. The Arcserve portion of the recovery took about 10 hours, if I recall correctly. The TSM forward recovery ran for 36 hours and only restored about 8.5GB. They were not pleased. It seems all that comparing took quite some time. I've been trying to get to the root of the bottleneck since then. I've worked with support on and off over the last few months, performing various traces and the like. At this point we are looking in the area of mainframe TCPIP and delays in acknowledgments coming out of the mainframe during test restores.
> > > >
> > > > If you've worked with TSM for a number of years, then through sources in IBM/Tivoli and the valuable information from this listserv you learn over time about all the TSM client and server "knobs" to turn to try to get maximum performance: things like BUFPOOLSIZE, database cache hits, housekeeping processes running at the same time as backups/restores slowing things down, network issues like auto-negotiate on NICs, MTU sizes, TSM server database and log disk placement, tape drive load/seek times, and speeds and feeds. Basically, I think we are pretty well set on all those important things to consider. This problem we are having may be a mainframe TCPIP issue in the end, but I am not sure that will be the complete picture.
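[Editor's note: for reference, most of the server-side knobs Jeff lists live in the server options file (dsmserv.opt). A sketch with illustrative values only, not recommendations; option names are from the 3.7/4.x server, with sizes in KB:]

```
* dsmserv.opt: illustrative tuning entries (example values only)
BUFPOOLSIZE   131072
* database buffer pool in KB; watch the cache hit pct in 'q db f=d'
LOGPOOLSIZE   2048
* recovery log buffer pool in KB
TCPWINDOWSIZE 63
* TCP sliding window in KB
TXNGROUPMAX   256
* files grouped per server transaction
```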
> > > >
> > > > We have recently installed an AIX TSM server: an H80 two-way, 2GB memory, 380GB of EMC 3430 disk, and 6 Fibre Channel 3590-E1A drives in a 3494, with the TSM server at 4.1.2. We plan to move most of the larger clients from the TSM OS/390 server to the AIX TSM server, a good move to realize a performance improvement according to many posts on this listserv over the years. I am in the process of testing my NT "problem children" as quickly as I can to prove this configuration will address the concerns our NT admins have about restores of large NT servers. I'm trying to prevent them from installing a Veritas SAN solution and asking them to stick with our enterprise backup strategic direction, which is to utilize TSM. As you probably know, the SAN-enabled TSM backup/archive client for NT is not here and may never be, from what I've heard. My only option at this point is SAN tape library sharing, with the TSM client and server on the same machine for each of our MSCS servers.
> > > >
> > > > Now I'm sure many of you reading this may be thinking of things like, "why not break the D: drive into smaller partitions so you can collocate by filespace and restore all the data concurrently". No go, guys; they don't want to change the way they configure their servers just to accommodate TSM when they feel they would not have to with other products. They feel that with 144GB single drives around the corner, who is to say what a "big" NT partition is? NT seems to support these large drives without issues. (Their words, not mine.)
> > > >
> > > > Back to the issue. Our initial backup tests using our new AIX TSM server have produced significant improvements in performance.
> > > > I am just getting the pieces in place to perform restore tests. My first test, a couple of days ago, was to restore part of the data from that server we had the issue with in September. It took about one hour to lay down just the directories before restoring any files. Probably still better than the mainframe, but not great. My plan for future tests is to perform backups and restores of the same data to and from both of my TSM servers to compare performance. I will share the results with you and the rest of the listserv as I progress.
> > > >
> > > > In general I have always, like many other TSM users, achieved much better restore/backup rates with larger files versus lots of smaller files. Assuming you've done all the right tuning, the question that comes to my mind is: does it really come down to the architecture? The TSM database makes things very easy for the day-to-day smaller recoveries, which are the type we perform most. But does the architecture that makes day-to-day operations easier not lend itself well to backup/recovery of large amounts of data made up of small files? I have very little experience with competing products. Do they struggle with lots of small files as well? Veritas, Arcserve, anyone? If the issue is, as some on the listserv have suggested, frequent interaction with the client file system being the bottleneck, then I suppose the answer would be yes, the other products have the same problem. Or is the issue more on the TSM database side, due to its design, and other products using different architectures may not have this problem?
> > > > Maybe the competition's architecture is less bulletproof, but if you're one of our NT admins you don't seem to care when the client keeps calling asking how much longer the restore will be running. I know TSM development is aware of the issues with lots of small files, and I would be curious what they plan to do about the problems Diana and I have experienced.
> > > >
> > > > The newer client option, Resourceutilization, has helped with backing up clients with lots of small files more quickly. I would love to see the same type of automated multi-tasking on restores. I don't know the specifics of how this actually works, but it seems to me that when I ask to restore an entire NT drive, for example, the TSM client/server must sort the file list in some fashion to intelligently request tape volumes and minimize the mounts required. If that's the case, could they take things one step further and add an option to the restore specifying the number of concurrent sessions/mount points to be used to perform the restore? For example, if I have a node whose collocated data is spread across twenty tapes and I have 6 tape drives available for the recovery, how about an option for the restore command like:
> > > >
> > > > RES -subd=y -nummp=6 d:\*
> > > >
> > > > where the -nummp option would be the number of mount points/tape drives to be used for the restore. TSM could sort the file list, come up with the list of tapes to be used for the restore, and perhaps spread the mounts across 6 sessions/mount points. I'm sure I've probably made a complex task sound simple, but this type of option would be very useful.
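[Editor's note: until something like -nummp exists, a crude approximation is to start several restore sessions by hand, one per subtree. A sketch only; the directory names are made up, -subdir=yes is a real dsmc option, and RESTORE_CMD is a shell-level stand-in so the loop can be dry-run with echo:]

```shell
#!/bin/sh
# Start one dsmc restore session per top-level directory so a
# collocated node's restore can keep several tape drives busy.
# RESTORE_CMD defaults to the real client; override it to dry-run.
RESTORE_CMD=${RESTORE_CMD:-"dsmc restore -subdir=yes"}

restore_parallel() {
    root=$1; shift
    for sub in "$@"; do
        $RESTORE_CMD "$root/$sub/*" &    # one session per subtree
    done
    wait                                 # until every session ends
}

# e.g.: restore_parallel 'd:' users apps data   # 3 concurrent sessions
```

Each session competes for mount points, so the server's MAXNUMMP setting for the node has to allow as many mounts as sessions started.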
> > > > I think many of us have seen the benefits of running multiple sessions to reduce recovery elapsed time. I find my current choices for doing so difficult to implement or politically undesirable.
> > > >
> > > > If others have the same issues with lots of small files, in particular with Windows NT clients, let's hear from you. Maybe we can come up with some enhancement requests. I'll pass on the results of my tests as stated above. I'd be interested in hearing from those of you that have worked with other products and can tell me if they have the same performance problems with lots of small files. If the performance of other products is impacted in the same way as TSM performance, then that would be good to know. If it's more about the Windows NT NTFS file system, then I'd be satisfied with that explanation as well. If lots of interaction with the TSM database leads to slower performance, even when optimally configured, then I'd like to know what Tivoli has in the works to address the issue. Because if it's the TSM database, I could probably install the fattest Fibre Channel/network pipe with the fastest peripherals and server hardware around and it might not change a thing.
> > > >
> > > > Thanks
> > > > Jeff Connor
> > > > Niagara Mohawk Power Corp.
> > > >
> > > > "Diana J.Cline" <[EMAIL PROTECTED]>@VM.MARIST.EDU> on 02/14/2001 10:04:52 AM
> > > >
> > > > Please respond to "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
> > > >
> > > > Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
> > > >
> > > > To: [EMAIL PROTECTED]
> > > > cc:
> > > >
> > > > Subject: Performance Large Files vs.
> > > > Small Files
> > > >
> > > > Using an NT Client and an AIX Server.
> > > >
> > > > Does anyone have a TECHNICAL reason why I can back up 30GB of 2GB files that are stored in one directory so much faster than 30GB of 2KB files that are stored in a bunch of directories?
> > > >
> > > > I know that this is the case; I just would like to find out why. If the amount of data is the same and the Network Data Transfer Rate is the same between the two backups, why does it take the TSM server so much longer to process the files sent when there is a larger number of files in multiple directories?
> > > >
> > > > I sure would like to have the answer to this. We are trying to complete an incremental backup of an NT server with about 3 million small objects (according to TSM) in many, many folders, and it can't even get done in 12 hours. The actual amount of data transferred is only about 7GB per night. We have other backups that can complete 50GB in 5 hours, but they are in one directory and the # of files is smaller.
> > > >
> > > > Thanks
> > > >
> > > > Network data transfer rate
> > > > --------------------------
> > > > The average rate at which the network transfers data between the TSM client and the TSM server, calculated by dividing the total number of bytes transferred by the time to transfer the data over the network. The time it takes for TSM to process objects is not included in the network transfer rate. Therefore, the network transfer rate is higher than the aggregate transfer rate.
> > > > Aggregate data transfer rate
> > > > ----------------------------
> > > > The average rate at which TSM and the network transfer data between the TSM client and the TSM server, calculated by dividing the total number of bytes transferred by the time that elapses from the beginning to the end of the process. Both TSM processing and network time are included in the aggregate transfer rate. Therefore, the aggregate transfer rate is lower than the network transfer rate.
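[Editor's note: plugging in Diana's numbers makes the gap between the two rates concrete. A back-of-the-envelope sketch; the 2-hour wire time is an invented figure for illustration:]

```shell
# 7 GB moved per night; suppose 2 hours of actual network time inside
# a 12-hour elapsed window (the 2 hours is an assumed figure).
kb=$((7 * 1024 * 1024))                          # 7 GB in KB
echo "network:   $((kb / (2 * 3600))) KB/sec"    # bytes / wire time
echo "aggregate: $((kb / (12 * 3600))) KB/sec"   # bytes / elapsed time
```

With these figures the network rate comes out near 1019 KB/sec while the aggregate rate is only about 169 KB/sec; everything between the two is TSM per-object processing, which is why millions of tiny files hurt even when the pipe is idle most of the night.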
Re: Performance Large Files vs. Small Files
Lambelet,Rene,VEVEY,FC-SIL/INF. Mon, 26 Feb 2001 07:41:21 -0800
