"Kenneth M. Howlett" wrote:
> It seems to me that shoeshining (the tape keeps going back and
> forth, back and forth, back and forth) occurs when the computer
> can not transfer data to or from the tape drive fast enough.

Correct. There are usually two situations where this happens --

  a) Software compression. Modern compression algorithms can keep up with
most current low-end tape drives, but most of them are patented, and most
tape backup programs still carry older, slower compression algorithms. This
is one of the things we have addressed in BRU 16.0 -- the new compression
algorithm ought to keep up with a TR-5 or OnStream DI-30 tape drive on a
reasonably fast machine.

 b) More commonly on high-end tape drives, the backup hits a directory that
holds a large number of very small files. Last time I backed up, I had a
Tandberg SLR-50 tape
drive and used its hardware compression (I say "last time" because I swap tape
drives like some folks swap shirts :-). But when I hit directories with a lot
of small files, like /etc or my source tree, I still had occasional shoe-shine
because Linux had to go hunt up the inodes for each of those files then go
hunt up the data blocks. BRU's behavior of putting a header on every file and
padding everything to a tape block boundary didn't help either, since BRU does
a bit more work than 'tar' (slight understatement) to make sure that a bad
tape block doesn't zonk the whole recovery process. 
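
To put a rough number on the padding cost, here is a minimal sketch. The 2K
block size and the one-block header are illustrative assumptions, not BRU's
actual on-tape format, and padded_size() is a made-up helper name.

```python
def padded_size(file_size, block_size=2048, header_blocks=1):
    """Bytes that land on tape for one file: a header plus the data,
    with the data rounded up to a whole number of tape blocks.
    block_size and header_blocks are assumed values for illustration."""
    data_blocks = -(-file_size // block_size)  # ceiling division
    return (header_blocks + data_blocks) * block_size


if __name__ == "__main__":
    for size in (58, 300, 1500, 8192):
        on_tape = padded_size(size)
        print(f"{size:5d} bytes of data -> {on_tape:6d} bytes on tape")
```

Under those assumptions a 58-byte file costs as much tape as a 2048-byte one,
so a directory full of tiny files moves a lot of tape for very little data.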

> especially if the computer is compressing the data. Therefore,
> I think the way to solve the shoeshining problem is to separate
> the processing from the transferring. 

The basic problem here is dealing with end-of-tape conditions. Tape drives
guarantee that when you write a block and get an end-of-tape condition, you
have enough room left to write an EOD filemark set. That's all they guarantee.
If you issued a 256K write and got an end-of-tape, you may or may not have
gotten all 256K onto the tape. If you issued a 20K write and got an
end-of-tape, on the other hand, you have almost certainly gotten everything
onto the tape. You can then close the file (which will write
the EOD file marks), and prompt the user for the next tape (or use 'mtx' to
shuffle to the next tape in the autochanger). 
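
A rough sketch of that small-write loop follows. Both callbacks are
hypothetical stand-ins: write_block() is assumed to return True when the drive
reports end-of-tape on that write, and next_tape() is assumed to close the
current tape and load or prompt for the next one. FakeDrive exists only so the
sketch can be run without a tape drive; a real driver reports end-of-tape in
its own way.

```python
def write_spanning(blocks, write_block, next_tape):
    """Write small blocks in order, changing tapes on end-of-tape.

    Because each block is small, the block that triggered end-of-tape is
    treated as having reached the tape (the reasoning in the text above).
    Returns the number of tape changes.
    """
    changes = 0
    for block in blocks:
        if write_block(block):   # True: drive signalled end-of-tape
            next_tape()          # close out this tape, move to the next
            changes += 1
    return changes


class FakeDrive:
    """In-memory stand-in for a drive whose tapes hold a fixed block count."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.tapes = [[]]

    def write_block(self, block):
        self.tapes[-1].append(block)
        return len(self.tapes[-1]) >= self.capacity

    def next_tape(self):
        self.tapes.append([])


if __name__ == "__main__":
    drive = FakeDrive(capacity_blocks=2)
    data = [bytes([i]) * 20 * 1024 for i in range(5)]  # five 20K blocks
    n = write_spanning(data, drive.write_block, drive.next_tape)
    print(n, "tape changes,", len(drive.tapes), "tapes used")
```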

The other problem is that it doesn't matter how much you buffer: if you're
generating data slower than the tape drive can accept it, you'll still have
shoe-shining. But buffering at least reduces the incidence of shoe-shining so
that it occurs only once a minute rather than every 5 seconds. Doing so does
have an appreciable performance benefit -- I cut 20 minutes off a backup of a
4GB hard drive onto a TR-4 tape drive that way. But if you need multiple tapes
for your backup, this approach is dangerous, since a big buffered write that
hits end-of-tape may only partly make it onto the tape :-(.
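
A back-of-the-envelope model shows why a bigger buffer stretches the gap
between stops without removing them. It assumes a simple policy that I made up
for illustration: the drive starts only when the buffer is full and streams
until the buffer runs dry. The rates are invented numbers, not measurements.

```python
def seconds_between_stops(buffer_bytes, produce_rate, drive_rate):
    """Length of one fill-then-drain cycle, in seconds.

    produce_rate: bytes/sec the backup program can generate (assumed).
    drive_rate:   bytes/sec the drive takes while streaming (assumed).
    """
    if produce_rate >= drive_rate:
        return float("inf")  # the drive never starves, so it never stops
    drain = buffer_bytes / (drive_rate - produce_rate)  # drive outruns producer
    refill = buffer_bytes / produce_rate                # drive idle, buffer fills
    return drain + refill


if __name__ == "__main__":
    for size in (1_000_000, 8_000_000):
        gap = seconds_between_stops(size, 400_000, 500_000)
        print(f"{size} byte buffer: one stop every {gap} seconds")
```

With these numbers the 1MB buffer gives a stop every 12.5 seconds and the 8MB
buffer a stop every 100 seconds: fewer stops, but never zero while the
producer is the slower side.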

One possibility is a 'buffer' program that actually understands end-of-tape
conditions and tape blocks. This will allow you to handle end-of-tape
conditions while still buffering an appreciable amount of data. I believe BRU
16.0 may include such a buffer program (or maybe not -- it was discussed, but
I'm not involved in BRU 16.0 development so I may be mistaken).  

> Computers are a lot faster than they used to be, while tape
> drives are only a little faster than they used to be. As a
> result, shoeshining is less of a problem than it used to be.

Computers may be a lot faster than they used to be, but most Unix-type
operating systems (heck, most operating systems period!) are still lousy at
handling the situation of trying to back up thousands of small files. Unix
suffers because a) system calls are still slow on most Unix OS's, so the
fixed per-call overhead dominates and a read of 58 bytes takes about as long
as a read of 8192 bytes, and b) you have to read the inode for each of those
files too, and stat() them so you can put permissions etc. into the file's
header on tape, and that's not fast either on most Unix-type OS's.
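
For illustration, the per-file work looks roughly like the sketch below: one
lstat() for the inode data and one open/read/close for the contents, repeated
for every file no matter how small. scan_tree() is a made-up name and this is
not how BRU or tar is implemented; it only shows where the fixed per-file
costs come from.

```python
import os
import stat


def scan_tree(root):
    """Walk root and do the per-file work a backup needs.

    Returns (number of regular files, total data bytes read).
    """
    files = 0
    data_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # inode lookup: mode, owner, size, times
            if not stat.S_ISREG(info.st_mode):
                continue           # skip symlinks, devices, sockets, etc.
            with open(path, "rb") as f:
                data_bytes += len(f.read())  # then fetch the data blocks
            files += 1
    return files, data_bytes


if __name__ == "__main__":
    count, total = scan_tree(".")
    print(count, "files,", total, "bytes")
```

Timing this over a tree of small files versus a single large file of the same
total size is a quick way to see the effect on your own machine.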

In addition, tape drives are becoming much faster. 4MB/sec media transfer
rates are commonplace among high-end tape drives, and we've seen one that
does 16MB/sec on a consistent basis -- about as fast as a 9GB Seagate Cheetah
(10,000rpm) hard drive can do on pure streaming throughput!

-- 
Eric Lee Green                         [EMAIL PROTECTED]
Software Engineer                      Visit our Web page:
Enhanced Software Technologies, Inc.   http://www.estinc.com/
(602) 470-1115 voice                   (602) 470-1116 fax
