First, let me say that the "mean time between failures" has significantly 
increased over the years.  This reduction in failures has given many a 
shop a false sense of security.  Now it is rare that we see a disk failure 
before the system has lived its normal three-to-four-year life and is 
retired.  Even then, a single drive failure is often covered by the RAID 
configuration and easily replaced.

Adrian Merrall wrote an extremely good question/observation:
"Ouch.  I'll risk exposing my ignorance here but are these people not 
opening themselves up to massive data loss in the event of a crash?  If 
the sync times are getting high, running a regular sync out of cron might 
fix this (or completely grind the system to a halt)."

Short answer: correct on the massive data loss, and running a sync more 
often will bring a system to its knees.  Neither solution protects against 
the exposure window created when an overflow record is added or a dynamic 
file expands or contracts.

The illusion that a disk subsystem keeps up with your data writes is just 
that - an illusion.  Going off the HW deep end here...

There are as many as three disk caches:
1) the disk cache of the Primary Server, managed by the O/S, which the 
sync command will flush to the disk subsystem
2) (this one is optional) some disk subsystems (SANs & NAS devices 
particularly) have a memory front end
3) there is commonly an 8 or 16 MB HW disk cache on the disk drive itself. 
Writes occur at the HW disk drive level asynchronously to reduce head 
movement.

Addressing the three:
1) The Primary Server flushes to disk only when necessary, to optimize 
the system.  Dirty disk buffers are flushed to disk as required.  Yes, 
the sync command can assist with this.  But I will state again: the 
number one bottleneck of most U2 systems is the disk subsystem and its 
configuration.  Some systems (such as the one mentioned that takes 10 
minutes to do a sync) clearly don't have a disk architecture that can 
support the Primary Server.
2) The de facto standard for SAN and NAS manufacturers is to include a 
battery backup system and to guarantee that whatever is in the memory of 
the SAN or NAS will be written to disk.  Good news.
3) If your disk subsystem is direct-attached, many people forget about 
the HW-level cache.  Yes, those cached writes still have to make it to 
the disk itself.

Whether that write makes it to the disk is exactly why UDT RFS and UV TL 
with WarmStart are so important.  In both, the write goes first to the 
transaction log SYNCHRONOUSLY, before the write to the database, which 
occurs asynchronously.  The transaction logs for both products also 
include changes in file structures.
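That write ordering can be sketched as follows. This is NOT the actual RFS/TL implementation, just a toy illustration of the principle: force the change to a log synchronously, then update the data file lazily. The file names and record format are invented:

```python
import json
import os

def logged_write(log_path, data_path, record):
    """Write-ahead ordering: log synchronously, then data asynchronously."""
    # 1) Append the intent to the transaction log and force it to disk.
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())      # the synchronous ("IN SYNC") step
    # 2) Only then update the database file.  No fsync here - the O/S
    #    writes it back asynchronously, as described above.  After a
    #    crash, recovery replays the log to repair this file.
    with open(data_path, "a") as db:
        db.write(record["value"] + "\n")

logged_write("tx.log", "customers.dat", {"op": "add", "value": "SMITH"})
```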

At a high level, for both UDT and UV (don't get nitpicky here, folks - 
keep it at a high level for everyone), there are three writes that have 
to occur when an overflow record is added or a dynamic file expands or 
contracts:
a) the forward pointer to the newly added disk record
b) the backward pointer back to the original disk record
c) the header of the file

Should any of the three writes not make it to disk, you will have a broken 
file.

Did all three writes make it through all of the disk caches and actually 
get written to disk?
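Here is a toy model (my own, not U2's on-disk format) of those three writes. Lose any one of them somewhere in the caches and a consistency walk of the file finds a broken chain:

```python
def add_overflow(f, writes=("fwd", "back", "hdr")):
    """Add an overflow group; `writes` is the subset that reached disk."""
    new_id = f["hdr"] + 1
    f["groups"][new_id] = {"fwd": None, "back": None}
    if "fwd" in writes:                   # a) forward pointer
        f["groups"][1]["fwd"] = new_id
    if "back" in writes:                  # b) backward pointer
        f["groups"][new_id]["back"] = 1
    if "hdr" in writes:                   # c) file header
        f["hdr"] = new_id

def is_consistent(f):
    g = f["groups"]
    fwd = g[1]["fwd"]
    return (f["hdr"] == len(g)            # header counts every group
            and fwd in g                  # forward pointer resolves
            and g[fwd]["back"] == 1)      # backward pointer agrees

ok = {"hdr": 1, "groups": {1: {"fwd": None, "back": None}}}
add_overflow(ok)                          # all three writes land
print(is_consistent(ok))                  # True

bad = {"hdr": 1, "groups": {1: {"fwd": None, "back": None}}}
add_overflow(bad, writes=("fwd", "back")) # header write lost in a cache
print(is_consistent(bad))                 # False - a broken file
```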

This is THE REASON that every site should be running UDT RFS or UV TL 
with WarmStart as a minimum.  As you recall from above, the writes for 
file-structure changes go to the transaction logs first in both products. 
During recovery, both products repair the structure of the file as well 
as the data.

Man, really feel like a propeller head after writing that e-mail.    :-)
   Steve

   Stephen M. O'Neal
   Lab Services Sales for U2
   IBM SWG Information Management Lab Services