> From: [EMAIL PROTECTED]
>
> . . . see who (if anyone) uses
> Transaction Logging, to what extent (Archive, Checkpoint), 
> and whether you use uvbackup or something else such as tar, etc.
> What sort of performance hits . . .

We use UV transaction logging (txlg), both checkpoint & archiving.
We use Legato for tape backups, and I wrote an interface that queues
full logs for Legato, then releases them once Legato confirms they are
safely on tape.

We generate about 5-6 GB of logs per day, sometimes twice that.

This is a relatively large HP system (HP-UX 11i, UV 10.0.16) with a hot
standby for failover, which automatically does warmstart recovery if/when
the standby comes live.  We have never had that happen in production, but it
has tested well. I have implemented it elsewhere on Windows & I know of
one instance where it saved the day.

The logs reside on their own file system, configured for synchronous
writes.  Logging is a potential bottleneck, but I have been pleasantly
surprised at how little impact it has had.  Only mass-update programs
where the bulk of the work is simply writing data (e.g., a month-end job
that writes a flag to millions of records) are noticeably slower.
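For illustration only, a dedicated VxFS log filesystem on HP-UX might be
mounted with something like the hypothetical /etc/fstab entry below.  The
device and mount point are made up, and mincache=dsync / convosync=dsync
are OnlineJFS mount options I am quoting from memory, so check
mount_vxfs(1M) on your release rather than copying this verbatim:

   # hypothetical entry: separate filesystem just for UV transaction logs,
   # with data writes forced to behave synchronously
   /dev/vg01/lvuvlogs  /uvlogs  vxfs  rw,mincache=dsync,convosync=dsync  0  2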

We do not do any logical transactions (TRANSACTION START, COMMIT,
ROLLBACK in BASIC), except for some SQL-based DataStage extracts where
it is implicit.  I do not generally recommend attempting to retrofit
legacy code with logical transactions.  I do recommend writing new
systems or subsystems with logical transactions.
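For anyone who has not used them, the general shape in UV BASIC is
roughly as follows.  This is only a sketch from memory: the file, record
ids and variables are made up, and the exact statement forms (TRANSACTION
START / COMMIT / ABORT vs. the older BEGIN/END TRANSACTION syntax) vary
by release, so check the BASIC reference before leaning on it:

   * Sketch only: both account updates commit together or not at all.
   * F.ACCT, FROM.ID, TO.ID and AMT are hypothetical names.
   OPEN '', 'ACCOUNTS' TO F.ACCT ELSE STOP 201, 'ACCOUNTS'
   TRANSACTION START ELSE
      PRINT 'Could not start a transaction'
      STOP
   END
   READU FROM.REC FROM F.ACCT, FROM.ID ELSE
      TRANSACTION ABORT  ;* abandon the transaction
      STOP
   END
   READU TO.REC FROM F.ACCT, TO.ID ELSE
      TRANSACTION ABORT
      STOP
   END
   FROM.REC<1> = FROM.REC<1> - AMT
   TO.REC<1> = TO.REC<1> + AMT
   WRITE FROM.REC ON F.ACCT, FROM.ID
   WRITE TO.REC ON F.ACCT, TO.ID
   TRANSACTION COMMIT ELSE
      PRINT 'Commit failed; neither write took effect'
      STOP
   END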

This implementation was much more difficult than the previous ones I
have done or been party to.  This list's archives document most of my
grief if you read back a couple of years.  The latest grief came during
load testing for a recent system upgrade, when we found the new system
could write so fast (a sustained rate of >1 MB/sec) that the txlg
sub-system apparently could not keep up, and it crashed!  That is
unresolved.
I could reliably crash it, but IBM could not; then suddenly, neither
could I!  I don't know what to say.  In production, I don't think we
will ever generate updates that quickly, but since I did not know for
sure, when we went live on the new system I disabled logging for most
files, then gradually added them back over the course of about a
month.  I still have several ugly application-level cross-reference
files that I am not logging.  (You know, the ones that have outgrown the
original design, with megabyte-sized vm-delimited arrays of primary ids?
We all have at least one of them.)  Every time one byte in a record
changes, the entire record is logged, so updates to these records
greatly increase log volume.  These xref files are easy to rebuild if
we ever need to.

Estimating your updates can be a challenge.  It is not necessarily the
big files, or even the ones that grow the fastest, that get logged a lot.
For example, we have a small file with one small record for each of
several phantoms that run continuously.  Each phantom frequently updates
its record with status info, so update volume is large.  Huge.  But I
do not even log it, since the info is only valid in real time; if we
ever warmstart, it is worthless.  And the file never undergoes the group
splits, merges, or overflow restructuring that could cause it to break.

If you have static hashed files, FILE.USAGE can give you stats on update
activity.  (Caution: people on this list have preached against
FILE.USAGE, but I have never been bitten by it.)  I even temporarily
converted a dynamic file to static for a week, just so I could run
FILE.USAGE.

In general, I recommend implementing gradually, activating files for
logging a few at a time and monitoring.  I discovered one nasty program
that needed revision this way.

There are a few confusing elements to tx logging and its tools.  For
example, you do not log native indexes (they automatically get updated
on rollforward of the primary file), yet activating a file activates
its indexes.  It supposedly lets you activate distributed files, but
that means nothing: the part files must be activated individually.
UniAdmin works a little differently from the uv motif menu.  Some tools
are lacking: I plan to write an integrity-checking program that verifies
file headers against what the txlg system thinks.  Because txlg info is
kept in a file's header, you can confuse UV by copying files at the OS
level (just as you can confuse it about alternate indexes, since that
info is also in the file header).  BTW, uv/APP.PROGS has source for much
of the sub-system.

I recommend understanding tx logging well, because you will need to know
it under pressure when it comes time to recover from some kind of
disaster.  But consider how much pressure you will be under if you don't
have logging.  I know it saved the day at least once in another shop
where I implemented it on Windows (Sun hdwr).

Checkpointing for warmstart recovery is beautiful.  Besides logging
data, it captures file restructuring (dynamic splits & merges, static
groups expanding or shrinking overflow).  If your system crashes in the
middle of such an operation, it will fix the groups as well as write the
data.  So far our HPs have not crashed in production, but I have
test-crashed them and been happy.  I had a hard time making a file break
on demand, though.  And I have not tested a second crash during warmstart
recovery.  If you think about it, that's a likely scenario: once you have
had one crash, the probability of a second skyrockets.

That's all I can think of off the top of my head,

Chuck Stevenson
-------
u2-users mailing list
[EMAIL PROTECTED]
http://www.u2ug.org/listinfo/u2-users
