As of May 3, IBM has opened APAR OA20907 to provide a temporary fix to the DFSMSdss RESTORE problem in the form of an ADRDSSU patch byte that can be set by an installation or on a specific invocation of ADRDSSU to inhibit the reset of the DS1DSCHA bit on "RESTORE FULL" or "RESTORE TRACKS". When available, this should provide a practical workaround for us and others similarly exposed on DFSMSdss RESTORE.

For the long-term solution, IBM is considering as a possible future enhancement something along the lines of, or a variant of, the proposed change in default behavior for DFSMSdss as described in SHARE requirement SSMVSS07009 (currently "open for discussion" on the SHARE site) and a mirrored copy of that requirement as IBM Marketing Request MR0409076057. This future enhancement would possibly include a change in default RESTORE behavior based on use of "RESET" in DUMP, with a new RESTORE option to allow for overriding default behavior.

This has been a topic of continuing discussion for the last month in off-group discussions among IBM and a number of other people representing various SHARE-member installations. With an accepted APAR, it looks like we will get a circumvention in a timely fashion, and most likely there will be some long-term solution that will eliminate the chance that others will get bit in the future.

I again want to thank John Chase for alerting ibm-main to this problem, the various SHARE officers who were instrumental in starting the off-group direct contacts with IBM, and Andrew Wilt of IBM for the progress toward resolving this issue.

Joel C. Ewing
Sr. Technical Admin, Mainframe Systems
Data-Tronics Corp., Fort Smith, AR


Joel C. Ewing wrote:
After four days of experimenting with DSS and thinking about the implications of DOC APAR OA20117 I felt it time to share some additional results and thoughts on this with IBM-MAIN.

First of all, let me reiterate the basic exposure implied by OA20117:
If you are using DFSMSdss "DUMP FULL" without the "RESET" option -- which is the default usage, indicating the dump is not intended as a replacement for individual dataset dumps -- to save the image of a DASD volume and expecting at some point to use this dump with a "RESTORE FULL" to move the volume to another DASD drive, as part of a Data Center move or migration to new equipment, or for Data Center recovery at a remote site, THEN MOST LIKELY YOU ARE CURRENTLY EXPOSED TO SOME FORM OF DATA-LOSS!

This is true if you are using DFSMShsm with auto-backup enabled (many sites), if you have DFSMShsm FSM (Fast Subsequent Migration) enabled (fewer sites), if you have applications using DFSMSdss that use "BY((DSCHA,EQ,YES))" as part of the selection criteria for data set manipulation, or if you have any other vendor products or home-grown applications in house that manage datasets or process datasets based on the "Changed" bit in the VTOC. If any of these apply to your installation, YOU ARE EXPOSED.

The crux of the problem is that the only practical way to make a physical copy of a volume with DFSMSdss for moving the volume or recovering it elsewhere is with "DUMP FULL" physical dump, and if this is recovered to another device with the obvious counterpart "RESTORE FULL", the result is currently not an identical volume, but a volume with all the VTOC "changed" bits on the volume reset.

This means that future decisions on the recovered system or moved volume that are based on the changed bit will be in error. The effects range from failure to take a required auto-backup (exposing users to data loss when a dataset recovery point they expect to be there is not), to DFSMShsm erroneously assuming a down-level ML2 version of a dataset is current and scratching the most current version on primary DASD (exposing those using DFSMShsm Fast Subsequent Migration to unpredictable data loss), to failure at some unknown time in the future to select for processing some datasets that should be selected by either 3rd-party vendor products or in-house applications that rely on the "changed" bit. These effects are subtle. Unless some user notices and reports a problem, they can easily be overlooked; and if they aren't noticed until six months after the RESTORE, there is little likelihood that DFSMSdss would be suspected over the more common possibility of "fuzzy user memory" of some kind.
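
A toy Python model may make the exposure concrete. This is only a sketch: a "volume" is reduced to a dictionary of dataset names and changed bits, nothing here calls DFSMSdss or DFSMShsm, and every function and dataset name is invented for illustration.

```python
# Toy model only: a "volume" is a dict mapping dataset name -> changed bit.
# Nothing here invokes DFSMSdss or DFSMShsm; all names are hypothetical.

def dump_full(volume, reset=False):
    """Return a dump image; with reset=True the source changed bits are cleared."""
    image = dict(volume)              # the dump records the bits as they were
    if reset:
        for name in volume:
            volume[name] = False
    return image

def restore_full_current(image):
    """Behavior described in OA20117: every changed bit comes back off."""
    return {name: False for name in image}

def needs_backup(volume):
    """What a changed-bit-driven backup pass would select."""
    return sorted(name for name, changed in volume.items() if changed)

source = {"PAY.MASTER": True, "PAY.HISTORY": False, "USER.NOTES": True}
image = dump_full(source)                  # default usage: no RESET
recovered = restore_full_current(image)

print(needs_backup(source))      # ['PAY.MASTER', 'USER.NOTES']
print(needs_backup(recovered))   # []
```

On the source volume two datasets are still waiting for a backup; on the recovered volume the same selection finds nothing, which is the silent failure described above.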

There are two pending requests for a change to this behavior: SHARE request SSMVSS07002, which asks for changing the RESTORE default to not clear the "changed" bits; and Marketing Request MR0302074136, requesting an option on "RESTORE FULL" to control the handling of the "changed" bit. After considerable thought I don't believe either of these is the cleanest or most correct solution. The most consistent solution should be based on the principle that at the completion of a physical volume dump, if you immediately restore that physical dump onto the same device or onto a different device, the default behavior should be that all in-use tracks on the target device, including the VTOC, should be identical with those left on the source device. In other words, an immediate restore on top of the device used to generate the dump shouldn't change anything. This means that the treatment given the "changed" bits should depend on whether "RESET" (which resets all "changed" bits at the end of creating the dump) was specified on the "DUMP FULL". If the dump was created with "DUMP FULL ... RESET", then "RESTORE FULL" should clear all "changed" bits in the VTOC. If the dump was created with "DUMP FULL" without the reset option, then by default "RESTORE FULL" should leave all "changed" bits in the VTOC unaltered. This does not preclude the possibility of also adding a RESTORE option to change this behavior, but my point is that for "least astonishment" the default behavior should be based on whether the DUMP was with or without "RESET".
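
The proposed default can be stated as a round-trip rule, sketched below in Python. This is an illustration of the proposal only, not of anything DFSMSdss does today; the `override` parameter stands in for a possible future RESTORE option and is purely hypothetical, as are all the names.

```python
# Toy model of the PROPOSED default; all names are hypothetical.

def dump_full(volume, reset=False):
    """Dump image remembers the changed bits and whether RESET was used."""
    image = {"bits": dict(volume), "reset": reset}
    if reset:
        for name in volume:
            volume[name] = False       # RESET clears bits on the source
    return image

def restore_full_proposed(image, override=None):
    """Clear changed bits only if the dump was taken with RESET.

    override (hypothetical option): None follows the dump,
    True always clears the bits, False always keeps them.
    """
    clear = image["reset"] if override is None else override
    return {name: (False if clear else bit)
            for name, bit in image["bits"].items()}

# Round-trip rule: an immediate restore matches the source as the dump left it.
for reset in (False, True):
    source = {"A.DATA": True, "B.DATA": False}
    image = dump_full(source, reset=reset)
    restored = restore_full_proposed(image)
    print(reset, restored == source)   # prints "False True" then "True True"
```

With this rule the restored VTOC bits equal the source bits as they stood at the end of the dump, with or without "RESET".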

I have opened a PMR with IBM arguing these points. The initial response (as one might suspect) is that since the current behavior has been present in DFSMSdss for a l-o-n-g time (despite the fact that no one knew about it until the March 26 DOC APAR, the end result of John Chase's data-loss PMR), IBM is disinclined to change it unless there is a clear consensus within the DFSMSdss customer base for a specific "design change" or "enhancement request". Our installation's position is that since our management now knows we have a data-integrity hole from using DFSMSdss in our Disaster Recovery design, they aren't going to be willing to accept as a solution a program enhancement process that may still not be resolved for more than 6 or 12 months, or potentially longer. I plan to create another SHARE request in the near future with my modified proposal just to cover all bases and give SHARE installations another place to cast a vote; but the bottom line is that if you feel as I do that this ought to be handled as a data-loss APAR and resolved sooner than a future enhancement, your installation may need to rattle IBM's cages more directly to make it clear this is not a request coming from just one or two installations.

ONE OTHER COMMENT. If your installation or any application you know of actually depends on the bizarre and prior-to-March-26-undocumented behavior of "RESTORE FULL" resetting the changed bits when restoring from a "DUMP FULL" without "RESET" option, let this group and IBM know. IBM believes you exist, but I have strong doubts.

Now, for some of the more bizarre (and imperfect) circumventions that I have discovered that could be put in place pending a fix to DFSMSdss (if and only if you have no other recourse):

One of the effects that OA20117 doesn't explain is that a RESTORE TRACKS of the VTOC from a "DUMP FULL" dump file still resets the "changed" bits even though no dataset tracks are restored. However, if you RESTORE TRACKS of the VTOC while restoring no other dataset tracks from a "DUMP TRACKS" that includes the VTOC tracks, the "changed" bits come through unaltered. Another oddity is that although "RESTORE TRACKS" of the VTOC from a "DUMP FULL" dump clears the changed bits, the dump file itself still apparently contains some indication of which datasets originally had the changed bit on. If you do a "RESTORE DS(BY((DSCHA,EQ,YES)))..." from a "DUMP FULL" dump file, then it will actually restore only those datasets that originally had the "changed" bit set, and in agreement with OA20117, since this is a dataset restore, the changed bit will then be left on in the VTOC for those datasets. The above suggests the following two circumventions, each with its own problems:

(1) Back up the volume with "DUMP FULL"; then first RESTORE the volume with "RESTORE FULL" to pick up all the tracks, including VTOC and IPLTEXT on the volume; finally, restore just the changed datasets with "RESTORE DS(BY((DSCHA,EQ,YES)))... REPLACE", excluding the VVDS and VTOCIX to avoid "E" level errors, in order to correct the "changed" bits in the VTOC for the datasets that previously had this bit on. The major problem is performance: you have to read the dump file twice, so a correct recovery could easily double your recovery time.

(2) Back up the volume with "DUMP FULL" immediately followed by "DUMP TRACKS" to dump just the VTOC tracks to a separate dump file; restore the volume with "RESTORE FULL" immediately followed by "RESTORE TRACKS" to restore just the VTOC tracks. Problems: the volume-level enqueues will be lost between the two DUMPs and between the two RESTOREs. If there is any possible way something could slip in and do volume updates in these two gaps you are potentially in deep doo-doo. The other problem is that unless all your VTOC locations and sizes are identical and never change, you will have to have some foolproof, automatic way of maintaining the correct VTOC track extents in your "DUMP TRACKS" and "RESTORE TRACKS" control statements, or again you could get very nasty results that might not be found until you tried to recover unsuccessfully during DR.
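
The two-pass logic of circumvention (1) can be modeled in a few lines of Python. This is a toy sketch, not DFSMSdss code: the dump image is reduced to a dictionary of changed bits, the function names are invented, and the VVDS entry with its bit on is an assumption made only to show what the exclusion does in the model.

```python
# Toy model only: a dump image is a dict of dataset name -> changed bit
# as recorded at DUMP FULL time. All function names are hypothetical.

def restore_full_current(image):
    """Pass 1: full-volume restore; every changed bit comes back off."""
    return {name: False for name in image}

def restore_changed_datasets(image, volume, exclude=()):
    """Pass 2: dataset restore of entries whose dumped changed bit was on.

    A dataset-level restore leaves the changed bit on, so the bit is
    put back for every dataset selected here.
    """
    for name, changed in image.items():
        if changed and name not in exclude:
            volume[name] = True
    return volume

image = {
    "PAY.MASTER": True,
    "PAY.HISTORY": False,
    "USER.NOTES": True,
    "SYS1.VVDS.VVOL001": True,    # assumed bit value; excluded in pass 2
}
excluded = ("SYS1.VVDS.VVOL001",)

volume = restore_full_current(image)                        # first read of the dump
volume = restore_changed_datasets(image, volume, excluded)  # second read of the dump

mismatches = sorted(n for n in image if image[n] != volume[n])
print(mismatches)   # ['SYS1.VVDS.VVOL001']
```

In the model the ordinary datasets get their bits back, at the cost of a second pass over the dump, while anything excluded from the second pass keeps whatever the full restore left behind.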

An impractical approach attempted and dismissed:
Try to get a single dump file that could be restored with a single restore without changing the "changed" bits in the VTOC. The only possibilities that will handle all VTOC entries and IPLTEXT are RESTORE FULL and RESTORE TRACKS. RESTORE FULL is only compatible with DUMP FULL, and we already know that doesn't work. RESTORE TRACKS will work with both DUMP FULL and DUMP TRACKS dump files, but we already know it fails with a DUMP FULL dump. DUMP TRACKS with the max track range would work with a RESTORE TRACKS with max track range, but this will be much more expensive in time and dump file space than DUMP FULL, because residual data in all unused tracks will be dumped as well. One could attempt to analyze the VTOC and dynamically generate the DUMP TRACKS control statement to dump only used tracks, but that makes this into a very difficult problem, and the track ranges, which will be different every time the volume is dumped, must be preserved along with the dump file in order to specify the correct track ranges on the restore. In addition, a RESTORE TRACKS that overwrites track 0 and the VTOC must be targeted to a device that is pre-initialized to have the right volser and a VTOC and VTOCIX identically sized and positioned, or you get nasty messages about an unreadable volume with VTOC errors when a rebuild of the VTOCIX is attempted. This approach has many more problems and risks than either approach (1) or (2).
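
To give a feel for the "dump only used tracks" bookkeeping, here is a minimal Python sketch of the easy part: collapsing a list of used extents into the smallest set of track ranges. It assumes the extents have already been extracted from the VTOC somehow (the hard part, not shown), that they are given as inclusive relative track numbers, and that the device has 3390 geometry of 15 tracks per cylinder. The function names are invented, and no DFSMSdss control statements are generated.

```python
TRACKS_PER_CYL = 15   # assumes 3390 geometry

def merge_extents(extents):
    """Merge inclusive (start, end) relative-track extents that touch or overlap."""
    merged = []
    for start, end in sorted(extents):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)   # extend the previous range
        else:
            merged.append([start, end])               # start a new range
    return [tuple(r) for r in merged]

def to_cchh(track):
    """Convert a relative track number to (cylinder, head)."""
    return divmod(track, TRACKS_PER_CYL)

# Hypothetical extents: track 0, the rest of cylinder 0, two adjacent
# cylinders, and a small extent further out.
extents = [(0, 0), (1, 14), (30, 44), (45, 59), (100, 104)]

ranges = merge_extents(extents)
print(ranges)
# [(0, 14), (30, 59), (100, 104)]

print([(to_cchh(s), to_cchh(e)) for s, e in ranges])
# [((0, 0), (0, 14)), ((2, 0), (3, 14)), ((6, 10), (6, 14))]
```

Even in this simplified form the resulting list changes whenever allocation on the volume changes, which is why the ranges would have to be saved alongside each dump file.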


Chase, John wrote:
-----Original Message-----
From: IBM Mainframe Discussion List On Behalf Of Richards.Bob
Sent: Thursday, December 14, 2006 4:01 PM

No, and I just read it a few hours ago in an attempt to help you. That DEFAULT behavior WAS NOT documented.

The subject DOC APAR closed today, 26 March.

I VIGOROUSLY RECOMMEND anybody who uses DFSMSdss full-volume or
track-range (aka "physical") DUMP / RESTORE for D/R, data movement, etc.
to READ AND UNDERSTAND this DOC APAR.  You **ARE** at risk of losing
data or data integrity via DFSMSdss Full-volume or track-range (aka
"physical") DUMP / RESTORE if you have ANY OTHER PRODUCT that depends
upon the setting of the "change" bit in the Format-1 DSCB.

The DFSMSdss Level 2 rep also submitted Marketing Request #
MR0302074136, requesting that a "switch" (similar to the RESET keyword
on DUMP) be provided to allow the user to specify how DFSMSdss should
handle the "change" bit at RESTORE time.  Those interested should add
their voices via appropriate channels.

SHARE members who have not done so, please vote on Requirement #
SSMVSS07002, which requests a design change to DFSMSdss' default
behavior on full-volume RESTORE.

    -jc-





--
Joel C. Ewing, Fort Smith, AR        [EMAIL PROTECTED]

