Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW - Problem Found/Resolved
Andrew,

We finally found the problem: a bad drive in the ROBOT. It was not apparent from our tracking tools (so much for that software). It took continuous batch job abends to raise our suspicion. We had the drive checked and found a defective R/W head.

Thanks to all who responded to my post.

Andrew N Wilt [EMAIL PROTECTED] wrote:

Esmie,

I would add the UTILMSG=YES parameter to the backup EXEC statements. This will tell you if DFSMSdss is invoking ICKDSF to init the volumes instead of simply issuing the FCWITHDRAW after the DUMP is completed. If ICKDSF is being invoked, it is because the VTOC tracks of the DASD volume are in a target FlashCopy relationship, and issuing an FCWITHDRAW against them could well cause the volume to be online but invalid (because the VTOC location now contains the residual data from before the FlashCopy was done to it).

Thanks,
Andrew Wilt
IBM DFSMSdss Architecture/Development

IBM Mainframe Discussion List wrote on 05/08/2008 08:43:05 AM:

Question About DFDSS :FCNOCOPY/FCWITHDRAW
esmie moo to: IBM-MAIN 05/08/2008 08:46 AM
Sent by: IBM Mainframe Discussion List
Please respond to IBM Mainframe Discussion List

Good Morning Gentle Readers,

I am investigating a problem with a backup (backups of SNAP volumes) that is executed daily. For some reason, in the last 2 days the backup has been taking 12-15 hours to execute. I covered all angles: changes to the job, z/OS version, tape mounts/tape drives, scheduling, environment changes, etc. Nothing has been changed. I

/snip

//BACKUP1  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP01,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS001(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN),
//            LABEL=(01,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
//*
//BACKUP2  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*

/snip

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
Esmie,

I would add the UTILMSG=YES parameter to the backup EXEC statements. This will tell you if DFSMSdss is invoking ICKDSF to init the volumes instead of simply issuing the FCWITHDRAW after the DUMP is completed. If ICKDSF is being invoked, it is because the VTOC tracks of the DASD volume are in a target FlashCopy relationship, and issuing an FCWITHDRAW against them could well cause the volume to be online but invalid (because the VTOC location now contains the residual data from before the FlashCopy was done to it).

Thanks,
Andrew Wilt
IBM DFSMSdss Architecture/Development

IBM Mainframe Discussion List IBM-MAIN@BAMA.UA.EDU wrote on 05/08/2008 08:43:05 AM:

Question About DFDSS :FCNOCOPY/FCWITHDRAW
esmie moo to: IBM-MAIN 05/08/2008 08:46 AM
Sent by: IBM Mainframe Discussion List IBM-MAIN@BAMA.UA.EDU
Please respond to IBM Mainframe Discussion List IBM-MAIN@BAMA.UA.EDU

Good Morning Gentle Readers,

I am investigating a problem with a backup (backups of SNAP volumes) that is executed daily. For some reason, in the last 2 days the backup has been taking 12-15 hours to execute. I covered all angles: changes to the job, z/OS version, tape mounts/tape drives, scheduling, environment changes, etc. Nothing has been changed. I

/snip

//BACKUP1  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP01,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS001(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN),
//            LABEL=(01,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
//*
//BACKUP2  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*

/snip
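For readers following along, Andrew's UTILMSG suggestion might be coded on the first backup step roughly as shown here. This is only a sketch: it assumes the option is passed through the PARM field of the EXEC statement (the thread does not show the exact coding), and everything else is taken unchanged from the BACKUP1 step posted in the thread.

```jcl
//* Sketch only: PARM='UTILMSG=YES' is the one change to the posted step.
//* The intent is that output from any utility DFSMSdss calls (such as
//* ICKDSF) appears in SYSPRINT, so an ICKDSF init would be visible.
//BACKUP1  EXEC PGM=ADRDSSU,TIME=60,PARM='UTILMSG=YES'
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP01,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS001(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN),
//            LABEL=(01,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
/*
```

If ICKDSF messages show up in SYSPRINT after the DUMP completes, that would point to the init path Andrew describes rather than a plain FCWITHDRAW.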
Question About DFDSS :FCNOCOPY/FCWITHDRAW
Good Morning Gentle Readers,

I am investigating a problem with a backup (backups of SNAP volumes) that is executed daily. For some reason, in the last 2 days the backup has been taking 12-15 hours to execute. I covered all angles: changes to the job, z/OS version, tape mounts/tape drives, scheduling, environment changes, etc. Nothing has been changed. I checked with our Capacity Performance group and they couldn't tell what was causing the problem (they are still looking into it); however, they advised us that the FCWITHDRAW parm could be the cause. I checked the jobs, which have been executing for over 3 years, and they have always had this parm, and we never had a problem with the job taking long to terminate.

Below is the JCL used to SNAP the volumes and perform the subsequent backup of the vols. Please note I did not include all 30 steps of the backup job. I assure you that they are all the same except for the VOLSERs and output DSNs. Could someone suggest what other avenue I should look at:

//STEP1    EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
 COPY FULL IDY(SYS001,3390) ODY(SNAP01,3390) DUMPCOND FCNC -
      ALLDATA(*) ALLEXCP CANCELERROR PURGE
 COPY FULL IDY(SYS002,3390) ODY(SNAP02,3390) DUMPCOND FCNC -
      ALLDATA(*) ALLEXCP CANCELERROR PURGE
 COPY FULL IDY(SYS003,3390) ODY(SNAP03,3390) DUMPCOND FCNC -
      ALLDATA(*) ALLEXCP CANCELERROR PURGE
/*
//BACKUP1  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP01,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS001(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN),
//            LABEL=(01,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
//*
//BACKUP2  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP02,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS002(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN,REF=*.BACKUP1.OUTDEV),
//            LABEL=(2,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
/*
//BACKUP3  EXEC PGM=ADRDSSU,TIME=60
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//DEV1LST  DD SYSOUT=*
//INDEV    DD UNIT=3390,VOL=SER=SNAP03,DISP=SHR
//OUTDEV   DD DSN=SYS2.BACKUP1.OUT.SYS003(+1),
//            DISP=(,CATLG,DELETE),
//            DCB=GDGDSCB,
//            UNIT=3490,VOL=(,RETAIN,REF=*.BACKUP2.OUTDEV),
//            LABEL=(3,SL)
//SYSIN    DD *
 DUMP FULL INDD(INDEV) OUTDD(OUTDEV) CAN OPT(4) FCWITHDRAW
/*

Thanks in advance to all who respond.
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
A couple of thoughts.

Is it possible that something is going on inside your storage array that is increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and storage array some distance apart?

Lizette
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
In a message dated 5/8/2008 11:33:44 A.M. Central Daylight Time, [EMAIL PROTECTED] writes:

increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and Storage array some distance apart?

Without a picture, it's hard to say. I'd think that if it were the controller, you'd be getting SIM alerts out the wazoo. Maybe start with EREP and see if we're getting temporary or permanent errors on any connected pieces.
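As an illustration of the EREP check suggested above, a report step might look something like the following. Treat it as a rough sketch rather than a tested job: the logrec data set name, the work space, and the choice of report parameters are all assumptions that would need to be matched to the installation's own EREP setup.

```jcl
//* Sketch only: SYS1.LOGREC, the DIRECTWK space and the SYSIN
//* parameters below are assumptions, not taken from the thread.
//EREP     EXEC PGM=IFCEREP1,PARM='CARD'
//SERLOG   DD DSN=SYS1.LOGREC,DISP=SHR
//DIRECTWK DD UNIT=SYSDA,SPACE=(CYL,(5,5))
//EREPPT   DD SYSOUT=*
//TOURIST  DD SYSOUT=*
//SYSIN    DD *
PRINT=PS
TYPE=OT
ACC=N
ENDPARM
/*
```

TYPE=OT is meant to select the OBR and MDR record types, which is where device-level permanent and temporary errors would be expected to show up, and ACC=N is meant to leave the history data set untouched. Narrowing the report by date or device would be a natural next step once the raw output has been looked at.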
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
As far as I know the tape drive is a ROBOT. According to the device addresses (disk, tape) they are separate. The device type for the tape is 359L and the disk is 3390.

Lizette Koehler [EMAIL PROTECTED] wrote:

A couple of thoughts.

Is it possible that something is going on inside your storage array that is increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and Storage array some distance apart?

Lizette
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
How about the DASD Storage Array? Is it having any issues? The Robot may not be the issue.

Just working outside the process (JCL) to see if there are any hardware or response time issues. Sometimes when jobs run long and you find no reasonable explanation, you start looking for the other stuff.

Lizette

As far as I know the tape drive is a ROBOT. According to the device addresses (disk tape) they are separate. The device type for the tape is 359L and disk is 3390.

Lizette Koehler [EMAIL PROTECTED] wrote:

A couple of thoughts.

Is it possible that something is going on inside your storage array that is increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and Storage array some distance apart?
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
Are there specific volumes in the backup that are taking significantly longer to run than before, or is it across the board? How long were the backups running before the problem started? Does the FLASHCOPY step run in a minuscule amount of time, or did it increase as well?

Rex

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of esmie moo
Sent: Thursday, May 08, 2008 11:50 AM
To: IBM-MAIN@BAMA.UA.EDU
Subject: Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW

As far as I know the tape drive is a ROBOT. According to the device addresses (disk tape) they are separate. The device type for the tape is 359L and disk is 3390.

Lizette Koehler [EMAIL PROTECTED] wrote:

A couple of thoughts.

Is it possible that something is going on inside your storage array that is increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and Storage array some distance apart?

Lizette
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
The backups would usually complete within the 2-hour window. Now they are running longer. The FLASHCOPY step ran in the same time as before, i.e. it took the normal 15 seconds to execute.

Pommier, Rex R. [EMAIL PROTECTED] wrote:

Are there specific volumes in the backup that are taking significantly longer to run than before or is it across the board? How long were the backups running before the problem started? Does the FLASHCOPY step run in a miniscule amount of time or did it increase as well?

Rex

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of esmie moo
Sent: Thursday, May 08, 2008 11:50 AM
To: IBM-MAIN@BAMA.UA.EDU
Subject: Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW

As far as I know the tape drive is a ROBOT. According to the device addresses (disk tape) they are separate. The device type for the tape is 359L and disk is 3390.

Lizette Koehler wrote:

A couple of thoughts.

Is it possible that something is going on inside your storage array that is increasing your run time? What type of storage array? Are the SNAPxx volumes in the same place as the tape drives? Or are the tape drives and Storage array some distance apart?

Lizette
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
Channel paths get taken offline accidentally? Are you seeing higher than normal CPU utilization? Lots more EXCP counts? Or is it just wall clock? RMF point to any problems in the I/O subsystem?

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of esmie moo
Sent: Thursday, May 08, 2008 12:55 PM
To: IBM-MAIN@BAMA.UA.EDU
Subject: Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW

The backups usually would complete within the 2 hour window. Now, it is running longer. The FLASHCOPY step executed the same time i.e. it took the normal 15 seconds to execute.
Re: Question About DFDSS :FCNOCOPY/FCWITHDRAW
In a message dated 5/8/2008 1:40:37 P.M. Central Daylight Time, [EMAIL PROTECTED] writes:

Channel paths get taken offline accidentally? Are you seeing higher than normal CPU utilization? Lots more EXCP counts? Or is it just wall clock? RMF point to any problems in the I/O subsystem?

Guess the all-inclusive question is 'What's changed?' After ruling out hardware failures via EREP, next is to find the bottleneck via RMF (or equivalent). Upgrades, configuration, scheduling, backups, reorgs all could contribute...