Replies in [[ ]]

IBM Mainframe Discussion List <IBM-MAIN@BAMA.UA.EDU> wrote on 04/12/2006 02:11:13 PM:

> Are you SURE that your batch job on LPAR 1 is the problem? That is, are you
> sure there is not some other factor outside of what you have described here?
> Is it possible something else is deleting or renaming C?

[[I own C.  If at any point, there is no C, jobs begin to fail as the one 
on lpar 2 did.  At peak period, jobs can fail every minute.  As for any 
other issues, I am looking for suggestions.]]

> The reason I ask is that it seems to me that even if you totally dropped the
> ENQs the situation you described would be unlikely. I don't mean you should
> drop the ENQs or that the situation would be sufficiently unlikely for
> mainframe production, I just mean that you would be unlikely to hit it in
> any given day, and therefore this must be something else. File N does not
> exist for only a fraction of a second (rename to rename) - what's the odds
> that a batch job would hit that more than once in a blue moon? (Again, too
> often to go that way in production, but I am just doing a mental exercise
> here.)

[[The odds are very slim.  And it happens every so often.  It's not a big 
hole I am closing.  It is a hole.  And I can't find the mouse.  I just 
need the name of the mouse so I know whose mouse it is.]]

> After BS689 fails, does C still not exist? Or has it reappeared?

[[In human time the dataset always exists.  Only computers can see the 
hole in time the cybermouse goes through.]]

> Any chance that the rename in your job is failing and you are not checking
> return codes correctly?

[[If rename fails, and there is no C, subsequent jobs fail.  If rename 
fails, and there is a C, I could detect that N still existed and that 
dataset timestamps were wrong.]]

> Have you checked timestamps? Have you confirmed that BS689 failed at the
> exact instant that your job was doing the rename?

[[Timestamps are the same to the second.  Both steps take less than a second.

It should be mentioned that in the past, to correct this kind of problem, 
rename was limited only to the cluster name of the KSDS since failure 
would happen between the rename of the cluster and the rename of the 
components.

This problem has been happening sporadically for several years.]]


> Charles

> -----Original Message-----
> From: IBM Mainframe Discussion List [mailto:[EMAIL PROTECTED] On Behalf
> Of Kirk Talman
> Sent: Wednesday, April 12, 2006 10:41 AM
> To: IBM-MAIN@BAMA.UA.EDU
> Subject: Another fine mess

> Our tech people can't find the source of this problem and I am looking for
> any guesses you might have.

> Sysplex in question has 11 lpars on 5 CECs

> z/OS 1.4 with 1.7 being tested.  Sometime this summer some of the lpars in
> this plex will be running 1.7 if all goes well.  (So far it hasn't.)

> I am told GRS used to propagate ENQ in star configuration.

> Batch job on lpar 1 builds VSAM file called N.  At the end of the build
> job is a step using a program to frontend IDCAMS.  The frontend issues an
> ENQ RET=NONE with scope of SYSTEMS.  It then links to IDCAMS.  IDCAMS
> renames C to O and renames N to C, effectively bringing a new version of C
> into play.  When IDCAMS ends, the frontend issues DEQ.
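
As an illustration only (not part of the original job), the two renames
described above can be modeled with a plain Python dict standing in for the
catalog. The names `rename` and `swap_states` are hypothetical. The sketch
just shows that the sequence passes through an intermediate state in which
no entry named C exists.

```python
def rename(catalog, old, new):
    """Rename a catalog entry; raises KeyError if `old` is absent."""
    catalog[new] = catalog.pop(old)


def swap_states(catalog):
    """Yield a sorted snapshot of catalog names after each step of the swap."""
    yield sorted(catalog)        # before the swap: C and N
    rename(catalog, "C", "O")
    yield sorted(catalog)        # between the renames: N and O, no C
    rename(catalog, "N", "C")
    yield sorted(catalog)        # after the swap: C and O


# Toy catalog: a dict is an assumption made for illustration only.
catalog = {"C": "old version", "N": "new version"}
states = list(swap_states(catalog))

# Snapshots in which a lookup of C would fail.
window = [s for s in states if "C" not in s]
print(states)
print(window)
```

Any lookup of C that lands in that middle snapshot fails, which is why the
swap has to be serialized against readers.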

> Meanwhile on lpar 2, a batch job using program called BT689 wants to use
> file C.  It calls a linked subroutine BS689 which issues an ENQ with the
> same scope as mentioned above.  The ENQ does not specify RET=NONE, but
> that is the default.  It does specify RNL=NO, which is not the default.
> That should not matter since both ENQ have scope of SYSTEMS.

> After it obtains the ENQ, BS689 does SVC 99 to dynamically allocate the 
> file.  BT689 uses the file for a few reads and then ends.
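
A rough sketch of how the ENQ is meant to keep the reader out of the gap
between the two renames, assuming both jobs serialize on the same resource
name. Here `threading.Lock` stands in for a SYSTEMS-scope ENQ, and
`SWAP.C`, `OTHER.C`, and `try_allocate` are made-up names, not anything
taken from the actual jobs.

```python
import threading
from collections import defaultdict

# One lock per resource name; a stand-in (assumption) for ENQ resources.
locks = defaultdict(threading.Lock)


def try_allocate(catalog, resource, dsname):
    """Reader: take the ENQ stand-in, then look the dataset up in the catalog."""
    lock = locks[resource]
    if not lock.acquire(blocking=False):
        # A real ENQ with RET=NONE would suspend the caller until DEQ.
        return "waits"
    try:
        return "allocated" if dsname in catalog else "not in catalog"
    finally:
        lock.release()


catalog = {"C": "old", "N": "new"}

locks["SWAP.C"].acquire()            # renamer obtains the ENQ stand-in
catalog["O"] = catalog.pop("C")      # rename C to O: C is now missing
same_name = try_allocate(catalog, "SWAP.C", "C")    # same resource name
other_name = try_allocate(catalog, "OTHER.C", "C")  # different resource name
catalog["C"] = catalog.pop("N")      # rename N to C: C is back
locks["SWAP.C"].release()            # renamer issues the DEQ stand-in

after = try_allocate(catalog, "SWAP.C", "C")
print(same_name, other_name, after)
```

In this toy model a reader is only protected when its request resolves to
the same resource as the renamer's. Whether the two real ENQs can ever
resolve differently is an open question here, not a finding.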

> Previously the job on lpar 1 would fail periodically because the file was
> in use.

> This time the program BT689 on lpar 2 failed with message in its log

> R15=>00000004 S99INFO=>00000002 S99ERROR=>00001708

> IKJ56228I DATA SET C NOT IN CATALOG OR CATALOG CAN NOT BE ACCESSED




