Colin,
is it possible that the files are not closed properly? Try a
'FINIS * * fm' before you rename the files.
You say that the error condition is cleared after an IPL CMS. This may
be an indicator of a file that was not closed correctly.
Kind regards,
Franz Josef Pohlen
On 15.02.2010 17:54, Colin Allinson wrote:
This is an issue with the same code as in my previous question, but
now I have got something really weird :-
Sorry for the long background explanation, but the process that I have
inherited (and hacked around with) is as follows :-
a) Server 1 (5 similar servers looking in different places)
Wakes up every 2 minutes and does the following :-
- looks for logs to upload to the SFS repository. Each log
will be complete since midnight (not just new data).
- creates a user lock file for each log it finds
- uploads the log to a temporary name (in the SFS pool)
- Erases the fn 1LOG file (if it exists)
- Renames the fn LOG file to fn 1LOG (if it exists)
- Renames the temporary file (just uploaded) to fn LOG
- Removes the User lock file
This complex procedure is because Server 2 might be
examining the file.
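
A rough sketch of the sequence above, written in Python against an ordinary local directory rather than SFS, purely to show the order of the steps. The function name `publish_log`, the `.LOG`/`.1LOG`/`.TMP`/`.LOCK` naming and the use of a plain file as the user lock are assumptions made for illustration; the real process uses CMS commands against an SFS pool and may differ in detail.

```python
import os
import shutil


def publish_log(source_path, repo_dir, fn):
    """Hypothetical analogue of Server 1's steps for one log named fn."""
    lock = os.path.join(repo_dir, fn + ".LOCK")      # assumed lock file name
    temp = os.path.join(repo_dir, fn + ".TMP")       # assumed temporary name
    current = os.path.join(repo_dir, fn + ".LOG")
    previous = os.path.join(repo_dir, fn + ".1LOG")

    # Create the user lock file so that a reader skips this log for now.
    with open(lock, "w"):
        pass
    try:
        # Upload the complete log under a temporary name.
        shutil.copyfile(source_path, temp)
        # Erase the old backup, if it exists.
        if os.path.exists(previous):
            os.remove(previous)
        # Rename the current log to the backup name, if it exists.
        if os.path.exists(current):
            os.rename(current, previous)
        # Rename the freshly uploaded file to the current name.
        os.rename(temp, current)
    finally:
        # Remove the user lock file whether or not the steps succeeded.
        os.remove(lock)
```

Every file handle in this sketch is closed before any rename is attempted, which is the condition the rename steps rely on.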
b) Server 2 (10 servers, each dealing with different logs)
wakes up every 50 seconds and does the following :-
- Looks in the SFS repository for logs that have more lines
than currently processed
- If there is a user lock file then it ignores the file until
next time
- It scans each line (from the last one processed to the end)
and will take various actions (such as cataloguing tapes used).
- It then records a line pointer for how far it got.
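
For illustration only, one pass of this reader can be sketched in Python as below. The function `process_new_lines`, its parameters and the `handle_line` callback are hypothetical names, and the sketch reads a plain local file; the actual Server 2 code is not shown in this message and certainly differs.

```python
import os


def process_new_lines(log_path, lock_path, last_line, handle_line):
    """Hypothetical analogue of one Server 2 pass over one log.

    Returns the new line pointer (number of lines processed so far).
    """
    # If the user lock file exists, ignore the log until the next pass.
    if os.path.exists(lock_path):
        return last_line
    with open(log_path) as f:
        lines = f.read().splitlines()
    # Nothing to do unless the log has more lines than already processed.
    if len(lines) <= last_line:
        return last_line
    # Scan only the lines after the last one processed.
    for line in lines[last_line:]:
        handle_line(line)
    # Record how far we got.
    return len(lines)
```

Because each uploaded log is complete since midnight, keeping only a line count as the pointer is enough to resume where the previous pass stopped.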
Originally, Server 1 did not mess around with renames but just
overwrote the log in the SFS repository. We were getting a number
of lock ups, so I changed it. This has significantly improved the
situation, but we still get the occasional lock up.
The code in Server 2 is very messy, delicate & sensitive so I don't
want to mess with it if I can avoid it.
What we are seeing now is that, occasionally, the Rename of the
temporary file to fn LOG gives RC=28 with DMSRND1311E Object
already exists. In the error handling routine I do a listfile and the
object is not shown. I also do a query locks and get a 'No Locks
held'. Just in case of a timing issue, I then retry the RENAME up to 5
times with a 5 second delay - with no improvement.
The really weird thing is that, once this happens, it will keep
happening for ever & a day until the server is recycled (IPL CMS).
Then the condition gets cleared.
I can understand that there may be an initial timing issue but, as no
SFS locks are shown, I cannot understand what it is that is not being
cleared.
I would be very grateful if anyone could make any suggestions as to
what may be happening here.
Colin Allinson
VM Systems Support
Amadeus Data Processing GmbH