Good diagnostic technique!

It's unclear to me whether the monitoring is done from z/VM or on the PCs in 
the centers.

If it's done on z/VM, then you're only looking at one side of the equation. 
Can you develop a monitor that runs on the PCs to see what *they* see 
every minute? 

Extra credit will be given if that monitor is also run on the PC 
immediately before and after each FTP event so that what arrives on z/VM 
can be compared with what was on the PC _just before_, and _just after_ 
every FTP event.
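
A minimal sketch of what such a PC-side monitor might record, assuming the data waiting to be sent sits in an ordinary local file. The function names, the log format, and the labels are all hypothetical and are not part of the centers' actual software.

```python
import hashlib
import time
from pathlib import Path


def snapshot(path):
    """Return size, record (line) count, and SHA-256 of a local file."""
    data = Path(path).read_bytes()
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "bytes": len(data),
        "records": len(data.splitlines()),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def log_snapshot(path, label, log_path):
    """Append one labelled snapshot line to a log file and return the snapshot."""
    snap = snapshot(path)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{snap['time']} {label} {path} bytes={snap['bytes']} "
                  f"records={snap['records']} sha256={snap['sha256']}\n")
    return snap
```

Calling `log_snapshot` with a label such as "before-ftp" immediately before the transfer and "after-ftp" immediately after it, plus once a minute from a scheduler, would give timestamps and record counts to line up against what the z/VM side reports.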

My apologies if the thoughts in this have already been tried.  In an 
apparent attempt to maintain some privacy around what's being done, 
the posts have sometimes been difficult to interpret.

Mike Walter 
Hewitt Associates 
Any opinions expressed herein are mine alone and do not necessarily 
represent the opinions or policies of Hewitt Associates.



"Schuh, Richard" <[EMAIL PROTECTED]> 

Sent by: "The IBM z/VM Operating System" <[email protected]>
12/06/2007 01:38 PM
Please respond to: "The IBM z/VM Operating System" <[email protected]>
To: [email protected]
Subject: Re: FTP Append

I created a routine that checks the number of records in the files from 
the center in question once per minute and reports any that change to or 
from zero records. It has given surprising results. One time when there 
was a corruption, it reported 0 records immediately before and + records 
after. On other occasions, it has recorded similar changes without there 
being any corruption. Conversely, there have been 2 cases of corruption 
with no indications that the corrupted file was ever empty. I have since 
changed the routine to monitor the files from all centers in an effort to 
see if these state changes are normal. If I see them from the other 
centers, I will have to conclude that they, while strange, are normal.
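
The once-per-minute check described above boils down to comparing two successive sets of record counts and reporting files that moved to or from zero. The following is only a sketch of that comparison logic in Python, not the routine actually running on z/VM; the dictionary layout and the file names in the usage note are assumptions made for illustration.

```python
def zero_transitions(previous, current):
    """Return (name, old, new) for files whose record count moved to or from zero.

    previous and current map a file name to its record count at two
    successive polls. Files seen for the first time are skipped because
    there is nothing to compare them against.
    """
    changes = []
    for name, new in current.items():
        old = previous.get(name)
        if old is None:
            continue  # first sighting of this file
        if (old == 0) != (new == 0):
            changes.append((name, old, new))
    return changes
```

For example, with hypothetical names, `zero_transitions({"CTR1 LOG": 0, "CTR2 LOG": 5}, {"CTR1 LOG": 12, "CTR2 LOG": 0})` returns one entry for each file, since one went from empty to non-empty and the other the reverse.
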
Back in 2004, I posted an item about files disappearing from SFS when FTP 
was appending to them (
http://listserv.uark.edu/scripts/wa.exe?A2=ind0405&L=IBMVM&P=R29292&D=0&H=0&I=-3&O=T&T=0&m=49139
). There was only one response and the problem went away without ever 
having been correctly diagnosed and fixed. This problem seems to be very 
much the same as the 2004 post because we did note that the files that 
disappeared were first reported as being empty. This time, the problem, if 
it is related, is more persistent than before, happening once every few 
days instead of once every 3.5 years.
The question is what is causing this: is it something in SFS, or is it 
being done by TCPIP? How can I make the determination?
Regards,
Richard Schuh 
Original post in the current thread.
Date:         Wed, 28 Nov 2007 14:12:04 -0800
Reply-To:     The IBM z/VM Operating System <[log in to unmask]>
Sender:       The IBM z/VM Operating System <[log in to unmask]>
From:         "Schuh, Richard" <[log in to unmask]>
Subject:      FTP Append
We have been using FTP to append to daily files from our centers around 
the world for eight years now. The way that we have been doing it is that 
data is accumulated by a PC at each center. When a threshold is reached, 
the PC initiates an FTP session with our VM system and appends the data to 
a file whose name and type reflect the location of the originating system, 
the type of log file and the date of the collection. These files reside in 
the same SFS directory. 
Lately, the files from one of the centers intermittently get corrupted by 
overwriting the already written data. For example, data, which is 
timestamped, might be collected for three hours and after the next 
transmission, the start of the file will bear the timestamp of 03:00:01. 
Sometimes, it happens early; other times late (20:39:00 is one recent 
example). 
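
One way to tell a genuine append from the overwrite described above is to check whether an earlier copy of the file is still a prefix of a later copy. This is a hedged sketch only: it assumes both copies were retrieved the same way, so that any ASCII/EBCDIC translation or record-format conversion is identical for both, and the function name is hypothetical.

```python
def classify_change(before: bytes, after: bytes) -> str:
    """Classify how a file changed between two copies taken at different times."""
    if after == before:
        return "unchanged"
    if after.startswith(before):
        return "appended"      # earlier data is intact, new data follows it
    return "overwritten"       # earlier data was altered or lost
```

A result of "overwritten" for a file that should only ever grow would pin the corruption to the interval between the two copies.
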
The people who are in charge of this process have checked and rechecked to 
verify (1) that all centers are running the same level of software, (2) 
that there is nowhere that the PUT command is used in place of APPEND, and 
(3) that any non-zero return code from any command terminates the transmission 
and the error is logged on the PC. So far, no non-zero return code has 
been reported; no error log created.
Has anyone seen this sort of behavior? What might cause it? We have nearly 
20 log files being created on VM using this method and software. Why is 
only one file being victimized?
I have tried FTP to a test file that is locked in XEDIT by a user other 
than the owner of the directory. The result was a meaningful error message 
accompanied by a non-zero return code. Doing the same from the owning user 
gives the expected bad results. The updates of whichever user ends first 
get wiped out by the last to do the FINIS. It is only the update that gets 
wiped out, not the entire file. The latter test was just done for 
completeness of the experiment. In real life, (a) the owner is a service 
machine that runs disconnected and never manipulates these files until 
they are at least a day old, and (b) the only ones who can write into the 
directory are the owner, the PCs doing the FTPs, which act under the 
auspices of the only user explicitly authorized to write in the directory, 
and file pool administrators.
We are running z/VM 5.2.0 at service level 701 (CP, CMS22, and TCP/IP all 
at the same service level.)




 