I had some involvement in Steve Hayes' FTP REXX being made available some
time ago, so I feel somewhat obligated to pass this on.
I discovered a bug in FTP REXX recently that was resulting in stray "0"
characters being left at the end of every line of a data file that we
retrieve from a sendmail server on a daily basis (a mapping of Lotus
Notes-style addresses to Internet-style addresses). The extra 0 was getting
added to one of the possible variants of everyone's localpart and so clearly
needed to be removed. When I discovered it I did not dig into FTP REXX to
determine the cause. I simply threw a "strip trailing f0 1" on the output of
the invocation of the FTP stage (yep, lazy).
Anyway, as I am currently engaged in a project to thoroughly document
several key components of our Xagent email gateway, I happened across the
aforementioned lazy hack and couldn't recall why it was there, so I dredged
all this up again and this time resisted the temptation to be lazy about it.
I narrowed the problem down to this line in the DEBLOCKER subroutine in FTP
REXX:
'| strip trailing string "01" 1' ; /* Discard <EoR> */
The string at the end of every record at this point in the pipeline is
actually the two characters "01" (and NOT hex 01). The "1" at the end of that
strip trailing limits it to removing one character, so it removed just the
"1" from "01", leaving a stray "0".
I patched the strip trailing string "01" 1 to strip trailing string "01" 2
and all was well.
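
The effect of that trailing number can be mimicked with a toy Python model. It assumes, based only on what is described in this thread, that the number caps the count of characters removed rather than the count of copies of the string. The function name and the sample record are made up for illustration, and this is not a reimplementation of the STRIP stage.

```python
def strip_trailing(record: str, target: str, max_chars: int) -> str:
    """Toy model: remove trailing copies of target (assumed non-empty),
    but never more than max_chars characters in total."""
    end = len(record)
    # Walk back over whole trailing copies of the target string.
    while end >= len(target) and record[end - len(target):end] == target:
        end -= len(target)
    removable = len(record) - end
    # Assumption from the thread: the limit counts characters, not copies.
    return record[:len(record) - min(removable, max_chars)]


if __name__ == "__main__":
    rec = "user@x01"                       # hypothetical record carrying the "01" marker
    print(strip_trailing(rec, "01", 1))    # user@x0  (stray "0" left behind)
    print(strip_trailing(rec, "01", 2))    # user@x
```

With a limit of 1 only the final "1" goes, which matches the stray "0" seen in the retrieved file; a limit of 2 removes the whole marker.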
However, I also felt obligated to run it past Steve, who has retired from
his role as a plumber, but still works at IBM. I thought he'd congratulate
me on my brilliant fix, but instead he took a more general view (always a
good thing IMO) and concluded the entire mechanism (of using the 01 at the
end of the records) was flawed and that if it were done correctly, the bug
would never have occurred. He went further to propose a possible general
fix, but since his plumbing is very rusty and he has no way to test it, he
wasn't too confident it would work. Furthermore, since I got confused trying
to understand his explanation and wasn't confident that I understood it well
enough to adopt it rather than the quick and dirty fix described above, I
suggested posting the topic to this list so that the more-experienced
plumbers here could hash it out. Steve thought that was a good idea.
So, forthwith, my conversation with Steve on this matter. I'll leave it to
y'all to discuss and either validate Steve's thinking or else come up with a
better solution (or not, I can live with the quick and dirty fix if need be,
but of course if someone comes up with a more elegant, general solution that
actually works, I'll happily adopt it).
--
From: Bob Cronin
To: Steve Hayes
Date: 11/04/2011 21:28
Subject: Quick question
Can you think of any reason why a text file I retrieve using your wondrous
FTP stage (from an ASCII-platform box) would come back with every record
having a spurious zero (x'F0') at the end?
I'm documenting some old code and came across an invocation of FTP followed
by a Strip Trailing F0 1 and can't for the life of me remember the story
behind the story ...
Bob Cronin
--
From: Steve Hayes
To: Bob Cronin
Date: 12/04/2011 03:50 AM
Subject: Re: Quick question
Bob, no, sounds weird to me.
What happens if you do the transfer in binary | split at 0A | strip
trailing 0D 1 | xlate a2e
Very, very rusty at pipelines, I'm afraid.
Steve Hayes
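
Steve's suggested experiment (binary transfer, split at 0A, strip one trailing 0D, translate ASCII to EBCDIC) can be tried off-platform with a rough Python equivalent. This is only a sketch: the function name is hypothetical, and Python's cp037 codec is assumed as a stand-in for whatever table "xlate a2e" uses, which may differ for some characters.

```python
def deblock_binary(raw: bytes, ebcdic_codec: str = "cp037") -> list:
    """Split a binary-mode transfer into records and translate each to EBCDIC."""
    records = raw.split(b"\n")             # split at x'0A'
    if records and records[-1] == b"":
        records.pop()                      # choice made here: ignore the empty piece after a final LF
    out = []
    for rec in records:
        if rec.endswith(b"\r"):
            rec = rec[:-1]                 # strip at most one trailing x'0D'
        # cp037 is an assumption; the real a2e table may map some characters differently
        out.append(rec.decode("ascii").encode(ebcdic_codec))
    return out


if __name__ == "__main__":
    for rec in deblock_binary(b"a0\r\nb\n"):
        print(rec.hex())
```

If the records produced this way carry no trailing x'F0', the stray zero is coming from the record-structure handling rather than from the file itself.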
--
From: Bob Cronin
To: Steve Hayes
Date: 12/04/2011 15:16
Subject: Wee bug in FTP REXX
I traced FTP REXX and found the problem.
In the DEBLOCKER routine there is this code. In that last line (the strip
trailing), that "1" should be 2. What is in the data is actually the
string "01", not the binary value x'01'. If it had been the latter, the "1"
on the strip trailing would have been correct.
Shall I fix this on VMTOOLS?
when mode = 'S' & structure = 'R'
then pipe = pipe , /* Handle xFF escape seqs. */
'| strliteral x00' , /* dummy control byte */
'| deblock linend FF' , /* split before control bytes*/
'| I: if nlocate 1' , /* xFFFF sequence escaped */
'| insert xFF' , /* put FF byte back in hole */
'| I:' , /* back with real records */
'| joincont leading xFF keep' , /* join escaped xFFs together*/
'| change xFFFF xFF' , /* runs of FFs meant 2n-1 */
'| spec 1 c2b 1.2 right write 2-* 3', /* split across records */
'| drop 1' , /* discard original dummy */
'| J: if strfrlabel "1"' , /* from EoF byte */
'| take 1' , /* stop after it */
'| drop 1' , /* discard it */
'| J:' , /* file before EoF here */
'| spec 3-* 1 read 1-2 n' , /* bits with original record */
'| joincont trailing string "00"' , /* No operation -- join them */
'| strip trailing string "01" 1' ; /* Discard <EoR> */
Bob Cronin
--
From: Steve Hayes
To: Bob Cronin
Date: 13/04/2011 04:07 AM
Subject: Re: Wee bug in FTP REXX
Bob
I nearly said "I don't understand it any more, so go ahead", but I couldn't
bring myself to do it. It took me ages to work out what you meant even
though you've clearly pinpointed the bug, so that's how rusty I am. I
obviously took the "maxnum" to refer to copies of stripped string, not
number of characters. So yes, I'd be happy for you to make that change.
However, in ploughing through this, whilst missing the obvious, I think
there is a corner-case bug at the end of file: when the file ends in "01" and
does not have a separate <EOR> before the <EOF>, the (corrected) last
line will strip it (e.g. the file is encoded ASCII ending in X'3031FF11' or
X'3031FF10' but not X'3031FF01FF10'). This is hard to fix cleanly with the
code as written because it is quite general about what control
characters exist... it ignores bits 7-2 and handles bits 0 and 1 in any
combination, taking X'00' as a no-op, which is not defined in RFC 959. That
was to allow for a degree of future proofing for future uses of the other 6
bits, but it's unlikely that anyone is ever going to do much with record
structures, and RFC 959 does not specify what to do with non-compliant input
data anyway. The tricky bit is how to handle the joincont trailing string
"00" and delete 2 lines, not one, whilst losing the "00". That would require
a bit of clever specs 407 stuff that's beyond me these days. But, if we
assume that X'FF00' is treated as X'FF01' (and remember X'FF41' etc. are
treated as X'FF01') then we don't have to worry about that and we can assume
we always have an even number of lines at the end if the stream is properly
terminated. That means we don't have to use that strip trailing string
"01" 2, which is there to deal with the two possible ways to encode
<EOR><EOF>. What we would get, though, is a spurious null record at the end
where they are coded separately, because we handle x'FF10' the same as
x'FF11' (as <EoR><EoF>), so we have to zap that before we start.
So, that gives...
when mode = 'S' & structure = 'R'
then pipe = pipe , /* Handle xFF escape seqs. */
'| change xFF01FF10 xFF11' , /* Merge <EoR><EoF> sequence */
, /* to suppress record between*/
'| strliteral x00' , /* dummy control byte */
'| deblock linend FF' , /* split before control bytes*/
'| I: if nlocate 1' , /* handle xFFFF sequence */
'| insert xFF' , /* put FF byte back in hole */
'| I:' , /* back with normal records */
'| joincont leading xFF keep' , /* join xFFs back to data */
'| change xFFFF xFF' , /* fix 2n-1 runs of xFFs */
'| spec 1 c2b 1.2 right write 2-* 3', /* even data records; odd cc*/
'| drop 1' , /* drop dummy: swap even/odd */
'| J: if strfrlabel "1"' , /* from EoF byte */
'| take 1' , /* stop after it */
'| J:' , /* whole file up to EoF here */
'| spec 3-* 1 read' ; /* even data records; odd cc */
Note to self... write more and longer comments in next life!
I don't really have a way to test this though, and it's over a decade since
I've looked at it so I wouldn't trust me with my code if I were me! I'm
fine if you want to go with your fix to just the bug you found and not worry
about regression testing my fix to the corner case bug. But keep a note in
case you ever come across it: files ending in the text characters "01" when
handled in stream mode with record structure will lose those two bytes (I
think).
Got to go now, late for a call; apologies for lack of proofreading.
Steve Hayes
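
The corner case Steve describes can be reproduced with a simplified Python model of stream mode with record structure. RFC 959 gives the control byte after the x'FF' escape the value 1 for EOR, 2 for EOF and 3 for both, so the sketch uses those values; reading the X'FF10' and X'FF11' in the message above as the bit patterns 10 and 11 is an assumption. All function names are hypothetical. The first decoder only mimics the idea of appending the control bits as text and stripping "01" afterwards; it is not a translation of the DEBLOCKER pipeline. The second is one possible alternative that never inspects the data, and it is not Steve's proposed fix.

```python
ESC = 0xFF                 # escape byte (RFC 959, stream mode, record structure)
EOR, EOF = 0x01, 0x02      # low-order bits of the control byte


def tokens(raw: bytes):
    """Yield (data, control) pairs; control is None for data with no code after it."""
    data = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte != ESC:
            data.append(byte)
            i += 1
            continue
        if i + 1 == len(raw):
            break                          # lone escape at the end: ignored in this sketch
        nxt = raw[i + 1]
        i += 2
        if nxt == ESC:
            data.append(ESC)               # doubled escape is a literal x'FF' data byte
            continue
        cc = nxt & 0x03                    # only the two low-order bits are examined
        if cc == 0:
            continue                       # neither bit set: treated as a no-op here
        yield bytes(data), cc
        data.clear()
    if data:
        yield bytes(data), None


def decode_in_band(raw: bytes) -> list:
    """Marker-then-strip model: control bits become text, then a trailing "01" is removed."""
    out = []
    for data, cc in tokens(raw):
        if cc is None or cc & EOF:
            if data:
                out.append(data)           # EOF code is discarded, so no marker follows
            break
        out.append(data + format(cc, "02b").encode())
    return [rec[:-2] if rec.endswith(b"01") else rec for rec in out]


def decode_out_of_band(raw: bytes) -> list:
    """Alternative: decide record boundaries from the control bits alone."""
    out = []
    for data, cc in tokens(raw):
        if cc is None:
            if data:
                out.append(data)
            break
        if cc & EOR or data:
            out.append(data)
        if cc & EOF:
            break
    return out


if __name__ == "__main__":
    combined = b"line1\xff\x01ab01\xff\x03"        # last record ends in text "01", EOR+EOF together
    separate = b"line1\xff\x01ab01\xff\x01\xff\x02"  # same data, EOR and EOF coded separately
    print(decode_in_band(combined))      # [b'line1', b'ab']    (the "01" data is lost)
    print(decode_out_of_band(combined))  # [b'line1', b'ab01']
    print(decode_in_band(separate))      # [b'line1', b'ab01']
```

In this model the loss only happens when the final data is not followed by its own EOR marker, which is the situation Steve warns about.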