While building and testing the ksh93 2010-06-21 release for UnixWare 7.1.4, I have isolated 2 problems in the here_copy() function of sh/lex.c; both are related to end-of-buffer processing.

The ksh93 regression tests themselves were fine, but as a further test all versions of GNU autoconf from version 2.59 through 2.68 were built and tested using the new ksh93t. The failures in the autoconf regression tests, while occurring in different tests in different versions of autoconf, were all for the same reason: the here document delimiter for a nested here document was getting corrupted when written to the second file; specifically, the required <newline> at the end of the delimiter was not written.

The autoconf "configure" script creates another script (config.status) containing an awk program as a here document. It was noticed that adding or removing one character from the configure script resulted in a validly terminated awk program here document in the generated config.status script.

The attached here_doc_bug.tgz file contains a much shortened version of a configure script that contains a nested here document - cf.sh.ORIG. The other script, rep.test.sh, will run the cf.sh script, adding 1 innocuous white space character on each iteration, and checking for a valid here document delimiter (_ACAWK) in the generated config.status file. Set MY_SHELL at the top of the script to the ksh binary to be tested.
   - It will locate nine failures where the delimiter is corrupted:
       6 - ending <newline> is missing and the following line in the file
           has been concatenated onto the delimiter.
       3 - beginning characters of the delimiter are corrupted
   - The same 9 failures will occur on 8K file size increments
   - The exact same failure points occur with the Solaris x86 ksh93t
     released by AT&T Research in March of 2010:
       ksh.2010-03-09.sol10.i386

Problem #1: lex.c:here_copy() - If an escape character is encountered as
    the last character of a buffer, the last character of the completed
    here document may be lost. Character count - 1 characters are written
    to the here document Sfio_t and iop->iosize is incremented
    accordingly. If the escape character is subsequently written to the
    here document with "sfputc(sp,'\\');", iop->iosize is NOT incremented
    for that 1 character. So while the here document contains all the
    desired characters, the last character is never written to the file
    since iop->iosize is 1 short.

    This problem is reflected by the 6 failures: 9090, 9106, 9112, 9130,
    9150 and 9157.

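The counter drift described in Problem #1 can be modelled in isolation. The following is only a minimal sketch in plain C, not the ksh93 code: struct heredoc, hd_write(), hd_putc() and lost_bytes() are hypothetical names, and a fixed array stands in for the Sfio_t stream. It assumes that only the first iosize bytes are later copied out of the stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the here-document stream and its size counter. */
struct heredoc {
    char   data[64];  /* bytes actually stored in the stream */
    size_t len;       /* number of bytes stored */
    size_t iosize;    /* number of bytes that will later be copied out */
};

/* Bulk write: stores the bytes and counts them, like the sfwrite() path. */
static void hd_write(struct heredoc *hd, const char *s, size_t n)
{
    assert(hd->len + n <= sizeof hd->data);
    memcpy(hd->data + hd->len, s, n);
    hd->len += n;
    hd->iosize += n;
}

/* Single-character write; count=0 models the unpatched sfputc() path,
 * count=1 models the patched path that also bumps the counter. */
static void hd_putc(struct heredoc *hd, char c, int count)
{
    assert(hd->len < sizeof hd->data);
    hd->data[hd->len++] = c;
    if (count)
        hd->iosize++;
}

/* Stores "a", then a backslash that ended a buffer, then "b\n".
 * Returns how many stored bytes the counter fails to cover. */
static size_t lost_bytes(int patched)
{
    struct heredoc hd = { {0}, 0, 0 };
    hd_write(&hd, "a", 1);        /* text before the escape character */
    hd_putc(&hd, '\\', patched);  /* escape written separately */
    hd_write(&hd, "b\n", 2);      /* text from the next buffer */
    return hd.len - hd.iosize;
}
```

In this model lost_bytes(0) reports one uncovered byte, which corresponds to the missing trailing <newline>, while lost_bytes(1) reports none.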
Problem #2: lex.c:here_copy() - If any line begins with the same character(s) as the delimiter of the here document being processed and those character(s) fall at or across a buffer boundary, the buffer contents up to the beginning of the potential delimiter are written to the here document. If the candidate subsequently turns out not to be the current delimiter, "nsave" characters need to be written from the END of the current buffer. The current code is actually writing "nsave" characters from the beginning of the buffer.

This problem is reflected by the 3 failures:
8365: " fAWK" in config.status
8366: " CAWK" in config.status
8367: " ACAWK" in config.status
where the " f", " " and " " are written from
the beginning of the current buffer.
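The pointer mix-up in Problem #2 can also be shown with a small model. This is a simplified sketch under stated assumptions, not the actual here_copy() logic: flush_then_restore() and its parameters are hypothetical, the buffer refill is omitted, and the candidate is assumed to have already been rejected as the delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * buf     : current input buffer
 * cand    : offset where the possible delimiter starts (the role of "cp")
 * nsave   : candidate characters seen before the buffer ended
 * patched : nonzero to move the base pointer past the flushed prefix
 * out     : receives what would end up in the here document
 */
static size_t flush_then_restore(const char *buf, size_t cand, size_t nsave,
                                 int patched, char *out)
{
    const char *bufp = buf;
    const char *cp = buf + cand;
    size_t n = 0;

    assert(cand + nsave <= strlen(buf));

    /* Flush everything that precedes the candidate delimiter. */
    memcpy(out + n, bufp, (size_t)(cp - bufp));
    n += (size_t)(cp - bufp);
    if (patched)
        bufp = cp;  /* base pointer now marks the start of the candidate */

    /* The candidate was not the delimiter: emit the saved characters. */
    memcpy(out + n, bufp, nsave);
    n += nsave;
    out[n] = '\0';
    return n;
}
```

With the buffer "x=1\n_AC", cand 4 and nsave 3, the unpatched variant produces "x=1\nx=1" (the saved characters are taken from the start of the buffer), while the patched variant produces "x=1\n_AC".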
These same problems have existed as far back as ksh93e, although at a different cf.sh file size and recurring on 4096-byte file size increments.
A context diff patch containing the fixes for each problem is also
attached.
John Wolfe
The SCO Group, Inc.

ksh93/sh/lex.c: here_copy() - 1st patch segment below
When processing an "escape" at the end of a buffer, increment iop->iosize
accordingly for any character added separately - such as a specific '\\'
character.
ksh93/sh/lex.c: here_copy() - 2nd patch segment below
When processing a potential delimiter for the current here document and
an end of a buffer is encountered, update "bufp" to be the beginning of the
possible delimiter, i.e. "cp" - after doing the write. If this is the
current delimiter across the buffer boundary, then there are zero (cp - bufp)
characters to be written. If not the target delimiter, "nsave" characters
from the now updated "bufp" (pointing to the remaining characters) are
written.
*** src/cmd/ksh93/sh/lex.c.orig Fri May 14 17:23:08 2010
--- src/cmd/ksh93/sh/lex.c Tue Nov 2 11:39:24 2010
***************
*** 1768,1774 ****
--- 1768,1777 ----
          if(c==NL)
              fcseek(1);
          else
+         {
              sfputc(sp,'\\');
+             iop->iosize++;
+         }
      }
      bufp = fcseek(-1);
  }
***************
*** 1819,1826 ****
  {
      if(!lp->lexd.dolparen && (c=cp-bufp))
      {
!         if((c=sfwrite(sp,cp=bufp,c))>0)
              iop->iosize+=c;
      }
      nsave = n;
      if((c=lexfill(lp))<=0)
--- 1822,1832 ----
  {
      if(!lp->lexd.dolparen && (c=cp-bufp))
      {
!         if((c=sfwrite(sp,bufp,c))>0)
!         {
!             bufp = cp;
              iop->iosize+=c;
+         }
      }
      nsave = n;
      if((c=lexfill(lp))<=0)
