[ast-developers] [PATCH] Two (2) here document bugs - end of buffer processing

John Wolfe Tue, 02 Nov 2010 15:18:24 -0700

I forgot to complete the registration process before submitting this. I sincerely apologize if this is a duplicate submission.

----------------

While building and testing the ksh93 2010-06-21 release for UnixWare 7.1.4, I have isolated 2 problems in the here_copy() function of sh/lex.c, both related to end-of-buffer processing.

The ksh93 regression tests themselves were fine, but as a further test all versions of GNU autoconf from version 2.59 through 2.68 were built and tested using the new ksh93t. The failures in the autoconf regression tests, while in different tests in different versions of autoconf, were all for the same reason: the here document delimiter for a nested here document was getting corrupted when written to the second file. Specifically, the required <newline> at the end of the delimiter was not written.

The autoconf "configure" script creates another script (config.status) containing an awk program as a here document. It was noticed that adding or removing one character from the configure script resulted in a validly terminated awk program here document in the generated config.status script.

The attached here_doc_bug.tgz file contains a much shortened version of a configure script that contains a nested here document - cf.sh.ORIG. The other script, rep.test.sh, will run the cf.sh script, adding 1 innocuous whitespace character on each iteration, and checking for a valid here document delimiter (_ACAWK) in the generated config.status file. Set MY_SHELL at the top of the script to the ksh binary to be tested.


- It will locate nine failures - the delimiter is corrupted

      6 - ending <newline> is missing and the following line in the file
          has been concatenated onto the delimiter.
      3 - beginning characters of the delimiter are corrupted

- The same 9 failures will recur at 8K file size increments

- The exact same failure points occur with the Solaris x86 ksh93t released by
  AT&T Research in March of 2010.

    ksh.2010-03-09.sol10.i386

Problem #1: lex.c:here_copy() - If an escape character is encountered as the last character of a buffer, the last character of the completed here document may be lost. The buffer's character count minus 1 characters are written to the here document Sfio_t and iop->iosize is incremented accordingly. If the escape character is subsequently written to the here document with "sfputc(sp,'\\');", iop->iosize is NOT incremented for that 1 character. So while the here document contains all the desired characters, the last character is never written to file since iop->iosize is 1 short.

This problem is reflected by the 6 failures: 9090, 9106, 9112, 9130, 9150 and 9157.
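
The size mismatch behind Problem #1 can be reproduced in a small standalone model. The sketch below is not ksh93 code: struct heredoc, hd_write(), hd_putc(), hd_emit() and run_case() are hypothetical names that only imitate the roles of the here document Sfio_t, sfwrite(), sfputc() and iop->iosize as described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the here document stream and its tracked size
 * (plays the role of sp and iop->iosize; not the real Sfio/ksh types). */
struct heredoc {
    char   data[64];  /* bytes actually placed in the stream */
    size_t len;       /* number of bytes held in data */
    size_t iosize;    /* number of bytes the shell believes it wrote */
};

/* Bulk write: models sfwrite() followed by iop->iosize += c. */
static void hd_write(struct heredoc *hd, const char *buf, size_t n)
{
    memcpy(hd->data + hd->len, buf, n);
    hd->len += n;
    hd->iosize += n;
}

/* Single-character write: models sfputc(sp,'\\').
 * count_it == 0 models the original code, 1 models the patched code. */
static void hd_putc(struct heredoc *hd, char c, int count_it)
{
    hd->data[hd->len++] = c;
    if (count_it)
        hd->iosize++;
}

/* Copy the document out, trusting iosize; returns the bytes emitted. */
static size_t hd_emit(const struct heredoc *hd, char *out)
{
    memcpy(out, hd->data, hd->iosize);
    out[hd->iosize] = '\0';
    return hd->iosize;
}

/* Hypothetical driver: a backslash that ended a buffer is written separately. */
static size_t run_case(int patched, char *out)
{
    struct heredoc hd = { {0}, 0, 0 };

    hd_write(&hd, "a", 1);            /* characters before the escape */
    hd_putc(&hd, '\\', patched);      /* escape found at the end of the buffer */
    hd_write(&hd, "b\n_ACAWK\n", 9);  /* remainder of the here document */
    return hd_emit(&hd, out);
}
```

With patched set to 0, run_case() emits only 10 of the 11 stored bytes, so the <newline> after _ACAWK is dropped; with patched set to 1, all 11 bytes are emitted.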

Problem #2: lex.c:here_copy() - If any line begins with the same character(s) as the delimiter of the here document being processed and those character(s) fall at or across a buffer boundary, the buffer up to the beginning of the potential delimiter is written to the here document. If this subsequently turns out not to be the delimiter being processed, "nsave" characters need to be written from the END of the current buffer. The current code is actually writing "nsave" characters from the beginning of the buffer.

          This problem is reflected by the 3 failures:

                              8365:      "  fAWK" in config.status
                              8366:      "  CAWK" in config.status
                              8367:      " ACAWK" in config.status

where the "  f", "  " and " " are written from the beginning of the current buffer.
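
The effect of writing the saved characters from the wrong end of the buffer can be shown with a simplified model. This is only a sketch under stated assumptions: flush_candidate() and its parameters are hypothetical, the real here_copy() refills the buffer through lexfill() between the two writes, and the sample buffer contents are invented rather than taken from cf.sh.

```c
#include <stddef.h>
#include <string.h>

/* Flush the text that precedes a possible delimiter, then, once the
 * candidate turns out not to be the delimiter, write the nsave saved
 * characters starting at bufp.
 * patched == 0 leaves bufp at the start of the buffer (the behaviour
 * described for the original code); patched == 1 advances bufp to cp
 * after the first write, as the proposed fix does. */
static size_t flush_candidate(const char *buf, size_t cand_off, size_t nsave,
                              int patched, char *out)
{
    const char *bufp = buf;            /* start of data not yet written */
    const char *cp = buf + cand_off;   /* start of the possible delimiter */
    size_t n = 0;

    memcpy(out + n, bufp, (size_t)(cp - bufp));  /* text before the candidate */
    n += (size_t)(cp - bufp);
    if (patched)
        bufp = cp;

    memcpy(out + n, bufp, nsave);      /* candidate was not the delimiter */
    n += nsave;
    out[n] = '\0';
    return n;
}
```

For the invented buffer "  f=1\n_AC" with cand_off 6 and nsave 3, the unpatched path produces "  f=1\n  f" while the patched path produces "  f=1\n_AC". Only the shape of the corruption ("_AC" replaced by the first three characters of the buffer) follows failure 8365.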

These same problems have existed as far back as ksh93e, although at a different cf.sh file size and recurring at 4096-byte file size increments.

A context diff patch containing the fixes for each problem is also attached.

John Wolfe The SCO Group, Inc.

ksh93/sh/lex.c:  here_copy()    - 1st patch segment below

  When processing an "escape" at the end of a buffer, increment iop->iosize
  accordingly for any character added separately - such as a specific '\\'
  character.

ksh93/sh/lex.c:  here_copy()    - 2nd patch segment below

  When processing a potential delimiter for the current here document and
  an end of a buffer is encountered, update "bufp" to be the beginning of the
  possible delimiter, i.e. "cp" - after doing the write.  If this is the
  current delimiter across the buffer boundary, then there are zero (cp - bufp)
  characters to be written.  If not the target delimiter, "nsave" characters
  from the now updated "bufp" (pointing to the remaining characters) are
  written.

*** src/cmd/ksh93/sh/lex.c.orig Fri May 14 17:23:08 2010
--- src/cmd/ksh93/sh/lex.c      Tue Nov  2 11:39:24 2010
***************
*** 1768,1774 ****
--- 1768,1777 ----
                                if(c==NL)
                                        fcseek(1);
                                else
+                               {
                                        sfputc(sp,'\\');
+                                       iop->iosize++;
+                               }
                        }
                        bufp = fcseek(-1);
                }
***************
*** 1819,1826 ****
                                {
                                        if(!lp->lexd.dolparen && (c=cp-bufp))
                                        {
!                                               if((c=sfwrite(sp,cp=bufp,c))>0)
                                                        iop->iosize+=c;
                                        }
                                        nsave = n;
                                        if((c=lexfill(lp))<=0)
--- 1822,1832 ----
                                {
                                        if(!lp->lexd.dolparen && (c=cp-bufp))
                                        {
!                                               if((c=sfwrite(sp,bufp,c))>0)
!                                               {
!                                                       bufp = cp;
                                                        iop->iosize+=c;
+                                               }
                                        }
                                        nsave = n;
                                        if((c=lexfill(lp))<=0)

here_doc_bug.tgz
Description: Binary data

_______________________________________________
ast-developers mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-developers
