On Mon, 2022-01-10 at 00:25 -0600, user wrote:
> The commandD1 regression test can be enabled without modifying sed's source
> code, it is no longer failing. The commandl1, commandl2, and commandc1 tests
> fail because the 'c' and 'D' commands assume that setting the correct length
> of the pattern space is sufficient and do not correct the pattern space.
>
> The pattern space is cleared in the 'c' command by setting ps to a null byte
> and a null byte is copied to the pattern space in the 'D' command to prevent
> stray bytes from being printed by other commands. Specifically, the
> substitute function in pattern.c:408 calls cspace(&SS, s++, 1, APPEND) if s
> doesn't point to a null byte. If the pattern space isn't cleared by the 'c'
> command this will be executed and cause the substitute space to be advanced
> by 1 byte and causes an incorrect output (commandc1 test). If the pattern
> space isn't cleared in the 'c' command lputs (called by the l command) will
> also print the pattern space without checking the pattern space length
> (commandl2 test), giving an incorrect output. In the 'D' command, the
> original memmove did not copy the null byte at the end of the bytes that were
> moved, causing a lputs call following the command to give an incorrect output
> (commandl1 test).
>
> Not sure if the diff is correct since it was generated using got diff.
Thanks for the detailed analysis. However, I think we should fix the
cases where SPACE.len is not honoured...
The lputs case is fairly straight forward and I'd like to get an OK
for that part.
The substitute case I haven't convinced myself just yet that it's
correct. But (all) regress passes and I want to put it out here in
case someone with more spare brain cycles than me wants to dive into it.
martijn@
Index: process.c
===================================================================
RCS file: /cvs/src/usr.bin/sed/process.c,v
retrieving revision 1.34
diff -u -p -r1.34 process.c
--- process.c 14 Nov 2018 10:59:33 -0000 1.34
+++ process.c 10 Jan 2022 14:20:56 -0000
@@ -60,7 +60,7 @@ static SPACE HS, PS, SS;
static inline int applies(struct s_command *);
static void flush_appends(void);
-static void lputs(char *);
+static void lputs(char *, size_t);
static inline int regexec_e(regex_t *, const char *, int, int, size_t,
size_t);
static void regsub(SPACE *, char *, char *);
@@ -158,7 +158,7 @@ redirect:
(void)fprintf(outfile, "%s", cp->t);
break;
case 'l':
- lputs(ps);
+ lputs(ps, psl);
break;
case 'n':
if (!nflag && !pd)
@@ -390,11 +390,11 @@ substitute(struct s_command *cp)
* and at the end of the line, terminate.
*/
if (match[0].rm_so == match[0].rm_eo) {
- if (*s == '\0' || *s == '\n')
+ if (slen == 0 || *s == '\0' || *s == '\n')
slen = -1;
else
slen--;
- if (*s != '\0') {
+ if (slen >= 0 && *s != '\0') {
cspace(&SS, s++, 1, APPEND);
le++;
}
@@ -478,34 +478,35 @@ flush_appends(void)
}
static void
-lputs(char *s)
+lputs(char *s, size_t l)
{
int count;
extern int termwidth;
const char *escapes;
+ size_t i;
char *p;
- for (count = 0; *s; ++s) {
+ for (i = 0, count = 0; i < l; i++) {
if (count >= termwidth) {
(void)fprintf(outfile, "\\\n");
count = 0;
}
- if (isascii((unsigned char)*s) && isprint((unsigned char)*s)
- && *s != '\\') {
- (void)fputc(*s, outfile);
+ if (isascii((unsigned char)s[i]) && isprint((unsigned char)s[i])
+ && s[i] != '\\') {
+ (void)fputc(s[i], outfile);
count++;
- } else if (*s == '\n') {
+ } else if (s[i] == '\n') {
(void)fputc('$', outfile);
(void)fputc('\n', outfile);
count = 0;
} else {
escapes = "\\\a\b\f\r\t\v";
(void)fputc('\\', outfile);
- if ((p = strchr(escapes, *s))) {
+ if ((p = strchr(escapes, s[i])) && s[i] != '\0') {
(void)fputc("\\abfrtv"[p - escapes], outfile);
count += 2;
} else {
- (void)fprintf(outfile, "%03o", *(u_char *)s);
+ (void)fprintf(outfile, "%03o", (u_char)s[i]);
count += 4;
}
}