Hello,

I've noticed a bug with whitespace indentation in sed.

Summary: For the `text` argument of the a, i and c commands, leading
whitespace that is meant to appear in the output should be escaped;
unescaped leading whitespace should be ignored.  sed(1) does not do
this: it includes the leading whitespace of `text` in the output even
when it is not escaped.

Test behavior with:
```
1a\
    foo\
\   bar\
    \    baz
```

Actual output:
```
    foo
    bar
        baz
```

Expected output:
```
foo
    bar
    baz
```

Details: In sed command files, whitespace may be inserted before a sed
command.  This is useful for indenting sed code, e.g. commands in a {
function list }.  A question arises for the a, i and c commands, which
take a `text` argument: should the leading whitespace of `text` be
ignored (as mere style), or be part of `text`?

This dilemma was solved in the 1979 article mentioned in the sed(1)
manpage:
> Note: Within the text put in the output by these functions,
> leading blanks and tabs will disappear, as always in sed commands. To
> get leading blanks and tabs into the output, precede the first
> desired blank or tab by a backslash; the backslash will not appear
> in the output.

McMahon's note describes an implementation that makes it possible to
indent the output (which is necessary), while still allowing a
complicated sed file to be indented for readability and style.
In the current implementation, the "leading blanks and tabs" do
NOT disappear, which takes away the ability to indent the sed command
file as desired.
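
To make the rule concrete, here is a minimal Python sketch of how I
read McMahon's note.  It is not taken from any sed source; the function
name `mcmahon_text_line` is made up for illustration, and it assumes
the trailing continuation backslash of each `text` line has already
been removed.

```python
def mcmahon_text_line(line: str) -> str:
    """Hypothetical helper: apply the 1979 rule to one line of `text`."""
    # Leading blanks and tabs disappear, as before any sed command.
    line = line.lstrip(" \t")
    out = []
    i = 0
    while i < len(line):
        if line[i] == "\\" and i + 1 < len(line):
            # Drop the backslash and keep the next character literally,
            # which is how an escaped leading blank or tab survives.
            out.append(line[i + 1])
            i += 2
        else:
            out.append(line[i])
            i += 1
    return "".join(out)


# Tab-indented sample lines (my own, not the test script above).
for raw in ["\tfoo", "\\\tbar", "\t\\\tbaz"]:
    print(repr(mcmahon_text_line(raw)))
```

Under this reading, indentation before the backslash is style only,
and whatever follows the backslash is output verbatim.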

What should be done?

Specification IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008)
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html
> The argument text shall consist of one or more lines. Each embedded
> <newline> in the text shall be preceded by a <backslash>. Other
> <backslash> characters in text shall be removed, and the following
> character shall be treated literally.

...mentions backslashes as though the user would put them in `text`
out of boredom.  It doesn't make sense to be required to escape a
backslash once the ability to indent your code is gone - indentation
is why the complication arose in the first place.
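
For comparison, a minimal Python sketch of the quoted POSIX paragraph
taken on its own.  The function name `posix_text_line` is hypothetical,
and the sketch deliberately does nothing about leading whitespace,
because the quoted paragraph does not mention it; that is my reading of
the quote only, not a statement about the rest of the standard or about
any particular sed.

```python
def posix_text_line(line: str) -> str:
    """Hypothetical helper: backslash removal only, per the quoted text."""
    out = []
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # Remove the backslash; the following character is literal.
            # A lone trailing backslash is simply dropped here.
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)


# Unescaped leading whitespace survives under this reading.
print(repr(posix_text_line("    foo")))
print(repr(posix_text_line("\\   bar")))
```

With only this rule, escaping a leading blank changes nothing in the
output, which is why the backslash looks pointless in that paragraph.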

Maybe nothing should be done; maybe the few sed files in the OpenBSD
source should have backslashes added to their a, i and c functions.

Luka
