Re: [hackers] [PATCH][sbase] paste: Allow null delim
> This has nothing to do with getopt(). This is about how the shell parses > and splits words into arguments. -d"" Erf of course, put that on the fever. ^^
Re: [hackers] [PATCH][sbase] paste: Allow null delim
On Thu, Mar 5, 2020, 02:16 Quentin Rameau wrote: > > On Thu, 5 Mar 2020 01:42:37 -0800 > Michael Forney wrote: > > > >> and `-d""` is invalid, since paste(1) must follow the > > >> utility syntax guidelines (guideline 7). > > > > > > Not sure what you mean there, -d"" is the concatenation of -d and > > > '', which is standard. > > > Did you quote the correct guideline? “Guideline 7: Option-arguments > > > should not be optional.” here there's an option-argument, that's an > > > empty string. > > > > I guess it's a combination of 6 and 7. Option arguments must be > > separate parameters and non-optional (unless specified otherwise). > > There is an exception made that allows conforming implementations to > > accept an option adjacent with its option argument in the same > > argument, but I think this only makes sense if the option-argument is > > non-empty. > > > > POSIX says > > > > The construct '\0' is used to mean "no separator" because historical > > versions of paste did not follow the syntax guidelines, and the > > command: > > > > paste -d"" ... > > > > could not be handled properly by getopt(). > > > > I think this implies that `paste -d""` is no longer valid, due to the > > requirements of the syntax guidelines. > > Rather it's due to a getopt() limitation with the (not so) special case > of an empty string option-argument, but I'm not sure why. This has nothing to do with getopt(). This is about how the shell parses and splits words into arguments. -d"" is exactly the same as -d. It is the same word, adding an empty string to the end of a string does not change that original string. There is no argument to -d there. If you wanted to -d"" to pass two arguments "-d" and "" then you'd have to write a new shell to do that. As it stands shells pass on "-d" in argv when they come across -d"".
Re: [hackers] [PATCH][sbase] paste: Allow null delim
On Thu, 5 Mar 2020 01:42:37 -0800 Michael Forney wrote: > On 2020-03-05, Quentin Rameau wrote: > >> Looking at POSIX, I see that `-d '\0'` must be supported, `-d ""` > >> is unspecified > > > > I don't think so, -d "" is just a list with an empty string. > > So -d '\0' is equivalent to -d '', '\0' is here to let the user > > express an empty string in a list, which wouldn't be possible > > otherwise (like how would one specify empty string in a list like > > 'abcd'). > > Here is a direct quote from POSIX: > > The commands: > > paste -d "\0" ... > paste -d "" ... > > are not necessarily equivalent; the latter is not specified by > this volume of POSIX.1-2017 and may result in an error. My bad, I didn't read that far. But, > >> and `-d""` is invalid, since paste(1) must follow the > >> utility syntax guidelines (guideline 7). > > > > Not sure what you mean there, -d"" is the concatenation of -d and > > '', which is standard. > > Did you quote the correct guideline? “Guideline 7: Option-arguments > > should not be optional.” here there's an option-argument, that's an > > empty string. > > I guess it's a combination of 6 and 7. Option arguments must be > separate parameters and non-optional (unless specified otherwise). > There is an exception made that allows conforming implementations to > accept an option adjacent with its option argument in the same > argument, but I think this only makes sense if the option-argument is > non-empty. > > POSIX says > > The construct '\0' is used to mean "no separator" because historical > versions of paste did not follow the syntax guidelines, and the > command: > > paste -d"" ... > > could not be handled properly by getopt(). > > I think this implies that `paste -d""` is no longer valid, due to the > requirements of the syntax guidelines. Rather it's due to a getopt() limitation with the (not so) special case of an empty string option-argument, but I'm not sure why. The utility guidelines permits concatenating options and option-arguments, while it's true that the prefered syntax is to separate them. 12.1, 2.a: “If the SYNOPSIS of a standard utility shows an option with a mandatory option-argument (as with [ -c option_argument] in the example), a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening characters.” Without commenting on the patch itself, I think it's fine to allow '\0' and '', especially as we don't use getopt(), partly due to it being broken in most implementations (and having such limitations?), and the standards says it's unspecified, not undefinied.
Re: [hackers] [PATCH][sbase] paste: Allow null delim
On 2020-03-05, Quentin Rameau wrote: >> Looking at POSIX, I see that `-d '\0'` must be supported, `-d ""` is >> unspecified > > I don't think so, -d "" is just a list with an empty string. > So -d '\0' is equivalent to -d '', '\0' is here to let the user express > an empty string in a list, which wouldn't be possible otherwise (like > how would one specify empty string in a list like 'abcd'). Here is a direct quote from POSIX: The commands: paste -d "\0" ... paste -d "" ... are not necessarily equivalent; the latter is not specified by this volume of POSIX.1-2017 and may result in an error. >> and `-d""` is invalid, since paste(1) must follow the >> utility syntax guidelines (guideline 7). > > Not sure what you mean there, -d"" is the concatenation of -d and '', > which is standard. > Did you quote the correct guideline? “Guideline 7: Option-arguments > should not be optional.” here there's an option-argument, that's an > empty string. I guess it's a combination of 6 and 7. Option arguments must be separate parameters and non-optional (unless specified otherwise). There is an exception made that allows conforming implementations to accept an option adjacent with its option argument in the same argument, but I think this only makes sense if the option-argument is non-empty. POSIX says The construct '\0' is used to mean "no separator" because historical versions of paste did not follow the syntax guidelines, and the command: paste -d"" ... could not be handled properly by getopt(). I think this implies that `paste -d""` is no longer valid, due to the requirements of the syntax guidelines.
Re: [hackers] [PATCH][sbase] paste: Allow null delim
Hello all, > Looking at POSIX, I see that `-d '\0'` must be supported, `-d ""` is > unspecified I don't think so, -d "" is just a list with an empty string. So -d '\0' is equivalent to -d '', '\0' is here to let the user express an empty string in a list, which wouldn't be possible otherwise (like how would one specify empty string in a list like 'abcd'). > and `-d""` is invalid, since paste(1) must follow the > utility syntax guidelines (guideline 7). Not sure what you mean there, -d"" is the concatenation of -d and '', which is standard. Did you quote the correct guideline? “Guideline 7: Option-arguments should not be optional.” here there's an option-argument, that's an empty string.
Re: [hackers] [PATCH][sbase] paste: Allow null delim
Hi Richard, Thanks for the patch. Can you clarify if this change fixes `-d "\0"`, `-d ""` or `-d""`? Looking at POSIX, I see that `-d '\0'` must be supported, `-d ""` is unspecified, and `-d""` is invalid, since paste(1) must follow the utility syntax guidelines (guideline 7). I recently investigated a similar issue to this in tr(1). I think a proper solution would be to add length parameters to utflen and utftorunestr so that they can handle 0-bytes in the strings. On 2020-03-02, Richard Ipsum wrote: > --- > paste.c | 14 ++ > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/paste.c b/paste.c > index b0ac761..0159fe0 100644 > --- a/paste.c > +++ b/paste.c > @@ -53,7 +53,8 @@ nextline: > > for (; efgetrune(, dsc[i].fp, dsc[i].name) ;) { > for (m = last + 1; m < i; m++) > - efputrune(&(delim[m % delimlen]), stdout, > ""); > + if (delim[m % delimlen] != '\0') > + efputrune(&(delim[m % delimlen]), > stdout, ""); > last = i; > if (c == '\n') { > if (i != fdescrlen - 1) > @@ -68,7 +69,8 @@ nextline: > if (i == fdescrlen - 1) > putchar('\n'); > else > - efputrune(, stdout, ""); > + if (d != '\0') > + efputrune(, stdout, ""); > last++; > } > } > @@ -96,7 +98,7 @@ main(int argc, char *argv[]) > seq = 1; > break; > case 'd': > - adelim = EARGF(usage()); > + adelim = ARGF(); I think allowing missing option-argument here breaks POSIX compatibility. > unescape(adelim); > break; > default: > @@ -109,8 +111,12 @@ main(int argc, char *argv[]) > /* populate delimiters */ > /* TODO: fix libutf to accept sizes */ > delim = ereallocarray(NULL, utflen(adelim) + 1, sizeof(*delim)); > - if (!(delimlen = utftorunestr(adelim, delim))) > + if (*adelim == '\0') { > + delimlen = 1; > + *delim = '\0'; > + } else if (!(delimlen = utftorunestr(adelim, delim))) { > usage(); > + } > > /* populate file list */ > dsc = ereallocarray(NULL, argc, sizeof(*dsc)); > -- > 2.25.1
[hackers] [PATCH][sbase] paste: Allow null delim
--- paste.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/paste.c b/paste.c index b0ac761..0159fe0 100644 --- a/paste.c +++ b/paste.c @@ -53,7 +53,8 @@ nextline: for (; efgetrune(, dsc[i].fp, dsc[i].name) ;) { for (m = last + 1; m < i; m++) - efputrune(&(delim[m % delimlen]), stdout, ""); + if (delim[m % delimlen] != '\0') + efputrune(&(delim[m % delimlen]), stdout, ""); last = i; if (c == '\n') { if (i != fdescrlen - 1) @@ -68,7 +69,8 @@ nextline: if (i == fdescrlen - 1) putchar('\n'); else - efputrune(, stdout, ""); + if (d != '\0') + efputrune(, stdout, ""); last++; } } @@ -96,7 +98,7 @@ main(int argc, char *argv[]) seq = 1; break; case 'd': - adelim = EARGF(usage()); + adelim = ARGF(); unescape(adelim); break; default: @@ -109,8 +111,12 @@ main(int argc, char *argv[]) /* populate delimiters */ /* TODO: fix libutf to accept sizes */ delim = ereallocarray(NULL, utflen(adelim) + 1, sizeof(*delim)); - if (!(delimlen = utftorunestr(adelim, delim))) + if (*adelim == '\0') { + delimlen = 1; + *delim = '\0'; + } else if (!(delimlen = utftorunestr(adelim, delim))) { usage(); + } /* populate file list */ dsc = ereallocarray(NULL, argc, sizeof(*dsc)); -- 2.25.1