Kastus Shchuka writes:
> On Sun, Oct 16, 2022 at 11:48:35AM +0100, cho...@jtan.com wrote:
> > So given $X:
> >
> > $ X=' A : B::D'
> >
> > Parameter substitution:
> >
> > $ ( IFS=' :'; dump $X )
> > $VAR1 = 'A';
> > $VAR2 = 'B';
> > $VAR3 = '';
> > $VAR4 = 'D';
> >
> > read substitution:
> >
> > $ echo "$X" | ( IFS=' :'; read a1 a2 a3 a4; dump "$a1" "$a2" "$a3"
> > "$a4" )
> > $VAR1 = 'A';
> > $VAR2 = '';
> > $VAR3 = 'B';
> > $VAR4 = ':D';
> >
> > It does look like read, which uses its own expansion routine, has
> > a bug: a2/VAR2 should be 'B' (or 'B::D') not ''.
>
> Not sure if it is a bug or a feature.
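For anyone who wants to reproduce the comparison quoted above without the
dump helper, here is a minimal sh sketch. The function names
split_by_expansion and split_by_read are made up for illustration, and
printf stands in for dump. It only assumes a POSIX-ish shell.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ksh): print each field in brackets.

# Split $1 the way an unquoted parameter expansion does.
split_by_expansion() {
	(
		IFS=' :'
		set -f          # keep the unquoted expansion from globbing
		set -- $1       # field splitting happens here
		for f in "$@"; do
			printf '[%s]\n' "$f"
		done
	)
}

# Split $1 the way read does, into four variables.
split_by_read() {
	printf '%s\n' "$1" | (
		IFS=' :'
		read a1 a2 a3 a4
		printf '[%s]\n' "$a1" "$a2" "$a3" "$a4"
	)
}

X=' A : B::D'
echo 'expansion:'; split_by_expansion "$X"
echo 'read:';      split_by_read "$X"
```

The expansion half should print A, B, an empty field and D on any
POSIX-conforming shell. What the read half prints depends on the shell; on
an unpatched ksh it shows the mismatch quoted above.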
I can't think of a good reason why the output from these commands
should be different. The manpage section describing read states clearly
that it should be using the same splitting algorithm: "separates the
line into fields using the IFS parameter (see Substitution above)".

The diff brings them into line with each other and I think accounts
for all the edge cases.

Matthew

Index: c_sh.c
===================================================================
RCS file: /src/datum/openbsd/cvs/src/bin/ksh/c_sh.c,v
retrieving revision 1.64
diff -u -p -r1.64 c_sh.c
--- c_sh.c	22 May 2020 07:50:07 -0000	1.64
+++ c_sh.c	17 Oct 2022 09:59:21 -0000
@@ -253,6 +253,7 @@ c_read(char **wp)
 	int expand = 1, savehist = 0;
 	int expanding;
 	int ecode = 0;
+	int hardws = 0;
 	char *cp;
 	int fd = 0;
 	struct shf *shf;
@@ -376,9 +377,21 @@ c_read(char **wp)
 				break;
 			if (ctype(c, C_IFS)) {
 				if (Xlength(cs, cp) == 0 && ctype(c, C_IFSWS))
-					continue;
+					continue; /* Trim leading space. */
+				if (!ctype(c, C_IFSWS)) {
+					/* Do not finish this variable
+					 * on non IFS whitespace if the
+					 * previous variable has
+					 * trailing IFS whitespace.
+					 */
+					if (hardws) {
+						hardws = false;
+						continue;
+					}
+				} else
+					hardws = true;
 				if (wp[1])
-					break;
+					break; /* Finish scanning this variable. */
 			}
 			Xput(cs, cp, c);
 		}