Attached is a patch to discard null bytes in read. It preserves the functionality Greg demonstrated (not sure if this is desirable ...) wherein an empty delimiter (e.g. -d '') splits on the null byte.
With the patch, read functions this way:

    bash-4.2$ printf 'foo\0bar\n' | while read line; do echo "$line"; done
    foobar
    bash-4.2$ printf 'foo\0bar\0' | while read -d '' line; do echo "$line"; done
    foo
    bar

I find this behavior incongruent with what I expect from setting things like IFS to the empty string (e.g. delim is every character), but it seems like it is already in use. I have a patch that makes read terminate the input line after every character for -d '', and after a null byte for -d '\0'. If you are interested in that functionality, I'll send that patch for your consideration as well.

git am patch for read builtin
0001-Strip-null-bytes-from-read-when-DELIM-is-not.patch.gz
Description: GNU Zip compressed data
git am patch for man page and texi
0002-Update-documentation-both-man-and-info-to-reflect-re.patch.gz
Description: GNU Zip compressed data
I have patches for the generated documentation, but they are quite large. If you want them, I'm happy to send them along as well.

cheers,
-matt

On Nov 24, 2011, at 12:08 AM, Chet Ramey wrote:

> On 11/23/11 9:44 PM, Matthew Story wrote:
>>
>> On Nov 23, 2011, at 7:09 PM, Chet Ramey wrote:
>>
>>> On 11/23/11 6:54 PM, Matthew Story wrote:
>>>> On Nov 23, 2011, at 4:47 PM, Chet Ramey wrote:
>>>>
>>>>> On 11/23/11 9:03 AM, Matthew Story wrote:
>>>>>> [... snip]
>>>
>>> Yes, sorry.  That's what the "bash treats the line read as a C string"
>>> was intended to imply.  Since the line read is a C string, the NUL
>>> terminates it and what remains is assigned to the named variables.  I
>>> should have used `line' in my explanation instead of `foo'.
>>
>> I understand that the underlying implementation of the bash builtins is
>> `C', and I understand that `C' strings are NUL terminated.  It seems
>> unreasonable to me to expect understanding of this implementation detail
>> when using bash to read streams into variables via the `read' builtin.
>
> I took a look around at Posix and some other shells.  Posix passes on
> the issue completely: the input to read may not contain NUL bytes at all.
> The Bourne shell, from v7 to SVR4.2, uses NUL as a line terminator.  Other
> shells, including ksh93, ash and pdksh derivatives like dash and mksh,
> discard NUL bytes in read.  zsh doesn't discard NULs and handles them
> pretty well, putting them into a variable's value.
>
> The discard behavior seems fairly standard, and I will look at putting it
> into the next version of bash.
>
> Chet
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
> ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
>
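
For anyone who wants to check which of the behaviors listed above a given shell implements, a minimal probe might look like the sketch below. The function name probe_read_nul is made up for illustration. It feeds the same 'foo\0bar\n' input used earlier in this message to read and prints whatever ends up in the variable.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not from the patch):
# pipe a line containing an embedded NUL byte into `read` and print
# what the shell actually stored in the variable.
probe_read_nul() {
    printf 'foo\0bar\n' | {
        # IFS= and -r keep read from trimming whitespace or
        # interpreting backslashes, so only NUL handling is observed.
        IFS= read -r line
        printf '%s\n' "$line"
    }
}

probe_read_nul
```

Going by the thread, a shell that discards NUL bytes in read should print foobar, and one that treats the line as a C string or uses NUL as a line terminator should print foo. The result depends on the shell and version it is run under, so I would treat it as a quick check rather than a specification.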