Hello,

> On Jul 26, 2016, at 23:09, Christoph Anton Mitterer <[email protected]> wrote:
>
> I've always had the impression that ^ and $ were the end/begin anchor
> of the current pattern, and since e.g. grep/sed work normally in terms
> of lines the start/end of lines.
[...]
> What I found a bit strange is that e.g.:
>   printf '' | sed 's/^/foo/'
>
> doesn't produce foo and that e.g.
>   printf '' | grep '^'
> don't match.
>
> Why? Or better said, which part of POSIX mandates this? Or is it simply
> "no stdin, nothing happens"?

Exactly!

The command "printf ''" produces no output, so it is equivalent to
redirecting from /dev/null: sed sees end-of-file on its very first read
and never runs its script, because the script is executed once per input
line and there are no lines.

printf with *any* output (with newlines or not) will cause 'sed' and 'grep'
to read some characters, and then to try to execute commands or match
patterns on the input.

This can be demonstrated using 'strace' on GNU/Linux machines.
The commands below run 'sed' with both printf and /dev/null,
and 'strace' will report the 'read' system call.

The first 'read(3,...)' can be ignored; it is the dynamic loader reading a
shared library.
The second 'read(0,...)' is the interesting one:
the first argument, "0", indicates reading from STDIN.
sed tries to read up to 4096 bytes from STDIN, and the returned value
(following the equal sign) is zero.
A return value of zero indicates end-of-file, meaning there is no input at
all, and sed will not try to execute any commands.

Notice that printf with an empty string and /dev/null result in the same
behavior:

    $ strace -e read sed 's/^/foo/' < /dev/null
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
    read(0, "", 4096)                       = 0
    +++ exited with 0 +++

    $ printf '' | strace -e read sed 's/^/foo/'
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
    read(0, "", 4096)                       = 0
    +++ exited with 0 +++

However, if even one character is provided on STDIN,
the read() call will return it, and sed will try to apply the
pattern/command to the input:

    $ printf 'a' | strace -e read sed 's/^/foo/'
    read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
    read(0, "a", 4096)                      = 1
    read(0, "", 4096)                       = 0
    read(0, "", 4096)                       = 0
    fooa+++ exited with 0 +++

One possible source of confusion is 'echo' vs 'printf':
echo by default automatically adds a newline.
Thus, the command:

    echo '' | sed 's/^/foo/'

does work as expected because there is some input (one byte: a newline).

Whereas this command does not, since there is no input at all:

    printf '' | sed 's/^/foo/'

'grep' follows the same principle, and can be examined using:

    printf '' | strace -e read grep -q '.' && echo match || echo no-match
    strace -e read grep -q '.' </dev/null && echo match || echo no-match
    printf 'a' | strace -e read grep -q '.' && echo match || echo no-match

Others can perhaps elaborate regarding the POSIX standard.
From a cursory look, it seems the wording for 'grep' and 'sed' implies
that output is tied to having input, while 'wc' has mandatory default
output regardless of input
( http://pubs.opengroup.org/onlinepubs/9699919799/utilities/wc.html#tag_20_154_10 ).

Hope this helps,
 - assaf

P.S.
A minor nitpick: coreutils is a separate project from grep or sed.
grep questions should be sent to [email protected] , and
sed questions should be sent to [email protected] .
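
For readers without strace, the same comparison can be sketched in plain
shell by counting the bytes each producer writes and looking at what sed
prints for them. This is only an illustration, assuming a POSIX shell with
sed and wc available; 'show' is a hypothetical helper name, not part of
any tool.

```shell
# Sketch: compare how many bytes a producer writes with what sed emits.
# 'show' is a hypothetical helper; $1 is a shell command that produces input.
show() {
    # Number of bytes the producer writes (tr strips padding some wc add).
    bytes=$(eval "$1" | wc -c | tr -d ' ')
    # What sed prints for that input (trailing newlines are dropped by $(...)).
    out=$(eval "$1" | sed 's/^/foo/')
    printf '%s: bytes=%s sed=[%s]\n' "$1" "$bytes" "$out"
}

show "printf ''"    # printf '': bytes=0 sed=[]
show "printf 'a'"   # printf 'a': bytes=1 sed=[fooa]
show "echo ''"      # echo '': bytes=1 sed=[foo]
```

Only the zero-byte case yields no output at all: with no input lines, the
's/^/foo/' command is never given anything to run on.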
