Re: mapfile doesn't accept input from a pipe
It's a fair point but I think there may be a reasonable middle-ground, in which common pitfalls are briefly addressed in TFM, but the manual doesn't become bogged down with exhaustive detail of every possible pitfall. After all, information overload would just become another thing preventing readers from absorbing the information.

- Original Message -
From: "Greg Wooledge"
Sent: Thu, 29 Jun 2017 15:39:20 -0400
Subject: Re: mapfile doesn't accept input from a pipe

On Thu, Jun 29, 2017 at 03:22:24PM -0400, tetsu...@scope-eye.net wrote:
> So I look at this not just as a RTFM issue, it's a pitfall built-in to
> the design of the language, and programmers need to understand a bit
> about the implementation of the language to understand what's going
> on. As such I think it may be worth spelling it out a bit more
> directly in terms of the implications here. For instance, stick it in
> the help for 'read' and 'mapfile':

If you include helpful text about every pitfall in every builtin, the bash documentation will become three times its current size.
Re: mapfile doesn't accept input from a pipe
On Thu, Jun 29, 2017 at 03:22:24PM -0400, tetsu...@scope-eye.net wrote:
> So I look at this not just as a RTFM issue, it's a pitfall built-in to
> the design of the language, and programmers need to understand a bit
> about the implementation of the language to understand what's going
> on. As such I think it may be worth spelling it out a bit more
> directly in terms of the implications here. For instance, stick it in
> the help for 'read' and 'mapfile':

If you include helpful text about every pitfall in every builtin, the bash documentation will become three times its current size. Hell, it's probably less work to write out what things you *can* safely do, rather than what you can't. There aren't very many!

But I don't think either of those things belongs in the reference manual. You can't learn proper shell programming from a reference. It's just too big, too convoluted, too full of historical constructs. You need a more focused document.
Re: mapfile doesn't accept input from a pipe
I think that when programmers first learn shell programming, this is a hard piece of information to effectively convey. The Bash documentation provides the important facts:

- Subshells are quietly and automatically constructed for a variety of shell programming constructs, including pipelines
- Code run in a subshell can't affect the parent shell's execution environment

But what's important for a new shell programmer, and so easy to miss, is the implication that comes from these facts: piping a command's output into "read" seems like a perfectly reasonable thing to do if you haven't wrapped your head around those two facts and their implications. And if you do try "cmd | read x", it's not considered to be any kind of error or anything, the side-effects of "read" are just carried out and then quietly discarded.

So I look at this not just as a RTFM issue, it's a pitfall built-in to the design of the language, and programmers need to understand a bit about the implementation of the language to understand what's going on. As such I think it may be worth spelling it out a bit more directly in terms of the implications here. For instance, stick it in the help for 'read' and 'mapfile':

"Care must be taken when using 'read' in a pipeline or another form of subshell environment, as this may cause the data that's read to be lost. See BASH SCRIPTING BASICS (section whatever) for more information."

It's the same basic information, but it's better targeted: It lives in the help for the command that's apparently "failing", it says your data may be "lost", and it points at a relevant section of a newbie FAQ that explains why.

I'd also wonder if it's maybe worth trying to detect cases of this kind of thing and flagging them as errors... Like if "read" or "mapfile" wind up as the only command in a subshell, the command's side-effects are going to be lost, so it's an error.
Of course that does nothing for the wide variety of other cases impacted by this issue, but maybe it's still worth it... If someone does "cmd | read x", and "lastpipe" isn't on AND in effect, it's almost certainly a mistake... (Of course, there's ALWAYS the possibility that someone relies on the current behavior - for instance to see if "read" would succeed, or trigger a SIGPIPE in "cmd" after a certain amount of data is read...)

The whole issue of sub-shells is kind of a mess IMO - there's all these cases where one gets created automatically, and the (cmd) syntax which exists specifically to run a command in a subshell, looks to the uninitiated like a simple command-grouping syntax, because that's how parentheses work in C and many other languages... And if something winds up in a subshell that shouldn't be, its side-effects on the shell environment are simply lost without warning.

Ideally, forking the shell shouldn't be so baked-in to the language that people wind up tripping over it. Instead these cases that require parallelism should use threading and synchronized access to a shared environment. (So a pipeline could contain *multiple* built-ins or shell functions with side-effects for the shell's environment) But I think that's a difficult direction to pursue, unfortunately, and I'm guessing it's not one that will happen in Bash... (On the bright side it sounds as though POSIX allows it...)

- Original Message -
From: chet.ra...@case.edu
To: "Keith Thompson", "Eduardo_A._Bustamante_López"
Sent: Thu, 29 Jun 2017 14:07:32 -0400
Subject: Re: mapfile doesn't accept input from a pipe

On 6/29/17 12:38 PM, Keith Thompson wrote:
> I suggest that it would be worthwhile to mention this issue in the
> documentation.

"Each command in a pipeline is executed as a separate process (i.e., in a subshell). See COMMAND EXECUTION ENVIRONMENT for a description of a subshell environment. If the lastpipe option is enabled using the shopt builtin (see the description of shopt below), the last element of a pipeline may be run by the shell process."

or maybe

"Builtin commands that are invoked as part of a pipeline are also executed in a subshell environment."

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
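
The pitfall discussed in the message above ("cmd | read x" silently losing its result, and (cmd) looking like plain grouping) can be sketched roughly as follows. This is only an illustration: the variable names x, y and z are made up for the example, and it assumes a script run by bash with "lastpipe" off, which the sketch switches off explicitly.

```shell
#!/usr/bin/env bash
# Illustrative sketch; x, y and z are hypothetical names, not from the thread.
shopt -u lastpipe                 # make sure the default behaviour is in effect

x=before
echo "new value" | read -r x      # read runs in a subshell; its assignment is discarded
echo "after pipeline: x=$x"

y=before
( y=changed )                     # parentheses run the command in a subshell
echo "after ( ): y=$y"

z=before
{ z=changed; }                    # braces group commands in the current shell
echo "after { }: z=$z"
```

Only the brace group changes a variable that is still visible afterwards. The other two assignments happen in a child process and vanish with it, without any error being reported.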
Re: mapfile doesn't accept input from a pipe
On 6/29/17 12:38 PM, Keith Thompson wrote:
> I suggest that it would be worthwhile to mention this issue in the
> documentation.

"Each command in a pipeline is executed as a separate process (i.e., in a subshell). See COMMAND EXECUTION ENVIRONMENT for a description of a subshell environment. If the lastpipe option is enabled using the shopt builtin (see the description of shopt below), the last element of a pipeline may be run by the shell process."

or maybe

"Builtin commands that are invoked as part of a pipeline are also executed in a subshell environment."

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
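
As a rough illustration of the lastpipe behaviour quoted above, the sketch below enables the option and then pipes into mapfile. It assumes a non-interactive script, since lastpipe only takes effect while job control is inactive; the "set +m" line just makes that assumption explicit. The array name "lines" is invented for the example.

```shell
#!/usr/bin/env bash
# Sketch under the assumption that job control is inactive (the default in scripts).
set +m                 # job control off, so lastpipe can take effect
shopt -s lastpipe      # allow the last pipeline element to run in the current shell

printf 'one\ntwo\nthree\n' | mapfile -t lines   # "lines" is a hypothetical array name
echo "lines has ${#lines[@]} elements"
```

In an interactive shell with job control on, the same pipeline would most likely still leave the array empty, which is the "on AND in effect" caveat raised elsewhere in the thread.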
Re: mapfile doesn't accept input from a pipe
On Thu, Jun 29, 2017 at 6:56 AM, Eduardo A. Bustamante López wrote:
> On Wed, Jun 28, 2017 at 07:08:27PM -0700, Keith Thompson wrote:
> [...]
>> mapfile REDIRECT < /tmp/input.txt
>> cat /tmp/input.txt | mapfile PIPE
>
> The `mapfile PIPE' is a piece of a pipeline, and as such, it runs in a
> subshell (different process).
>
> See: http://mywiki.wooledge.org/BashFAQ/024

OK, that makes sense, and of course the same thing applies to "read".

I suggest that it would be worthwhile to mention this issue in the documentation. The fact that it's a FAQ suggests that people are likely to run into it.
Re: mapfile doesn't accept input from a pipe
On Wed, Jun 28, 2017 at 07:08:27PM -0700, Keith Thompson wrote:
> Description:
> The "mapfile" command works correctly if stdin is redirected
> from a file, but not if it's from a pipe.

This is because each command in a pipeline is executed in its own subshell. Not a bug.

If you need to read input from a command with mapfile (or read), use a process substitution.

> cat /tmp/input.txt | mapfile PIPE

mapfile PIPE < /tmp/input.txt
mapfile PIPE < <(a real program, not cat)
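
A minimal runnable sketch of the process-substitution form suggested above. Here printf merely stands in for "a real program"; that substitution and the -t flag (which strips the trailing newlines) are choices made for this example, not part of the original reply.

```shell
#!/usr/bin/env bash
# Sketch: printf is a stand-in for whatever command actually produces the input.
# mapfile runs in the current shell because it is not part of a pipeline;
# only the producer runs in a separate process.
mapfile -t PIPE < <(printf 'one\ntwo\nthree\n')

echo "\$PIPE has ${#PIPE[@]} elements"
```

Because the array is filled by the shell that goes on to use it, the elements are still there on the following line.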
Re: mapfile doesn't accept input from a pipe
On Wed, Jun 28, 2017 at 07:08:27PM -0700, Keith Thompson wrote:
[...]
> mapfile REDIRECT < /tmp/input.txt
> cat /tmp/input.txt | mapfile PIPE

The `mapfile PIPE' is a piece of a pipeline, and as such, it runs in a subshell (different process).

See: http://mywiki.wooledge.org/BashFAQ/024

--
Eduardo Bustamante
https://dualbus.me/
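
One way to see the "different process" point for yourself is to compare $BASHPID inside and outside a pipeline. The following is only a sketch: the temporary file and the variable names are assumptions made for this example, and lastpipe is switched off so the default behaviour applies.

```shell
#!/usr/bin/env bash
# Sketch: compare the PID of the main shell with the PID seen by a pipeline element.
shopt -u lastpipe                   # default behaviour: every pipeline element is forked

tmp=$(mktemp)                       # hypothetical scratch file to carry the PID back
parent_pid=$BASHPID
true | echo "$BASHPID" > "$tmp"     # $BASHPID is expanded inside the forked child
read -r pipeline_pid < "$tmp"       # plain redirection, so read runs in the main shell
rm -f "$tmp"

echo "parent shell PID:     $parent_pid"
echo "pipeline element PID: $pipeline_pid"
```

The two numbers differ, which is why an array assigned by "mapfile" at the end of a pipeline never reaches the shell that started it.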
Typo in bash manual: "QUANTUMP"
In the documentation for the "mapfile" builtin command:

    '-C'
        Evaluate CALLBACK each time QUANTUMP lines are read. The '-c' option specifies QUANTUM.

"QUANTUMP" should be "QUANTUM".

In the latest sources cloned from git://git.savannah.gnu.org/bash.git, this occurs in:

    bash/doc/bash.info
    bash/doc/bashref.info
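
For readers unfamiliar with the two options named in that passage, here is a small sketch of how '-C' and '-c' work together. The callback name record_progress and the arrays are invented for the example; the callback is handed the index of the next array element to be assigned as its first argument.

```shell
#!/usr/bin/env bash
# Sketch: record_progress, calls and lines are hypothetical names for illustration.
calls=()
record_progress() {
    calls+=("$1")      # $1 is the index of the array element about to be assigned
}

# -C names the callback; -c sets the quantum (here: evaluate it every 2 lines).
mapfile -t -C record_progress -c 2 lines < <(printf 'a\nb\nc\nd\ne\n')

echo "read ${#lines[@]} lines, callback ran ${#calls[@]} times"
```

The input comes from a process substitution rather than a pipe, so both the array and the callback's side-effects stay in the current shell.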
mapfile doesn't accept input from a pipe
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -Wno-parentheses -Wno-format-security
uname output: Linux bomb20 4.8.0-46-generic #49-Ubuntu SMP Fri Mar 31 13:57:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.4
Patch Level: 12
Release Status: maint

Description:
        The "mapfile" command works correctly if stdin is redirected
        from a file, but not if it's from a pipe.

        Demonstrated with several versions including the latest
        bash-snap-20170626

Repeat-By:
        This script demonstrates the problem:

__CUT_HERE__
#!/bin/bash

printf 'one\ntwo\nthree\n' > /tmp/input.txt

mapfile REDIRECT < /tmp/input.txt
cat /tmp/input.txt | mapfile PIPE

echo "\$REDIRECT has ${#REDIRECT[@]} elements"
echo "\$PIPE has ${#PIPE[@]} elements"

if [ ${#REDIRECT[@]} -eq 3 ] && [ ${#PIPE[@]} -eq 3 ] ; then
    echo PASS
    exit 0
else
    echo FAIL
    exit 1
fi
__AND_HERE__