Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On Wed, May 22, 2019 at 10:23:04PM +, Charles-Henri Gros wrote:
> But unfortunately, grep was just illustrative, I'm using another tool
> that takes a regex but has no "-F" option

1. The questioner's first description of the problem/question will be misleading.
9. All examples given by the questioner will be broken, misleading, wrong, and/or not representative of the actual question.
25. The newbie won't accept any answer that uses practical or standard tools.
26. The newbie will not TELL you about this restriction until you have wasted half an hour.
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Date: Wed, 22 May 2019 22:23:04 +
From: Charles-Henri Gros
Message-ID:

| But unfortunately, grep was just illustrative, I'm using another tool
| that takes a regex but has no "-F" option (though admittedly with some
| effort I could add one, I wrote the tool in question).

You can still do the sed to hide any $ in the command line the way you were doing. The important thing is to not expose the results to pathname expansion, and if you're going to use the shell to break apart the file names (field splitting) make sure IFS is set correctly - you might find IFS=$'\n' works better for your usage than the default (so filenames with spaces don't give problems).

You might also want to use Chet's suggestion, and disable pathname expansion with "set -f".

But this kind of thing is what happens when you don't provide all of the info about the problem you're having - people tend to provide answers to the problem you say that you have, rather than the actual issue. It is all good (and helpful) to find a simple test case for a problem you're seeing, and provide that as well - but always give the actual problem details. Here, without knowing what kind of input your "tool in question" takes, it is impossible for anyone to work out what a good solution would be.

| Yes I'm not expecting any special characters except "$".

It is best not to make too many assumptions - remember that even '.' is special in RE's and '.' is very common in filenames.

kre
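The combination described above (split on newlines only, with pathname expansion switched off) might look roughly like the following. This is only a sketch under stated assumptions: the temporary directory, the two file names, and the `collected` array are made up for illustration, and `sort` is added just to make the order predictable.

```shell
#!/usr/bin/env bash
# Sketch: word-split find output on newlines only, with globbing disabled.
# The directory and file names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a$.class' 'two words.txt'

collected=()
old_ifs=$IFS
IFS=$'\n'   # field splitting happens at newlines only, so spaces survive
set -f      # no pathname expansion on the results of the substitution
for name in $(find . -type f | sort | sed 's/\$/\\$/g'); do
    collected+=("$name")
done
set +f      # restore the defaults afterwards
IFS=$old_ifs

printf '%s\n' "${collected[@]}"
```

With pathname expansion disabled, the escaped names reach the loop body unchanged whether or not a matching file exists. Names containing newlines would still be split, as noted elsewhere in the thread.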
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 3:13 PM, Robert Elz wrote:
> Date: Wed, 22 May 2019 17:34:22 +
> From: Charles-Henri Gros
> Message-ID:
>
> | The problem I'm trying to solve is to iterate over regex-escaped file
> | names obtained from a "find" command. I don't know how to make this
> | work. It works with other versions of bash and with other shells.
>
> You were relying upon a common bug, which has been fixed in bash, but
> your technique is all wrong, you don't need any kind of loop at all, not
> a for loop, and not the while read loop that Greg suggested.
>
> find -print produces a list of names, one per line. Those are simple
> strings, which fgrep (or grep -F as Andreas suggested) can handle finding.
>
> What I'd do is
>
> fgrep "$(find -print)" wherever

Interesting, I didn't realize you could pass newline-separated patterns to "grep" on the command line. Good to know for the future.

But unfortunately, grep was just illustrative, I'm using another tool that takes a regex but has no "-F" option (though admittedly with some effort I could add one, I wrote the tool in question).

>
> (You can use grep -F if you have an aversion to using its traditional name,
> but fgrep was once a different program to grep / egrep).
>
> This version will have a problem with filenames with embedded newlines,
> but so did your original, so I am simply assuming that you have none of
> those (using any variant of grep to search for strings containing newlines
> tends to be "difficult" as grep is a line at a time tool).

Yes I'm not expecting any special characters except "$".

>
> If your version of grep cannot handle the pattern list not having a
> terminating \n (the $() removes it) then you can add it back
>
> fgrep "$(find ... -print)"$'\n' wherever.
>
> You're probably still going to need a | into sed inside the command
> substitution, as I doubt that you actually want to look for filenames
> in the format that find prints them (you have never shown your actual
> command) and I suspect that you want to delete the pathname component
> (a leading "./" or whatever) and it isn't clear what you want to
> happen with filenames in subdirectories. But none of those manipulations
> will affect anything.
>
> The other difference between this method and the one that you were
> using, is that this one will mix up the output for all of the different
> file names (it reads the target files just once, looking for all of the
> filenames simultaneously) whereas your original scheme looked for each
> file name in the target sequentially (re-reading the target file(s) over
> and over again for each new file name). That would group output lines
> for each file name together, whereas the technique above does not.

--
Charles-Henri Gros
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Date: Wed, 22 May 2019 17:34:22 +
From: Charles-Henri Gros
Message-ID:

| The problem I'm trying to solve is to iterate over regex-escaped file
| names obtained from a "find" command. I don't know how to make this
| work. It works with other versions of bash and with other shells.

You were relying upon a common bug, which has been fixed in bash, but your technique is all wrong, you don't need any kind of loop at all, not a for loop, and not the while read loop that Greg suggested.

find -print produces a list of names, one per line. Those are simple strings, which fgrep (or grep -F as Andreas suggested) can handle finding.

What I'd do is

  fgrep "$(find -print)" wherever

(You can use grep -F if you have an aversion to using its traditional name, but fgrep was once a different program to grep / egrep).

This version will have a problem with filenames with embedded newlines, but so did your original, so I am simply assuming that you have none of those (using any variant of grep to search for strings containing newlines tends to be "difficult" as grep is a line at a time tool).

If your version of grep cannot handle the pattern list not having a terminating \n (the $() removes it) then you can add it back

  fgrep "$(find ... -print)"$'\n' wherever.

You're probably still going to need a | into sed inside the command substitution, as I doubt that you actually want to look for filenames in the format that find prints them (you have never shown your actual command) and I suspect that you want to delete the pathname component (a leading "./" or whatever) and it isn't clear what you want to happen with filenames in subdirectories. But none of those manipulations will affect anything.

The other difference between this method and the one that you were using, is that this one will mix up the output for all of the different file names (it reads the target files just once, looking for all of the filenames simultaneously) whereas your original scheme looked for each file name in the target sequentially (re-reading the target file(s) over and over again for each new file name). That would group output lines for each file name together, whereas the technique above does not.

kre
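A self-contained sketch of the single-grep approach described above, assuming made-up file names (`src/a$.class`, `src/b.class`), a made-up `someinput` file, and a sed expression that strips the directory component; none of these come from the original poster's real command.

```shell
#!/usr/bin/env bash
# Sketch: search for all file names at once as fixed strings.
# All names below are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir src
touch 'src/a$.class' 'src/b.class'
printf '%s\n' 'uses a$.class here' 'nothing relevant' \
              'uses b.class too' 'bxclass is not a file' > someinput

# One name per line; sed removes everything up to the last slash.
patterns=$(find src -type f | sed 's|.*/||')

# The quoted substitution is a single argument holding a
# newline-separated pattern list; -F treats each line literally.
matches=$(grep -F -e "$patterns" someinput)
printf '%s\n' "$matches"
```

Because the patterns are fixed strings, neither `$` nor `.` needs escaping, and `bxclass` is not reported even though it would match `b.class` as a regular expression. The matching lines come out in the order they appear in `someinput`, not grouped by file name.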
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On Wed, May 22, 2019 at 05:34:22PM +, Charles-Henri Gros wrote:
[cut]
> The problem I'm trying to solve is to iterate over regex-escaped file
> names obtained from a "find" command. I don't know how to make this
> work. It works with other versions of bash and with other shells.
>
> The original is closer to something like this:
>
> for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file"
> someinput; done

You may want to use "grep -F" to match fixed strings (not regular expressions):

  find ... -exec grep -F -e {} someinput \;

Add -x to grep if you want full line matches only.

This is assuming you'd want to look for the found pathnames in "someinput".

>
> It used to work. Now it doesn't. I do not know how to make it work again.
>
>
> --
> Charles-Henri Gros
>

--
Kusalananda
Sweden
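For illustration, the find -exec form above could be tried out like this. It is a sketch with invented names (`src/a$.class`, `src/unused.txt`, `someinput`); the full pathname that find prints is the fixed string being searched for.

```shell
#!/usr/bin/env bash
# Sketch: one grep -F per found pathname, no shell word splitting involved.
# File names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir src
touch 'src/a$.class' 'src/unused.txt'
printf '%s\n' 'ref: src/a$.class' 'ref: src/other' > someinput

# find substitutes each pathname for {} and hands it to grep directly,
# so the shell never expands or splits the name.
found=$(find src -type f -exec grep -F -e {} someinput \;)
printf '%s\n' "$found"
```

Since the name never passes through the shell's expansions, the backslash question from this thread does not arise at all. The cost is one grep process per file.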
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On May 22 2019, Charles-Henri Gros wrote:

> The file name is the regex (argument to "-e"), not the file "grep"
> reads. I want to check that some text file contains a reference to a file.
>
> But it looks like this would work:
>
> for file in $(find ...); do grep -e "$(echo -n "$file" | sed 's/\$/\\$/g')"
> someinput; done

Use grep -F instead.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 3:14 PM, Charles-Henri Gros wrote:

> That's what I find a bit surprising (but shells are complicated, so
> maybe this is right. All I know is that the code used to work). I didn't
> think glob expansions applied to command expansions.

Command substitution is one of the word expansions, as is parameter (variable) expansion. Pathname expansion (globbing) is applied to the results of the other expansions and word splitting. The order is detailed here:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06

>
> All I want here is word split (which is why I can't use quotes)

You could always try turning off pathname expansion temporarily with `set -f'.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
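The ordering described above can be observed with an ordinary glob character. This is a small sketch; the temporary directory, the two `.txt` files, and the variable names are assumptions made for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: pathname expansion is applied to the result of an unquoted
# command substitution. File names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch one.txt two.txt

# Unquoted: the substitution yields *.txt, which is then globbed.
unquoted=$(echo $(echo '*.txt'))

# Quoted: no word splitting and no pathname expansion.
quoted=$(echo "$(echo '*.txt')")

# Unquoted with set -f: still word split, but not globbed.
set -f
noglob=$(echo $(echo '*.txt'))
set +f

printf '%s\n' "unquoted: $unquoted" "quoted: $quoted" "noglob: $noglob"
```

The third case is the one that matches "all I want here is word split": splitting still happens, only the globbing step is skipped.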
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 10:47 AM, Greg Wooledge wrote:
> On Wed, May 22, 2019 at 05:34:22PM +, Charles-Henri Gros wrote:
>> On 5/22/19 5:43 AM, Greg Wooledge wrote:
>>> Standard disclaimers apply. Stop using unquoted variables and these
>>> bugs will stop affecting you. Nevertheless, Chet may want to take a
>>> peek.
>> What unquoted variables? Are you talking about the "$()" expansion?
> Yes. I used a variable instead of a command substitution to make it
> easier to reproduce the problem. Both have the same behavior in this
> case.

That's what I find a bit surprising (but shells are complicated, so maybe this is right. All I know is that the code used to work). I didn't think glob expansions applied to command expansions.

All I want here is word split (which is why I can't use quotes)

>
>> The problem I'm trying to solve is to iterate over regex-escaped file
>> names obtained from a "find" command. I don't know how to make this
>> work. It works with other versions of bash and with other shells.
> First step: do not "regex-escape" them, whatever that means. Just use
> the actual filenames as printed by find -print0.
>
>> The original is closer to something like this:
>>
>> for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file"
>> someinput; done
> Yeah, that's just the wrong approach. It's also the first thing on
> the BashPitfalls page[1] (for a good reason).
>
> You have two choices here:
>
> 1) Use find -exec.
>
>    find ... -exec grep -e someinput /dev/null {} +
>
> 2) Use find -print0 and a bash while read loop. (NOT a for loop.)
>
>    find ... -print0 |
>    while IFS= read -rd '' file; do
>      something "$file"
>    done
>
>    (A variant of this uses < <() instead of a pipeline, so that the while
>    loop runs in the main shell and variable assignments can persist.)
>
> Since you only show a simple grep as your action, find -exec is a better
> choice for this problem. (Assuming you didn't fatally misrepresent the
> problem.) Calling grep once for every file would be inefficient.

I don't think I fatally misrepresented the problem, however I do think that you fatally misunderstood it (FWIW I know about -print0 and xargs -0)

The file name is the regex (argument to "-e"), not the file "grep" reads. I want to check that some text file contains a reference to a file.

But it looks like this would work:

for file in $(find ...); do grep -e "$(echo -n "$file" | sed 's/\$/\\$/g')" someinput; done

--
Charles-Henri Gros
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On Wed, May 22, 2019 at 07:14:44PM +, Charles-Henri Gros wrote:
> The file name is the regex (argument to "-e"), not the file "grep"
> reads. I want to check that some text file contains a reference to a file.
>
> But it looks like this would work:
>
> for file in $(find ...); do grep -e "$(echo -n "$file" | sed 's/\$/\\$/g')"
> someinput; done

That still has the same problems. I've already given the BashPitfalls link so I won't repeat that whole speech.

Since it seems you want to repeat your search for every file name individually, the while read loop becomes a viable choice.

  find ... -print0 |
  while IFS= read -rd '' file; do
    grep -F -e "$file" /some/textfile
  done

This is still going to be inefficient compared to running *one* grep with all of the input filenames as matchable patterns, but if you're set on doing it the slow way, so be it.

The faster way can be done safely as long as there aren't *too* many filenames to pass:

  args=()
  while IFS= read -rd '' file; do
    args+=(-e "$file")
  done < <(find ... -print0)
  grep -F "${args[@]}" /some/textfile

That will fail if there are too many arguments. A less-safe but still fast way would involve generating a newline-delimited list of the matchable filename-patterns, which means it'll fail if any filenames have newlines in them. Ignoring that for now, we get:

  grep -F -f <(find ... -print) /some/textfile

You can add something like ! -name $'*\n*' to the find arguments to prevent such filenames from being handled. (Actually, the whole thing fails pretty catastrophically if any of your filename-patterns have newlines in them, since grep can't handle patterns with newlines... so you'd want to filter those out even in the array-based alternative.)

In any case, for f in $(find...) is always wrong. Sorry, but it's true. There's no salvaging it.
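Putting the array-based alternative together with the newline filter mentioned in the parenthetical above might look like this. It is a sketch only: the directory layout, the file names (including the deliberately awkward one containing a newline), and the `someinput` file are invented, and the empty-array guard is an addition not discussed in the thread.

```shell
#!/usr/bin/env bash
# Sketch: build one grep -F command from NUL-delimited find output,
# skipping names that contain a newline. All names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir src
touch 'src/a$.class' $'src/bad\nname'
printf '%s\n' 'src/a$.class is referenced' 'name' > someinput

args=()
while IFS= read -r -d '' file; do
    args+=(-e "$file")
done < <(find src -type f ! -name $'*\n*' -print0)

matches=
if (( ${#args[@]} > 0 )); then      # grep needs at least one pattern
    matches=$(grep -F "${args[@]}" someinput)
fi
printf '%s\n' "$matches"
```

Without the `! -name` filter, the name containing a newline would be treated by grep as two separate patterns, and the unrelated line `name` would be reported as a match.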
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On Wed, May 22, 2019 at 05:34:22PM +, Charles-Henri Gros wrote:
> On 5/22/19 5:43 AM, Greg Wooledge wrote:
> > Standard disclaimers apply. Stop using unquoted variables and these
> > bugs will stop affecting you. Nevertheless, Chet may want to take a
> > peek.
>
> What unquoted variables? Are you talking about the "$()" expansion?

Yes. I used a variable instead of a command substitution to make it easier to reproduce the problem. Both have the same behavior in this case.

> The problem I'm trying to solve is to iterate over regex-escaped file
> names obtained from a "find" command. I don't know how to make this
> work. It works with other versions of bash and with other shells.

First step: do not "regex-escape" them, whatever that means. Just use the actual filenames as printed by find -print0.

> The original is closer to something like this:
>
> for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file"
> someinput; done

Yeah, that's just the wrong approach. It's also the first thing on the BashPitfalls page[1] (for a good reason).

You have two choices here:

1) Use find -exec.

   find ... -exec grep -e someinput /dev/null {} +

2) Use find -print0 and a bash while read loop. (NOT a for loop.)

   find ... -print0 |
   while IFS= read -rd '' file; do
     something "$file"
   done

   (A variant of this uses < <() instead of a pipeline, so that the while loop runs in the main shell and variable assignments can persist.)

Since you only show a simple grep as your action, find -exec is a better choice for this problem. (Assuming you didn't fatally misrepresent the problem.) Calling grep once for every file would be inefficient.

[1] https://mywiki.wooledge.org/BashPitfalls#pf1
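The difference between the pipeline form and the < <() variant can be seen by counting files in each. A sketch, assuming bash with its default settings (in particular, the lastpipe option left off); the file names and counter variables are made up.

```shell
#!/usr/bin/env bash
# Sketch: variable assignments in a piped while loop are lost in bash,
# but persist when the loop reads from a process substitution.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a$.class' 'b c.txt'

count_pipe=0
find . -type f -print0 |
while IFS= read -r -d '' file; do
    count_pipe=$((count_pipe + 1))    # runs in a subshell
done

count_procsub=0
while IFS= read -r -d '' file; do
    count_procsub=$((count_procsub + 1))   # runs in the main shell
done < <(find . -type f -print0)

echo "pipeline: $count_pipe, process substitution: $count_procsub"
```

With default bash settings the pipeline counter stays at its initial value after the loop, which is why the process substitution form is preferred when results need to outlive the loop.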
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 5:43 AM, Greg Wooledge wrote:
> On Wed, May 22, 2019 at 05:25:43PM +0700, Robert Elz wrote:
>> Date: Tue, 21 May 2019 22:11:20 +
>> From: Charles-Henri Gros
>> Message-ID:
>>
>>
>> | The existence or not of the file should not have any effect.
>>
>> But it does, and is intended to. If the pattern matches a file
>> (when pathname expanded as a result of the unquoted command substitution)
>> you get the file name produced. If it does not match a file,
>> the pattern is left untouched. That is the way that things are
>> supposed to work.
> With glob metacharacters, sure. But none of the characters in his
> variable are glob metacharacters.
>
> There is definitely something weird happening here.
>
> wooledg:/tmp/x$ echo "$BASH_VERSION"
> 5.0.3(1)-release
> wooledg:/tmp/x$ touch 'a$.class'
> wooledg:/tmp/x$ i='a\$.class'; echo {$i} "{$i}"
> {a\$.class} {a\$.class}
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a$.class {a\$.class}
>
> Other versions of bash, plus ksh and dash, don't behave this way.
>
> wooledg:/tmp/x$ bash-2.05b
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ bash-4.4
> wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ ksh
> $ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> wooledg:/tmp/x$ dash
> $ i='a\$.class'; echo $i "{$i}"
> a\$.class {a\$.class}
>
> It seems to be unique to bash 5. If it's a bug fix, then I'm not
> understanding the rationale. Backslashes shouldn't be consumed during
> glob expansion.
>
> This is also not limited to $ alone. It happens with letters too.
>
> wooledg:/tmp/x$ touch i
> wooledg:/tmp/x$ i='\i' j='\j'
> wooledg:/tmp/x$ echo $i $j
> i \j
>
> Standard disclaimers apply. Stop using unquoted variables and these
> bugs will stop affecting you. Nevertheless, Chet may want to take a
> peek.

What unquoted variables? Are you talking about the "$()" expansion?

The problem I'm trying to solve is to iterate over regex-escaped file names obtained from a "find" command. I don't know how to make this work. It works with other versions of bash and with other shells.

The original is closer to something like this:

for file in $(find ... | sed 's/\$/\\$/g'); do grep -e "$file" someinput; done

It used to work. Now it doesn't. I do not know how to make it work again.

--
Charles-Henri Gros
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On 5/22/19 9:33 AM, Robert Elz wrote:
> Date: Wed, 22 May 2019 08:43:00 -0400
> From: Greg Wooledge
> Message-ID: <20190522124300.gz1...@eeg.ccf.org>
>
> | It seems to be unique to bash 5. If it's a bug fix, then I'm not
> | understanding the rationale. Backslashes shouldn't be consumed during
> | glob expansion.
>
> They should - when a pattern comes from an expansion (be that a
> variable expansion, or as here, a command substitution) there needs
> to be a way to indicate whether the potential magic chars are in
> fact intended as magic chars, or as literals. \ is used for that.

There is more discussion in

http://lists.gnu.org/archive/html/bug-bash/2019-02/msg00151.html

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Date: Wed, 22 May 2019 08:43:00 -0400
From: Greg Wooledge
Message-ID: <20190522124300.gz1...@eeg.ccf.org>

| It seems to be unique to bash 5. If it's a bug fix, then I'm not
| understanding the rationale. Backslashes shouldn't be consumed during
| glob expansion.

They should - when a pattern comes from an expansion (be that a variable expansion, or as here, a command substitution) there needs to be a way to indicate whether the potential magic chars are in fact intended as magic chars, or as literals. \ is used for that.

If quoted, everything is literal, and there's no issue, but when unquoted there needs to be this mechanism.

So, I think it was a bug fix (I recently made very similar fixes to the NetBSD shell). Uses of this kind of thing are obscure, but they exist.

Here, the $ isn't magic to pathname expansion (glob is not a RE) so the \ doesn't do anything useful, but consider

  ls $( printf %s '\**.c' )

What that should do is list all files that end in .c and start with an asterisk (star). There the first '*' is to be treated literally, and the 2nd is the "match anything" meta char. Only the presence of the \ can distinguish those two cases. (Well, here one could make the pattern be [*]*.c but that isn't always easy, or even possible).

kre
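The `\**.c` case above can be tried in a scratch directory. A sketch with invented file names; `printf '%s\n'` stands in for `ls` so the result can be captured in a variable.

```shell
#!/usr/bin/env bash
# Sketch: a backslash inside a pattern that comes from an expansion
# makes the following character literal. File names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch '*star.c' 'plain.c' 'notes.txt'

# Pattern from a command substitution: literal '*', then "anything", then .c
escaped=$(printf '%s\n' $(printf %s '\**.c'))

# The bracket form mentioned above, written directly as a glob.
bracket=$(printf '%s\n' [*]*.c)

# For comparison, the plain glob matches both .c files.
all_c_count=$(printf '%s\n' *.c | wc -l)

printf '%s\n' "escaped: $escaped" "bracket: $bracket" "all .c files: $all_c_count"
```

Here the pattern does contain a real glob character, so pathname expansion is performed and the backslash is consumed as part of matching. That is the role of `\` that the message describes.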
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
On Wed, May 22, 2019 at 05:25:43PM +0700, Robert Elz wrote:
> Date: Tue, 21 May 2019 22:11:20 +
> From: Charles-Henri Gros
> Message-ID:
>
>
> | The existence or not of the file should not have any effect.
>
> But it does, and is intended to. If the pattern matches a file
> (when pathname expanded as a result of the unquoted command substitution)
> you get the file name produced. If it does not match a file,
> the pattern is left untouched. That is the way that things are
> supposed to work.

With glob metacharacters, sure. But none of the characters in his variable are glob metacharacters.

There is definitely something weird happening here.

wooledg:/tmp/x$ echo "$BASH_VERSION"
5.0.3(1)-release
wooledg:/tmp/x$ touch 'a$.class'
wooledg:/tmp/x$ i='a\$.class'; echo {$i} "{$i}"
{a\$.class} {a\$.class}
wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
a$.class {a\$.class}

Other versions of bash, plus ksh and dash, don't behave this way.

wooledg:/tmp/x$ bash-2.05b
wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
a\$.class {a\$.class}

wooledg:/tmp/x$ bash-4.4
wooledg:/tmp/x$ i='a\$.class'; echo $i "{$i}"
a\$.class {a\$.class}

wooledg:/tmp/x$ ksh
$ i='a\$.class'; echo $i "{$i}"
a\$.class {a\$.class}

wooledg:/tmp/x$ dash
$ i='a\$.class'; echo $i "{$i}"
a\$.class {a\$.class}

It seems to be unique to bash 5. If it's a bug fix, then I'm not understanding the rationale. Backslashes shouldn't be consumed during glob expansion.

This is also not limited to $ alone. It happens with letters too.

wooledg:/tmp/x$ touch i
wooledg:/tmp/x$ i='\i' j='\j'
wooledg:/tmp/x$ echo $i $j
i \j

Standard disclaimers apply. Stop using unquoted variables and these bugs will stop affecting you. Nevertheless, Chet may want to take a peek.
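The transcript above can be turned into a small probe that reports how whichever shell runs it behaves. This is a sketch: the scratch directory and the `verdict` variable are made up, and no particular outcome is claimed for the unquoted case, since the thread shows it differs between shells and versions.

```shell
#!/usr/bin/env bash
# Sketch: check whether this shell drops the backslash from an unquoted
# expansion when the unescaped name exists as a file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a$.class'

i='a\$.class'
unquoted=$(printf '%s\n' $i)     # subject to pathname expansion
quoted=$(printf '%s\n' "$i")     # never subject to pathname expansion

if [ "$unquoted" = "$quoted" ]; then
    verdict='backslash kept'
else
    verdict='backslash removed'
fi
printf '%s\n' "unquoted: $unquoted" "quoted: $quoted" "verdict: $verdict"
```

Only the quoted result is the same everywhere. The unquoted result is the part that depends on the shell, which is the practical argument for quoting.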
Re: Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Date: Tue, 21 May 2019 22:11:20 +
From: Charles-Henri Gros
Message-ID:

| The existence or not of the file should not have any effect.

But it does, and is intended to. If the pattern matches a file (when pathname expanded as a result of the unquoted command substitution) you get the file name produced. If it does not match a file, the pattern is left untouched. That is the way that things are supposed to work.

I suspect that you meant to say

  for i in "$(echo "a\\\$.class")"; do echo "$i"; done

then there would be no pathname expansion happening (more correctly, there still is, but the pathname to be expanded contains no magic chars, only chars that match literally, so you either get the file with the exact same name, or the pattern untouched, which is the same thing - shells generally optimise away the attempt to match in that case, as the result is always known in advance).

kre
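A sketch of the quoted form suggested above, together with the trade-off that comes with it. The scratch directory, the `result` variable, and the two-line `printf` are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Sketch: a quoted command substitution is never globbed, but it is
# also never split, so the loop body runs exactly once.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a$.class'     # the file exists, yet the backslash survives

for i in "$(echo "a\\\$.class")"; do
    result=$i
done

# Trade-off: several lines of output still arrive as a single word.
iterations=0
for word in "$(printf '%s\n' first second)"; do
    iterations=$((iterations + 1))
done

printf '%s\n' "result: $result" "iterations: $iterations"
```

Quoting is therefore enough for the one-name test case from the report, but not for iterating over a list of names, where a read loop or set -f is needed instead.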
Backslash mysteriously disappears in command expansion when unescaping would reference an existing file
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fdebug-prefix-map=/build/bash-Dl674z/bash-5.0=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-security
uname output: Linux d-us6a-ubuntu-03 5.0.0-13-generic #14-Ubuntu SMP Mon Apr 15 14:59:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 5.0
Patch Level: 3
Release Status: release

Description:
Backslash mysteriously disappears in command expansion when unescaping would reference an existing file

Repeat-By:
> touch a\$.class
> for i in $(echo "a\\\$.class"); do echo "$i"; done
a$.class
> rm a\$.class
> for i in $(echo "a\\\$.class"); do echo "$i"; done
a\$.class

The existence or not of the file should not have any effect.