Re: Interpretation of escapes in expansions in pattern matching contexts
On Saturday, April 06, 2013 03:48:55 AM Dan Douglas wrote:
> Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> think this test should say no:
>
>     x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi

Here's more data: some permutations of escaped and quoted literal and
expanded quotes and escapes. ksh changed from matching dash's results to
being like everything else, so now only bash and dash will match a literal
escape adjacent to some escaped or quoted literal or non-literal escape.

    #!/usr/bin/env python3

    import subprocess, itertools

    class Shell(list):
        def __init__(self, shell, cmds):
            self.shell = shell
            # Start every test command right away; results are collected in __iter__.
            super().__init__([(x, self.__run(x)) for x in cmds])

        def __iter__(self):
            while True:
                try:
                    self[0][1].communicate()
                    yield (lambda x: (x[0], x[1].returncode, self.shell))(self.pop(0))
                except IndexError:
                    # A plain return ends the generator; raising StopIteration
                    # here is an error on Python 3.7 and later.
                    return

        def __run(self, cmd):
            return subprocess.Popen([self.shell, "-c", cmd])

    def main():
        template = r'x=\\; case \\x in {0}{1}x) :;; *) false; esac'
        tests = [template.format(*x) for x in itertools.product(['"${x}"', '${x}', r'"\\"', r'\\'], repeat=2)]
        shells = [Shell(x, tests) for x in ["bash", "dash", "ksh", "mksh", "zsh", "bb", "posh", "jsh"]]

        print(" " * 54, " ".join(x.shell for x in shells))
        for row in zip(*shells):
            print("{0:55}{1}".format(row[0][0], " ".join(str(test) + (" " * len(shell)) for x, test, shell in row)))

    if __name__ == "__main__":
        main()

    # Exit status per shell (0 = pattern matched, 1 = no match):
    #
    #                                                      bash dash ksh mksh zsh bb posh jsh (heirloom)
    # x=\\; case \\x in "${x}""${x}"x) :;; *) false; esac  1    1    1   1    1   1  1    1
    # x=\\; case \\x in "${x}"${x}x) :;; *) false; esac    0    0    1   1    1   1  1    1
    # x=\\; case \\x in "${x}""\\"x) :;; *) false; esac    1    1    1   1    1   1  1    1
    # x=\\; case \\x in "${x}"\\x) :;; *) false; esac      1    1    1   1    1   1  1    1
    # x=\\; case \\x in ${x}"${x}"x) :;; *) false; esac    1    0    1   1    1   1  1    1
    # x=\\; case \\x in ${x}${x}x) :;; *) false; esac      0    0    1   1    1   1  1    1
    # x=\\; case \\x in ${x}"\\"x) :;; *) false; esac      1    0    1   1    1   1  1    1
    # x=\\; case \\x in ${x}\\x) :;; *) false; esac        1    0    1   1    1   1  1    1
    # x=\\; case \\x in "\\""${x}"x) :;; *) false; esac    1    1    1   1    1   1  1    1
    # x=\\; case \\x in "\\"${x}x) :;; *) false; esac      0    0    1   1    1   1  1    1
    # x=\\; case \\x in "\\""\\"x) :;; *) false; esac      1    1    1   1    1   1  1    1
    # x=\\; case \\x in "\\"\\x) :;; *) false; esac        1    1    1   1    1   1  1    1
    # x=\\; case \\x in \\"${x}"x) :;; *) false; esac      1    1    1   1    1   1  1    1
    # x=\\; case \\x in \\${x}x) :;; *) false; esac        0    0    1   1    1   1  1    1
    # x=\\; case \\x in \\"\\"x) :;; *) false; esac        1    1    1   1    1   1  1    1
    # x=\\; case \\x in \\\\x) :;; *) false; esac          1    1    1   1    1   1  1    1

--
Dan Douglas
Interpretation of escapes in expansions in pattern matching contexts
I couldn't find anything obvious in POSIX that implies which interpretation
is correct. Assuming it's unspecified.

Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
think this test should say no:

    x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi

    bash: yes
    ksh: no
    mksh: no
    zsh: no

However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
depending on [[ ]] or case..esac (bug?), but otherwise it looks like a fairly
random spread:

    x=\\x; case x in $x) echo yes;; *) echo no; esac

    bash: yes
    ksh: yes
    mksh: no
    posh: no
    zsh: no
    dash: yes
    bb: no
    jsh: no

    18:42:44 jilles: ormaaj, I'm not sure if that's actually a bug
    18:43:15 ormaaj: dunno. Bash seems unique in that respect
    18:43:23 jilles: you're asking the shell to check if the string x matches the pattern stored in the variable x
    19:32:51 jilles: freebsd sh and kmk_ash say no, dash says yes
    19:33:40 jilles: Bourne shell says no

--
Dan Douglas
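To rerun the case test above against whatever shells happen to be installed, something like the following sketch may help. The function names ask_shell and survey are made up for this illustration, and the shell list is only a guess at common binary names. Results will likely vary with shell versions.

```shell
# Hypothetical helper: run the case test from the post in the shell named by $1.
# Prints "yes" if the string x matches the pattern held in $x, otherwise "no".
ask_shell() {
    "$1" -c 'x=\\x; case x in $x) echo yes;; *) echo no;; esac'
}

# Try each candidate shell that is actually installed and skip the rest.
survey() {
    for sh in bash dash ksh mksh zsh; do
        if command -v "$sh" >/dev/null 2>&1; then
            printf '%s: %s\n' "$sh" "$(ask_shell "$sh")"
        fi
    done
}

survey
```

The output is one "name: yes" or "name: no" line per shell found, in the same layout as the lists above.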
Re: Interpretation of escapes in expansions in pattern matching contexts
On 04/06/2013 02:48 AM, Dan Douglas wrote:
> I couldn't find anything obvious in POSIX that implies which interpretation
> is correct. Assuming it's unspecified.

Correct - POSIX does not specify [[ at all, so any behavior inside [[ is
unspecified.

> However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> fairly random spread:
>
>     x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> bash: yes

The behavior inside case is specified by POSIX, and bash is correct in
returning 'yes'. POSIX requires that each case pattern undergoes parameter
expansion, and then the result of that expansion ('\x') is compared against
the expansion of word ('x') according to pattern matching rules:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13

Those rules state that any character in the pattern that is quoted (which \x
is) matches itself, and 'x' matches 'x'.

> ksh: yes

correct

> mksh: no

bug

> posh: no

bug

> zsh: no

bug

> dash: yes
> bb: no
> jsh: no

I haven't heard of these two, but they are also bugs.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
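A small sketch of the reading given above, with bash called explicitly because other shells answer differently. The function names are hypothetical. It contrasts a pattern taken from an unquoted expansion, where the backslash still quotes the x, with the same expansion in double quotes, where every character is literal.

```shell
# Hypothetical helper names; bash is invoked explicitly since results vary by shell.

# Pattern from an unquoted expansion: \x is "quoted x", which matches x.
unquoted_pattern() {
    bash -c 'x=\\x; case x in $x) echo yes;; *) echo no;; esac'
}

# Pattern from a quoted expansion: backslash and x are both literal characters.
quoted_pattern() {
    bash -c 'x=\\x; case x in "$x") echo yes;; *) echo no;; esac'
}

# Same quoted pattern, but the word is the literal two-character string \x.
quoted_pattern_literal_word() {
    bash -c 'x=\\x; case \\x in "$x") echo yes;; *) echo no;; esac'
}

unquoted_pattern
quoted_pattern
quoted_pattern_literal_word
```

With bash this should print yes, no, yes. Quoting the expansion is the portable way to ask for a literal comparison regardless of how a given shell treats backslashes that come out of an expansion.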
Re: Interpretation of escapes in expansions in pattern matching contexts
On 2013-04-06 07:01, Eric Blake wrote:
> > bb: no
> > jsh: no
>
> I haven't heard of these two, but they are also bugs.

I assume bb is busybox ash.

Chris
Re: Interpretation of escapes in expansions in pattern matching contexts
On Saturday, April 06, 2013 09:24:52 PM Chris Down wrote:
> On 2013-04-06 07:01, Eric Blake wrote:
> > > bb: no
> > > jsh: no
> >
> > I haven't heard of these two, but they are also bugs.
>
> I assume bb is busybox ash.
>
> Chris

It's typically a symlink to busybox, yes, which calls the shell. jsh is the
default binary name produced by the heirloom build, though I've seen other
names used.

--
Dan Douglas
Re: Interpretation of escapes in expansions in pattern matching contexts
On 4/6/13 4:48 AM, Dan Douglas wrote:
> I couldn't find anything obvious in POSIX that implies which interpretation
> is correct. Assuming it's unspecified.
>
> Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> think this test should say no:
>
>     x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
>
> bash: yes
> ksh: no
> mksh: no
> zsh: no
>
> However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> fairly random spread:
>
>     x=\\x; case x in $x) echo yes;; *) echo no; esac

These two cases should not be different. They undergo the same expansions,
except that the conditional command adds quote removal, which doesn't matter
in this case.

In both cases, you ask the pattern matching code whether or not the string
`x' matches the pattern `\x'. You invoke the same pattern matching code on
the same patterns, why would you not get the same answer?

Chet

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
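The point that both constructs hand the same string and the same pattern to the matcher can be checked with a short sketch like this one. The function names are invented for the example, and bash is named explicitly because the thread reports that ksh93 gives different answers for the two forms.

```shell
# Hypothetical helper names. Both functions give bash the string x and the pattern \x.

match_with_conditional() {
    bash -c 'x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi'
}

match_with_case() {
    bash -c 'x=\\x; case x in $x) echo yes;; *) echo no;; esac'
}

printf '[[ ]]: %s\n' "$(match_with_conditional)"
printf 'case:  %s\n' "$(match_with_case)"
```

In bash both lines should read yes, matching the results listed in the quoted message.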
Re: Interpretation of escapes in expansions in pattern matching contexts
On Sat, 6 Apr 2013, Chet Ramey wrote:
> On 4/6/13 4:48 AM, Dan Douglas wrote:
> > I couldn't find anything obvious in POSIX that implies which
> > interpretation is correct. Assuming it's unspecified.
> >
> > Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> > think this test should say no:
> >
> >     x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> >
> > bash: yes
> > ksh: no
> > mksh: no
> > zsh: no
> >
> > However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the
> > result depending on [[ ]] or case..esac (bug?), but otherwise it looks
> > like a fairly random spread:
> >
> >     x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> These two cases should not be different. They undergo the same expansions,
> except that the conditional command adds quote removal, which doesn't
> matter in this case.
>
> In both cases, you ask the pattern matching code whether or not the string
> `x' matches the pattern `\x'. You invoke the same pattern matching code on
> the same patterns, why would you not get the same answer?

In bash, the expansion differs when in [[ ... ]]:

    $ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
    yes
    $ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
    no

But not in ksh93:

    $ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
    no
    $ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
    no

--
Chris F.A. Johnson, http://cfajohnson.com/
Author:
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
Re: Interpretation of escapes in expansions in pattern matching contexts
On 4/6/13 9:59 PM, Chris F.A. Johnson wrote:
> In bash, the expansion differs when in [[ ... ]]:
>
>     $ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
>     yes
>     $ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
>     no

OK. The [[ conditional command does pattern matching. The [ (test) command
does string comparison. $x expands to `\x' in both cases.

> But not in ksh93:

I'm going to assume ksh93 dequotes the variable. I don't know why.

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
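The distinction drawn above between pattern matching in [[ and string comparison in [ can be seen side by side with a sketch along these lines. bash_says is a hypothetical helper, not anything from bash itself. The expansion is double-quoted in the [ examples only to keep word splitting and pathname expansion out of the picture.

```shell
# Hypothetical helper: evaluate one bash test expression with x set to \x
# and report the outcome as yes or no.
bash_says() {
    bash -c 'x=\\x; if '"$1"'; then echo yes; else echo no; fi'
}

bash_says '[[ x == $x ]]'      # pattern match: the backslash quotes the x
bash_says '[[ x == "$x" ]]'    # quoted right-hand side is matched as a literal string
bash_says '[ x = "$x" ]'       # test: string comparison of x against \x
bash_says '[ "\x" = "$x" ]'    # test: the two-character string \x against the expansion
```

With bash this should print yes, no, no, yes. The last line shows that the expansion itself is the two characters \x in both commands, so only the kind of comparison differs.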
Re: Interpretation of escapes in expansions in pattern matching contexts
On Saturday, April 06, 2013 09:37:44 PM Chet Ramey wrote:
> On 4/6/13 4:48 AM, Dan Douglas wrote:
> > I couldn't find anything obvious in POSIX that implies which
> > interpretation is correct. Assuming it's unspecified.
> >
> > Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> > think this test should say no:
> >
> >     x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> >
> > bash: yes
> > ksh: no
> > mksh: no
> > zsh: no
> >
> > However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the
> > result depending on [[ ]] or case..esac (bug?), but otherwise it looks
> > like a fairly random spread:
> >
> >     x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> These two cases should not be different. They undergo the same expansions,
> except that the conditional command adds quote removal, which doesn't
> matter in this case.
>
> In both cases, you ask the pattern matching code whether or not the string
> `x' matches the pattern `\x'. You invoke the same pattern matching code on
> the same patterns, why would you not get the same answer?

I expect they should be the same. I just noticed the discrepancy with ksh93
and wondered what gives.

The original question I had in mind is: Is the quoting state of any part of a
pattern determined lexically prior to expansions, or are any quotes/escapes
within parts of pattern words that were generated by unquoted expansions
reinterpreted as quotes by the pattern matcher?

I had always thought the former, but now it looks to me like all these shells
are saying no because they interpret the expanded words for quoting to
determine which parts of the pattern should be literal. This appears to even
apply to pathname expansion.

    $ touch '\foo'
    $ ksh -c 'x=\\f* IFS=; printf %s\\n $x'
    \foo
    $ bash -c 'x=\\f* IFS=; printf %s\\n $x'
    \f*

I'm surprised different implementations are all across the board on this.

--
Dan Douglas
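The pathname expansion test above depends on what else is in the current directory, so a scratch directory makes it repeatable. The sketch below is one way to do that. glob_in is a made-up name, mktemp -d is assumed to be available, and the ksh run is optional because ksh may not be installed.

```shell
# Hypothetical helper: in a scratch directory that contains only a file named
# \foo, let the shell named by $1 expand the unquoted value \f* and print it.
glob_in() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch '\foo'
    "$1" -c 'x=\\f* IFS=; printf "%s\n" $x'
    cd / && rm -rf "$dir"
)

glob_in bash

# Only attempt ksh when it is actually present.
if command -v ksh >/dev/null 2>&1; then
    glob_in ksh
fi
```

bash should leave the word as \f*, since it reads the backslash as quoting the f and finds no name beginning with f. The thread reports \foo from ksh93, which fits a matcher that treats the backslash as a literal character.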