Re: Interpretation of escapes in expansions in pattern matching contexts

2013-09-17 Thread Dan Douglas
On Saturday, April 06, 2013 03:48:55 AM Dan Douglas wrote:
> Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> think this test should say no:
>
> x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi

Here's more data: the permutations of escaped and quoted, literal and expanded
escapes. ksh has changed from matching dash's results to behaving like
everything else, so now only bash and dash will match a literal escape adjacent
to some escaped or quoted literal or non-literal escape. Each result is the exit
status of the case command, so 0 means the pattern matched.

#!/usr/bin/env python3

import subprocess, itertools

class Shell(list):
    """A list of (command, process) pairs, all started in one shell."""

    def __init__(self, shell, cmds):
        self.shell = shell
        super().__init__([(x, self.__run(x)) for x in cmds])

    def __iter__(self):
        # Yield (command, exit status, shell name), consuming the list.
        while True:
            try:
                self[0][1].communicate()
                yield (lambda x: (x[0], x[1].returncode,
                                  self.shell))(self.pop(0))
            except IndexError:
                # A generator must return, not raise StopIteration (PEP 479).
                return

    def __run(self, cmd):
        return subprocess.Popen([self.shell, "-c", cmd])

def main():
    template = r'x=\\; case \\x in {0}{1}x) :;; *) false; esac'
    # Each operand appears unquoted and double-quoted (4 x 4 = 16 tests).
    tests = [template.format(*x) for x in itertools.product(
        ['${x}', '"${x}"', r'\\', r'"\\"'], repeat=2)]
    shells = [Shell(x, tests) for x in ["bash", "dash", "ksh", "mksh", "zsh",
                                        "bb", "posh", "jsh"]]

    print(" " * 54, " ".join(x.shell for x in shells))
    for row in zip(*shells):
        print("{0:55}{1}".format(row[0][0], "".join(
            str(test) + (" " * len(shell)) for x, test, shell in row)))

if __name__ == "__main__":
    main()

#                                                        bash dash ksh mksh zsh bb posh jsh (heirloom)
# x=\\; case \\x in ${x}${x}x) :;; *) false; esac        1    1    1   1    1   1  1    1
# x=\\; case \\x in ${x}"${x}"x) :;; *) false; esac      0    0    1   1    1   1  1    1
# x=\\; case \\x in ${x}\\x) :;; *) false; esac          1    1    1   1    1   1  1    1
# x=\\; case \\x in ${x}"\\"x) :;; *) false; esac        1    1    1   1    1   1  1    1
# x=\\; case \\x in "${x}"${x}x) :;; *) false; esac      1    0    1   1    1   1  1    1
# x=\\; case \\x in "${x}""${x}"x) :;; *) false; esac    0    0    1   1    1   1  1    1
# x=\\; case \\x in "${x}"\\x) :;; *) false; esac        1    0    1   1    1   1  1    1
# x=\\; case \\x in "${x}""\\"x) :;; *) false; esac      1    0    1   1    1   1  1    1
# x=\\; case \\x in \\${x}x) :;; *) false; esac          1    1    1   1    1   1  1    1
# x=\\; case \\x in \\"${x}"x) :;; *) false; esac        0    0    1   1    1   1  1    1
# x=\\; case \\x in \\\\x) :;; *) false; esac            1    1    1   1    1   1  1    1
# x=\\; case \\x in \\"\\"x) :;; *) false; esac          1    1    1   1    1   1  1    1
# x=\\; case \\x in "\\"${x}x) :;; *) false; esac        1    1    1   1    1   1  1    1
# x=\\; case \\x in "\\""${x}"x) :;; *) false; esac      0    0    1   1    1   1  1    1
# x=\\; case \\x in "\\"\\x) :;; *) false; esac          1    1    1   1    1   1  1    1
# x=\\; case \\x in "\\""\\"x) :;; *) false; esac        1    1    1   1    1   1  1    1

-- 
Dan Douglas



Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
I couldn't find anything obvious in POSIX that implies which interpretation is
correct. Assuming it's unspecified.

Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me 
think this test should say no:

x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi

bash: yes
ksh:  no
mksh: no
zsh:  no

However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
fairly random spread:

x=\\x; case x in $x) echo yes;; *) echo no; esac

bash: yes
ksh:  yes
mksh: no
posh: no
zsh:  no
dash: yes
bb:   no
jsh:  no

18:42:44 jilles: ormaaj, I'm not sure if that's actually a bug
18:43:15 ormaaj: dunno. Bash seems unique in that respect
18:43:23 jilles: you're asking the shell to check if the string x matches the 
pattern stored in the variable x
19:32:51 jilles: freebsd sh and kmk_ash say no, dash says yes
19:33:40 jilles: Bourne shell says no

-- 
Dan Douglas



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Eric Blake
On 04/06/2013 02:48 AM, Dan Douglas wrote:
> I couldn't find anything obvious in POSIX that implies which interpretation is
> correct. Assuming it's unspecified.

Correct - POSIX does not specify [[ at all, so any behavior inside [[ is
unspecified.

 
> However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> fairly random spread:
>
> x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> bash: yes

The behavior inside case is specified by POSIX, and bash is correct in
returning 'yes'.  POSIX requires that each case pattern undergoes
parameter expansion, and then the result of that expansion ('\x') is
compared against the expansion of word ('x') according to pattern
matching rules:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13
Those rules state that any character in the pattern that is quoted
(which \x is) matches itself, and 'x' matches 'x'.
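
As an illustration only (this is not code from any of the shells discussed), the rule can be sketched in Python by translating a small subset of shell patterns into a regular expression. The names `shell_pattern_to_regex` and `matches` are hypothetical, bracket expressions are left out, and treating a trailing backslash as a literal is an assumption made for the sketch.

```python
import re

def shell_pattern_to_regex(pattern: str) -> str:
    """Translate a tiny subset of shell patterns (\\c, *, ?) to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            # A backslash quotes the next character, which then matches itself.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if c == "*":
            out.append(".*")
        elif c == "?":
            out.append(".")
        else:
            # Ordinary character (assumption: a trailing backslash lands here).
            out.append(re.escape(c))
        i += 1
    return "".join(out)

def matches(pattern: str, string: str) -> bool:
    regex = shell_pattern_to_regex(pattern)
    return re.fullmatch(regex, string, re.DOTALL) is not None

print(matches(r"\x", "x"))     # True: the quoted x matches itself
print(matches(r"\x", r"\x"))   # False: the backslash is not part of the match
print(matches(r"\*", "*"))     # True: a quoted * is literal
print(matches(r"\*", "abc"))   # False
```

Under this reading, the pattern `\x` and the string `x` match, which is the 'yes' answer.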

> ksh:  yes

correct

> mksh: no

bug

> posh: no

bug

> zsh:  no

bug

> dash: yes
> bb:   no
> jsh:  no

I haven't heard of these two, but they are also bugs.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Chris Down
On 2013-04-06 07:01, Eric Blake wrote:
> > bb:   no
> > jsh:  no
>
> I haven't heard of these two, but they are also bugs.

I assume bb is busybox ash.

Chris




Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
On Saturday, April 06, 2013 09:24:52 PM Chris Down wrote:
> On 2013-04-06 07:01, Eric Blake wrote:
> > > bb:   no
> > > jsh:  no
> >
> > I haven't heard of these two, but they are also bugs.
>
> I assume bb is busybox ash.
>
> Chris

Yes, it's typically a symlink to busybox, which calls the shell. jsh is the
default binary name produced by the heirloom build, though I've seen other
names used.
-- 
Dan Douglas



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Chet Ramey
On 4/6/13 4:48 AM, Dan Douglas wrote:
> I couldn't find anything obvious in POSIX that implies which interpretation is
> correct. Assuming it's unspecified.
>
> Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> think this test should say no:
>
> x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
>
> bash: yes
> ksh:  no
> mksh: no
> zsh:  no
>
> However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> fairly random spread:
>
> x=\\x; case x in $x) echo yes;; *) echo no; esac

These two cases should not be different.  They undergo the same expansions,
except that the conditional command adds quote removal, which doesn't
matter in this case.  In both cases, you ask the pattern matching code
whether or not the string `x' matches the pattern `\x'.

You invoke the same pattern matching code on the same patterns, why would
you not get the same answer?

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Chris F.A. Johnson

On Sat, 6 Apr 2013, Chet Ramey wrote:


> On 4/6/13 4:48 AM, Dan Douglas wrote:
> > I couldn't find anything obvious in POSIX that implies which interpretation is
> > correct. Assuming it's unspecified.
> >
> > Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> > think this test should say no:
> >
> > x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> >
> > bash: yes
> > ksh:  no
> > mksh: no
> > zsh:  no
> >
> > However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> > depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> > fairly random spread:
> >
> > x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> These two cases should not be different.  They undergo the same expansions,
> except that the conditional command adds quote removal, which doesn't
> matter in this case.  In both cases, you ask the pattern matching code
> whether or not the string `x' matches the pattern `\x'.
>
> You invoke the same pattern matching code on the same patterns, why would
> you not get the same answer?


  In bash, the expansion differs when in  [[ ... ]]:

$ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
yes
$ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
no

  But not in ksh93:

$ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
no
$ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
no



--
   Chris F.A. Johnson, http://cfajohnson.com/
   Author:
   Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)
   Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Chet Ramey
On 4/6/13 9:59 PM, Chris F.A. Johnson wrote:

>   In bash, the expansion differs when in  [[ ... ]]:
>
> $ x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> yes
> $ x=\\x; if [ x == $x ]; then echo yes; else echo no; fi
> no

OK.  The [[ conditional command does pattern matching.  The [ (test)
command does string comparison.  $x expands to `\x' in both cases.

 
>   But not in ksh93:

I'm going to assume ksh93 dequotes the variable.  I don't know why.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/



Re: Interpretation of escapes in expansions in pattern matching contexts

2013-04-06 Thread Dan Douglas
On Saturday, April 06, 2013 09:37:44 PM Chet Ramey wrote:
> On 4/6/13 4:48 AM, Dan Douglas wrote:
> > I couldn't find anything obvious in POSIX that implies which interpretation is
> > correct. Assuming it's unspecified.
> >
> > Bash (4.2.45) uniquely does interpret such escapes for [[, which makes me
> > think this test should say no:
> >
> > x=\\x; if [[ x == $x ]]; then echo yes; else echo no; fi
> >
> > bash: yes
> > ksh:  no
> > mksh: no
> > zsh:  no
> >
> > However, ksh93 (AJM 93v- 2013-03-17) is unique in that it flips the result
> > depending on [[ ]] or case..esac (bug?), but otherwise it looks like a
> > fairly random spread:
> >
> > x=\\x; case x in $x) echo yes;; *) echo no; esac
>
> These two cases should not be different.  They undergo the same expansions,
> except that the conditional command adds quote removal, which doesn't
> matter in this case.  In both cases, you ask the pattern matching code
> whether or not the string `x' matches the pattern `\x'.
>
> You invoke the same pattern matching code on the same patterns, why would
> you not get the same answer?

I expect they should be the same. I just noticed the discrepancy with ksh93 
and wondered what gives.

The original question I had in mind is: Is the quoting state of any part of a 
pattern determined lexically prior to expansions, or are any quotes/escapes 
within parts of pattern words that were generated by unquoted expansions re-
interpreted as quotes by the pattern matcher? I had always thought the former, 
but now it looks to me like all these shells are saying no because they 
interpret the expanded words for quoting to determine which parts of the 
pattern should be literal. This appears to even apply to pathname expansion.

 $ touch '\foo'
 $ ksh -c 'x=\\f* IFS=; printf %s\\n $x'
 \foo
 $ bash -c 'x=\\f* IFS=; printf %s\\n $x'
 \f*

I'm surprised different implementations are all across the board on this.
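
The two readings can be modelled side by side. The following Python sketch is only an illustration under simplifying assumptions (just `*` and `?` are special, no bracket expressions), and the names `tokens_from_expansion`, `to_regex` and `matches` are made up for it. With `reinterpret=False` a backslash produced by an unquoted expansion is an ordinary character; with `reinterpret=True` it quotes the character that follows.

```python
import re

def tokens_from_expansion(value: str, reinterpret: bool):
    """Turn the result of an unquoted expansion into (char, is_literal) pairs."""
    toks, i = [], 0
    while i < len(value):
        c = value[i]
        if reinterpret and c == "\\" and i + 1 < len(value):
            toks.append((value[i + 1], True))  # backslash quotes the next char
            i += 2
            continue
        # Otherwise only * and ? stay special; a backslash is an ordinary char.
        toks.append((c, c not in "*?"))
        i += 1
    return toks

def to_regex(toks) -> str:
    special = {"*": ".*", "?": "."}
    return "".join(re.escape(c) if literal else special[c]
                   for c, literal in toks)

def matches(value: str, string: str, reinterpret: bool) -> bool:
    regex = to_regex(tokens_from_expansion(value, reinterpret))
    return re.fullmatch(regex, string, re.DOTALL) is not None

# x=\\x; case x in $x) ...
print(matches(r"\x", "x", reinterpret=True))       # True  ("yes")
print(matches(r"\x", "x", reinterpret=False))      # False ("no")

# x=\\f*; does the expanded pattern match the file name \foo?
print(matches(r"\f*", r"\foo", reinterpret=False))  # True
print(matches(r"\f*", r"\foo", reinterpret=True))   # False
```

In this model the `reinterpret=True` results line up with the bash output shown above and the `reinterpret=False` pathname result lines up with the ksh output, but that is a description of the observed answers, not a claim about how either shell is implemented.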
--
Dan Douglas