So I guess the observed behaviour is intended rather than a bug. It's interesting that this used to work in the old ksh88 version, possibly because of its less complicated parsing mechanism.
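
For anyone who hits the same error, here is a rough sketch of how the stray bytes could be located and replaced. It assumes the script matches the od dump quoted below (the bytes E3 80 80, i.e. U+3000 IDEOGRAPHIC SPACE, right before "done"). The variable name wide and the output file space_fixed.ksh are made up for illustration, and the tools run under LC_ALL=C so that they match raw bytes rather than locale characters.

```shell
#!/bin/sh
# U+3000 IDEOGRAPHIC SPACE as raw UTF-8 bytes (E3 80 80, octal 343 200 200).
# "wide" is a hypothetical variable name used only in this sketch.
wide=$(printf '\343\200\200')

# Recreate the failing script byte-for-byte from the od dump in the thread.
printf '#!/bin/ksh\nfor i in 1 2\ndo\necho $i\n%sdone\n' "$wide" > space.ksh

# Report every line that contains the byte sequence.
LC_ALL=C awk -v w="$wide" 'index($0, w) { print FILENAME ": line " NR }' space.ksh

# Write a copy with each occurrence replaced by a plain ASCII space.
# space_fixed.ksh is a hypothetical output name.
LC_ALL=C sed "s/$wide/ /g" space.ksh > space_fixed.ksh
```

The copy in space_fixed.ksh should then parse, since only a tab or a plain space counts as a blank.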

Thanks,
Lijo

On Wed, Apr 26, 2017 at 12:57 AM, Richard Hamilton <rlham...@gmail.com> wrote:

> I'm going to consider this _without_ looking at the ksh source, because
> mortals will at most look at documentation (and because documentation
> should be accurate enough that they shouldn't _have_ to look at source).
>
> My very cursory reading of the man page* is a bit ambiguous whether that
> should work:
>
>     A blank is a tab or a space.  An identifier is a sequence of letters,
>     digits, or underscores starting with a letter or underscore.
>     Identifiers are used as components of variable names.  A vname is a
>     sequence of one or more identifiers separated by a . and optionally
>     preceded by a ..  Vnames are used as function and variable names.
>     A word is a sequence of characters from the character set defined by
>     the current locale, excluding non-quoted metacharacters.
>
> "A blank is a tab or a space" is more restrictive than "A word is a
> sequence of characters from the character set defined by the current
> locale, excluding non-quoted meta characters".  And if I try a vertical
> tab, formfeed, or carriage return (all plain ASCII characters classified as
> white space by isspace(3)) before "done", I get the same error.  So it
> looks like the more restrictive interpretation holds: only tabs and the
> basic space character are acceptable in the code as white space.  Of
> course, anything should be ok in a quoted string (except whatever closes
> the quotes); or rather, anything except a null byte, which does NOT work**
> (ksh isn't perl - the latter goes out of its way to tolerate just about
> anything).
>
> However, I wouldn't do it, even if it should work, because that makes it
> only work in an appropriate (UTF-8) locale; it would certainly be an error
> regardless in C locale.  If it were me, I would only use anything not
> sensible in C locale, within a quoted string constant; one does NOT want
> code that does nasty things depending on what locale is in use.
>
> * ${.sh.version} on my Mac is Version AJM 93u+ 2012-08-01, which I gather
> is reasonably current. :-)
>
> ** the following produces an interesting error:
>
> 0000000 # ! / b i n / k s h \n \n e c h
> 0000020 o " \0 t e s t i n g " \n
> 0000035
> $ ./tryme.ksh
> ./tryme.ksh: syntax error at line 3: `zero byte' unexpected
>
>
> On Tue, Apr 25, 2017 at 8:42 AM, lijo george <george.l...@gmail.com>
> wrote:
>
>> Thanks for the suggestion Philippe.
>> But I'm a bit confused though, Isn't "0xe3 0x80 0x80" the UTF-8
>> representation of the space character.
>>
>> Thanks,
>> Lijo
>>
>> On Tue, Apr 25, 2017 at 5:49 PM, Philippe Bergheaud <
>> philippe.berghe...@fr.ibm.com> wrote:
>>
>>> > The attached testscript has a leading double byte space separator
>>> > before the for loop closing "done" keyword. This fails with a syntax
>>> > error while parsing.
>>> >
>>> > Is it a bug or is it expected behaviour?
>>> >
>>> > I've tried it with ksh93u+ and ksh93v- versions on a Solaris setup.
>>> > bash and zsh also fails, hence I'm thinking it might not be a bug,
>>> > but could someone please confirm this.
>>> >
>>> > Here's a sample output.
>>> >
>>> > root@S11_3_SRU:~# echo $LANG
>>> > ja_JP.UTF-8
>>> > root@S11_3_SRU:~# cat space.ksh
>>> > #!/bin/ksh
>>> > for i in 1 2
>>> > do
>>> > echo $i
>>> > done # leading double byte space character
>>> > root@S11_3_SRU:~# od -xc space.ksh
>>> > 0000000 2321 2f62 696e 2f6b 7368 0a66 6f72 2069
>>> >            #   !   /   b   i   n   /   k   s   h  \n   f   o   r       i
>>> > 0000020 2069 6e20 3120 320a 646f 0a65 6368 6f20
>>> >                i   n       1       2  \n   d   o  \n   e   c   h   o
>>> > 0000040 2469 0ae3 8080 646f 6e65 0a00
>>> >            $   i  \n 343 200 200   d   o   n   e  \n
>>>
>>> You should remove the (invisible) character 0343 (0xe3), before the two
>>> spaces.
>>>
>>> Philippe
>>
>> _______________________________________________
>> ast-users mailing list
>> ast-users@lists.research.att.com
>> http://lists.research.att.com/mailman/listinfo/ast-users