I'm going to consider this _without_ looking at the ksh source, because
mortals will at most look at documentation (and because documentation
should be accurate enough that they shouldn't _have_ to look at source).

On a very cursory reading, the man page* is somewhat ambiguous about
whether that should work:

       A blank is a tab or a space.  An identifier is a sequence of letters,
       digits, or underscores starting with a letter or underscore.
       Identifiers are used as components of variable names.  A vname is a
       sequence of one or more identifiers separated by a . and optionally
       preceded by a ..  Vnames are used as function and variable names.
       A word is a sequence of characters from the character set defined by
       the current locale, excluding non-quoted metacharacters.

"A blank is a tab or a space" is more restrictive than "A word is a
sequence of characters from the character set defined by the current
locale, excluding non-quoted metacharacters".  And if I try a vertical
tab, formfeed, or carriage return (all plain ASCII characters classified as
white space by isspace(3)) before "done", I get the same error.  So it
looks like the more restrictive interpretation holds: only the tab and the
plain ASCII space are accepted as white space in the code.  Of course,
anything should be OK in a quoted string (except whatever closes the
quotes); or rather, anything except a null byte, which does NOT work**
(ksh isn't perl; the latter goes out of its way to tolerate just about
anything).
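
For anyone who wants to repeat the experiment, here is a rough sketch of
how such a test could be scripted.  It writes a small loop script with a
chosen byte sequence in front of "done" and reports whether the shell
parsed it.  The names SHELL_UNDER_TEST and try_prefix are made up for this
example, and the default of plain sh is an assumption; set
SHELL_UNDER_TEST=ksh to try ksh itself.

```shell
#!/bin/sh
# Sketch only: SHELL_UNDER_TEST and try_prefix are illustrative names.
SHELL_UNDER_TEST=${SHELL_UNDER_TEST:-sh}

try_prefix() {
    # $1 = label to print, $2 = octal escapes for the bytes placed before "done"
    tmp=$(mktemp) || return 1
    # Same loop as in the reported script; $i stays literal inside single quotes.
    printf 'for i in 1 2\ndo\necho $i\n' > "$tmp"
    # printf expands the octal escapes in $2 into raw bytes.
    printf "$2"'done\n' >> "$tmp"
    if "$SHELL_UNDER_TEST" "$tmp" >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: syntax error"
    fi
    rm -f "$tmp"
}

try_prefix tab '\011'
try_prefix space '\040'
try_prefix vertical-tab '\013'
try_prefix ideographic-space '\343\200\200'
```

The last call uses the same three bytes (e3 80 80) that appear in the od
dump of space.ksh quoted below.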

However, I wouldn't do it even if it should work, because the script would
then only work in an appropriate (UTF-8) locale; it would certainly be an
error in C locale regardless.  If it were me, I would confine anything that
isn't sensible in C locale to quoted string constants; one does NOT want
code that does nasty things depending on what locale is in use.
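
One way to catch stray non-ASCII bytes before they bite is to count them.
The following is only a sketch; count_non_ascii is a made-up helper name,
and it flags every byte outside tab, newline, and printable ASCII, so it
will also count bytes that sit legitimately inside quoted strings or
comments.

```shell
#!/bin/sh
# Sketch only: count_non_ascii is a hypothetical helper name.
count_non_ascii() {
    # In the C locale, delete tab (011), newline (012), and printable ASCII
    # (040-176), then count whatever bytes are left over.
    LC_ALL=C tr -d '\011\012\040-\176' < "$1" | wc -c | tr -d ' '
}

# Example call (file name taken from the report quoted below):
#   count_non_ascii space.ksh
```

A nonzero count is a hint to look at the file with od -c, as in the
report quoted below.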

* ${.sh.version} on my Mac is Version AJM 93u+ 2012-08-01, which I gather
is reasonably current. :-)

** The following script (shown as an od -c dump) produces an interesting error:

0000000    #   !       /   b   i   n   /   k   s   h  \n  \n   e   c   h
0000020    o       "  \0   t   e   s   t   i   n   g   "  \n
0000035
$ ./tryme.ksh
./tryme.ksh: syntax error at line 3: `zero byte' unexpected




On Tue, Apr 25, 2017 at 8:42 AM, lijo george <george.l...@gmail.com> wrote:

>
> Thanks for the suggestion Philippe.
> But I'm a bit confused though, Isn't "0xe3 0x80 0x80" the UTF-8
> representation of the space character.
>
>
> Thanks,
> Lijo
>
> On Tue, Apr 25, 2017 at 5:49 PM, Philippe Bergheaud <
> philippe.berghe...@fr.ibm.com> wrote:
>
>> > The attached testscript has a leading double byte space separator
>> > before the for loop closing "done" keyword. This fails with a syntax
>> > error while parsing.
>> >
>> > Is it a bug or is it expected behaviour?
>> >
>> > I've tried it with ksh93u+  and ksh93v- versions on a Solaris setup.
>> > bash and zsh also fails, hence I'm thinking it might not be a bug,
>> > but could someone please confirm this.
>> >
>> > Here's a sample output.
>> >
>> > root@S11_3_SRU:~# echo $LANG
>> > ja_JP.UTF-8
>> > root@S11_3_SRU:~# cat space.ksh
>> > #!/bin/ksh
>> > for i in 1 2
>> > do
>> > echo $i
>> > done   # leading  double byte space character
>> > root@S11_3_SRU:~# od -xc space.ksh
>> > 0000000    2321    2f62    696e    2f6b    7368    0a66    6f72    2069
>> >            #   !   /   b   i   n   /   k   s   h  \n   f   o   r       i
>> > 0000020    2069    6e20    3120    320a    646f    0a65    6368    6f20
>> >                i   n       1       2  \n   d   o  \n   e   c   h   o
>> > 0000040    2469    0ae3    8080    646f    6e65    0a00
>> >            $   i  \n 343 200 200   d   o   n   e  \n
>> You should remove the (invisible) character 0343 (0xe3), before the two
>> spaces.
>>
>> Philippe
>
>
>
> _______________________________________________
> ast-users mailing list
> ast-users@lists.research.att.com
> http://lists.research.att.com/mailman/listinfo/ast-users
>
>