Glenn, I think Roland was thinking about the getconf shell builtin,
because ksh93 does not have a locale(1) shell builtin, which could be
used to reflect such data.
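
For reference, a minimal sketch of the probing approach Roland describes
below, assuming plain ISO C wctype()/wctrans() and nothing from libast.
The function names (class_available, trans_available, print_available)
and the short candidate tables are hypothetical and only for
illustration; they are not part of getconf or ksh93. Nothing is cached,
so each call reflects whatever LC_CTYPE is in effect at that moment.

```c
#include <assert.h>
#include <stdio.h>
#include <wctype.h>

/* Illustrative candidates only: the POSIX names plus a few taken from
 * the lists quoted in this thread. A real probe table would be longer. */
static const char *const class_names[] = {
    "alnum", "alpha", "blank", "cntrl", "digit", "graph", "lower",
    "print", "punct", "space", "upper", "xdigit",
    "ideogram", "jhira", "jkanji", "jkata", "phonogram", "rune"
};

static const char *const trans_names[] = {
    "tolower", "toupper", "totitle", "tojhira", "tojkata"
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Return 1 if the current LC_CTYPE locale knows the class `name`,
 * either as given or with the "is" prefix seen on some older systems. */
int class_available(const char *name)
{
    char prefixed[64];
    int n;

    assert(name != NULL);
    if (wctype(name) != (wctype_t)0)
        return 1;
    n = snprintf(prefixed, sizeof prefixed, "is%s", name);
    if (n < 0 || (size_t)n >= sizeof prefixed)
        return 0;               /* name too long to prefix; treat as absent */
    return wctype(prefixed) != (wctype_t)0;
}

/* Return 1 if the current LC_CTYPE locale knows the mapping `name`. */
int trans_available(const char *name)
{
    assert(name != NULL);
    return wctrans(name) != (wctrans_t)0;
}

/* Probe every candidate and print the available ones, one per line.
 * Returns the number of names printed. */
int print_available(FILE *out)
{
    size_t i;
    int found = 0;

    for (i = 0; i < COUNT(class_names); i++) {
        if (class_available(class_names[i])) {
            fprintf(out, "class %s\n", class_names[i]);
            found++;
        }
    }
    for (i = 0; i < COUNT(trans_names); i++) {
        if (trans_available(trans_names[i])) {
            fprintf(out, "trans %s\n", trans_names[i]);
            found++;
        }
    }
    return found;
}
```

A caller would normally run setlocale(LC_ALL, "") first and then call
print_available(stdout); without that, the probe reports what the "C"
locale provides.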
Olga
On Mon, Oct 22, 2012 at 6:14 AM, Glenn Fowler wrote:
>
> ah but you may have been thinking getconf function and not getconf command
> in that case doing it with the getconf function is probably the way to go
>
> On Mon, 22 Oct 2012 00:10:28 -0400 Glenn Fowler wrote:
>> locale(1) would be my first choice
>> but getconf(1) would be ok too
>
>> On Mon, 22 Oct 2012 01:34:38 +0200 Roland Mainz wrote:
>> > On Fri, Oct 19, 2012 at 3:38 PM, Cedric Blancher
>> > wrote:
>> > > Request for enhancement: .sh.regex.available_character_class
>> > >
>> > > What do you think about adding a .sh.regex.available_character_class
>> > > array variable which contains the list of available wctype character
>> > > classes for the current locale? I know there is no API to get a list
>> > > from the OS but libast could probe well-known names and put only those
>> > > in the array for which wctype() returned a non-zero value.
>
>> > IMO it's better to let "getconf" handle that job because these are
>> > locale properties which are not limited to the shell.
>> > AFAIK we need two different "getconf" properties - one for regex
>> > character classes and one for |wctrans()| transformations.
>> > I did some digging... and it seems Solaris 11 supports the following
>> > transformations (beyond POSIX; these are locale-dependent):
>> > -- snip --
>> > tojhira
>> > tojisx0201
>> > tojisx0208
>> > tojkata
>> > tolower
>> > toupper
>> > -- snip --
>> > ... Linux adds "totitle".
>
>> > Character classes (beyond POSIX; these are locale-dependent)
>> > supported by Solaris 11 are:
>> > -- snip --
>> > english
>> > gb
>> > ideogram
>> > jalpha
>> > jdigit
>> > jgen
>> > jgreek
>> > jhankana
>> > jhira
>> > jisx0201r
>> > jisx0208
>> > jisx0212
>> > jkanji
>> > jkata
>> > jparen
>> > jpunct
>> > jrussian
>> > jsci
>> > jspecial
>> > junit
>> > line
>> > number
>> > phonogram
>> > special
>> > wchar10
>> > wchar11
>> > wchar12
>> > wchar13
>> > wchar14
>> > wchar15
>> > wchar16
>> > wchar17
>> > wchar18
>> > wchar19
>> > wchar20
>> > wchar21
>> > wchar22
>> > wchar23
>> > wchar24
>> > wchar6
>> > wchar9
>> > -- snip --
>> > (note that some of these are erroneously prefixed with "is" in some
>> > older Solaris versions). FreeBSD/OSX and Illumos add "rune" as an
>> > extra class here.
>
>> > Glenn: What do you think about the idea of using "getconf" for this?
>> > If you think this is OK then I can provide code that can test these
>> > "well-known" names (erm... including the "is"-prefix for character
>> > classes) for both (note that we cannot cache the values because they
>> > depend on LANG/LC_CTYPE/LC_ALL and IMO it's cheaper to probe the
>> > values each time "getconf" is called than trying to add more code for
>> > caching and tracking of the values of LANG/LC_CTYPE/LC_ALL).
>
>> >
>
>> > Bye,
>> > Roland
>
>> > --
>> > __ . . __
>> > (o.\ \/ /.o) roland.ma...@nrubsig.org
>> > \__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
>> > /O /==\ O\ TEL +49 641 3992797
>> > (;O/ \/ \O;)
>
--
, __ ,
{ \/`o;-Olga Kryzhanovska -;o`\/ }
.'-/`-/ olga.kryzhanov...@gmail.com \-`\-'.
`'-..-| / http://twitter.com/fleyta \ |-..-'`
/\/\ Solaris/BSD//C/C++ programmer /\/\
`--` `--`
___
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers