locale(1) would be my first choice
but getconf(1) would be ok too

On Mon, 22 Oct 2012 01:34:38 +0200 Roland Mainz wrote:
> On Fri, Oct 19, 2012 at 3:38 PM, Cedric Blancher
> <cedric.blanc...@googlemail.com> wrote:
> > Request for enhancement: .sh.regex.available_character_class
> >
> > What do you think about adding a  .sh.regex.available_character_class
> > array variable which contains the list of available wctype character
> > classes for the current locale? I know there is no API to get a list
> > from the OS but libast could probe well-known names and put only those
> > in the array for which wctype() returned a non-0 value.

> IMO it's better to let "getconf" handle that job because these are
> locale properties which are not limited to the shell.
> AFAIK we need two different "getconf" properties - one for regex
> character classes and one for |wctrans()| transformations.
> I did some digging... and it seems Solaris 11 supports the following
> transformations (beyond POSIX; these are locale-dependent):
> -- snip --
> tojhira
> tojisx0201
> tojisx0208
> tojkata
> tolower
> toupper
> -- snip --
> ... Linux adds "totitle".

> Character classes (beyond POSIX; these are locale-dependent)
> supported by Solaris 11 are:
> -- snip --
> english
> gb
> ideogram
> jalpha
> jdigit
> jgen
> jgreek
> jhankana
> jhira
> jisx0201r
> jisx0208
> jisx0212
> jkanji
> jkata
> jparen
> jpunct
> jrussian
> jsci
> jspecial
> junit
> line
> number
> phonogram
> special
> wchar10
> wchar11
> wchar12
> wchar13
> wchar14
> wchar15
> wchar16
> wchar17
> wchar18
> wchar19
> wchar20
> wchar21
> wchar22
> wchar23
> wchar24
> wchar6
> wchar9
> -- snip --
> (note that some of these are erroneously prefixed with "is" in some
> older Solaris versions). FreeBSD/OSX and Illumos add "rune" as extra
> class here.

> Glenn: What do you think about the idea of using "getconf" for this ?
> If you think this is OK then I can provide code that can test these
> "well-known" names (erm... including the "is"-prefix for character
> classes) for both (note that we cannot cache the values because they
> depend on LANG/LC_CTYPE/LC_ALL and IMO it's cheaper to probe the
> values each time "getconf" is called than trying to add more code for
> caching and tracking of the values of LANG/LC_CTYPE/LC_ALL).
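
For illustration only, the probing described above could look roughly
like the sketch below. This is not code from libast or from anyone in
this thread: the function names probe_wctype_names() and
probe_wctrans_names() and the candidate arrays are hypothetical, and the
candidate lists are only a small sample of the names quoted above. The
sketch relies solely on the standard wctype() and wctrans() calls, which
return 0 for a name the current LC_CTYPE locale does not support.

```c
#include <stddef.h>
#include <wctype.h>

/* Hypothetical sample of "well-known" class names to probe (not exhaustive). */
const char *const sample_class_names[] = {
    "alnum", "alpha", "blank", "cntrl", "digit", "graph", "lower",
    "print", "punct", "space", "upper", "xdigit",
    "jhira", "jkata", "jkanji", "rune", NULL
};

/* Hypothetical sample of transformation names to probe (not exhaustive). */
const char *const sample_trans_names[] = {
    "tolower", "toupper", "totitle", "tojhira", "tojkata", NULL
};

/*
 * Store in out[] every name from the NULL-terminated list names[] that
 * wctype() accepts in the current locale. At most max names are stored.
 * Returns the number of names stored.
 */
size_t probe_wctype_names(const char *const *names, const char **out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; names[i] != NULL && n < max; i++) {
        if (wctype(names[i]) != (wctype_t)0)
            out[n++] = names[i];
    }
    return n;
}

/* Same idea for wctrans() transformation names. */
size_t probe_wctrans_names(const char *const *names, const char **out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; names[i] != NULL && n < max; i++) {
        if (wctrans(names[i]) != (wctrans_t)0)
            out[n++] = names[i];
    }
    return n;
}
```

A caller would first run setlocale(LC_CTYPE, "") so that
LANG/LC_CTYPE/LC_ALL take effect, then pass one of the sample arrays and
an output buffer. Nothing is cached, so each call reflects whatever
locale is active at that moment, which matches the "probe each time"
approach suggested above.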

> ----

> Bye,
> Roland

> -- 
>   __ .  . __
>  (o.\ \/ /.o) roland.ma...@nrubsig.org
>   \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
>   /O /==\ O\  TEL +49 641 3992797
>  (;O/ \/ \O;)

_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers
