On Tue, Dec 10, 2002 at 01:43:44PM -0500, Jungshik Shin wrote:
> 
> On Tue, 10 Dec 2002, Maiorana, Jason wrote:
> 
> > >> If this is not the case, is there any locale which will correctly
> > >> ctype() all of unicode?
> > >
> > >  There's NO single 'correct' way, although there can be a 'generic' one:
> > >isupper, islower, toupper, tolower and so forth work differently
> > >depending on the language/region of the locale.
> >
> > The unicode standard itself seems to provide standard mappings of
> > upper, lower, and title case. The locale system does not seem to
> 
>   The Unicode standard does provide the *default*, but that default can be
> tailored and overridden depending on language/locale/region.
> That is, what's correct for English may not be correct
> for Turkish, Irish, Swedish, Dutch, Russian and Bulgarian
> however minor those differences might be. That's what I meant when
> I wrote that there is not 'the' correct way.

Also, ISO/IEC TR 14652 provides default values for upper and lower.
The default values in 14652 are actually taken from an earlier version
of the locales in glibc. I am not sure, but I think this is what
glibc still uses, rather than the Unicode tables.

> > but I don't see why an "isspace" function couldn't
> > work correctly for all of Unicode/all languages.
> 
>   I also think that *some* categories in LC_CTYPE
> appear to be language-neutral, but I can't be 100% sure. You never know.

I would think many of the LC_CTYPE categories are language neutral.
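
As a rough illustration of a language-neutral class, here is a small
Python sketch. It relies on Python's str.isspace(), which consults the
Unicode character database and takes no locale argument. Whether that
set is exactly what an LC_CTYPE "space" class should contain is an
assumption here, and the sample code points and the is_space helper
are picked only for illustration.

```python
import unicodedata

# A few code points from different blocks, chosen only for illustration.
SAMPLES = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00a0",
    "EM SPACE": "\u2003",
    "IDEOGRAPHIC SPACE": "\u3000",
    "LATIN SMALL LETTER A": "a",
}


def is_space(ch):
    """Locale-independent check driven by the Unicode character database."""
    return ch.isspace()


for name, ch in SAMPLES.items():
    category = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} {name}: category={category} space={is_space(ch)}")
```

The answer depends only on the code point, not on any language or
region setting, which is the property one would want from such a class.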
> 
> > Duplicating the full case conversion tables for all installed
> > locales does seem a bit redundant... Instead maybe a small file
> > like:
> 
>   No doubt there should be an efficient way to share what's common
> across lang/region/script and store only the 'tailoring delta'
> separately for each lang/region/script.  Well, someone might say that
> disk is cheap.....
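
The "shared default plus tailoring delta" idea quoted above could be
sketched roughly as follows, in Python. This is only an illustration
and not how glibc or 14652 store anything: the table names, the
language keys and the to_upper/to_lower helpers are all hypothetical.
The shared default is Python's built-in, locale-independent Unicode
case mapping, and each language stores only the entries that differ.

```python
# Hypothetical per-language deltas: only the mappings that differ from
# the shared default are stored.
UPPER_TAILORING = {
    "tr": {"i": "\u0130"},  # Turkish: i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
}
LOWER_TAILORING = {
    "tr": {"I": "\u0131", "\u0130": "i"},  # Turkish: I -> dotless i
}


def to_upper(text, lang=None):
    """Uppercase text, consulting the language delta before the default."""
    delta = UPPER_TAILORING.get(lang, {})
    return "".join(delta.get(ch, ch.upper()) for ch in text)


def to_lower(text, lang=None):
    """Lowercase text, consulting the language delta before the default."""
    delta = LOWER_TAILORING.get(lang, {})
    return "".join(delta.get(ch, ch.lower()) for ch in text)


print(to_upper("istanbul"))        # default mapping
print(to_upper("istanbul", "tr"))  # Turkish tailoring applied
print(to_lower("ISTANBUL", "tr"))
```

Mapping one character at a time ignores context-sensitive rules, so
this is a sketch of the storage idea only, not a full case converter.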

14652 has tailoring for sorting, but not for LC_CTYPE. We are looking
at revising the just-approved 14652, so proposals are welcome.

Kind regards
keld
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
