Re: Rename std.ctype to std.ascii?

Jonathan M Davis Tue, 14 Jun 2011 03:12:23 -0700

On 2011-06-14 02:51, David Nadlinger wrote:
> On 6/14/11 11:20 AM, Jonathan M Davis wrote:
> > On 2011-06-14 01:51, David Nadlinger wrote:
> >> But the functions in <ctype.h> do. And there can be some
> >> locale-dependent problems even if you use only ASCII, the most prominent
> >> being the different handling of »i« in the Turkish locale:
> >> http://www.i18nguy.com/unicode/turkish-i18n.html
> >> 
> >> This is probably another reason why it shouldn't be called std.ctype…
> >> 
> > From the looks of it, that affects extended ASCII but not ASCII (since
> > the Turkish uppercase I isn't even in ASCII). It's definitely a great
> > link though. Thanks!
> 
> Oh, I was probably a bit unclear – what I meant is that it affects you
> also if you use only ASCII input, since toupper('i') == 221 when your
> locale is tr_TR.ISO-8859-9.


Yes, but the result is extended ASCII, so it doesn't affect anything which 
only deals with pure ASCII. ctype.h deals with extended ASCII, so locales 
actually affect what it's doing. std.ctype only deals in pure ASCII, so it 
wouldn't do anything which would result in a non-ASCII character, and so 
locales shouldn't matter at all. However, if you _do_ want to bring locales 
into it, then a locale like tr_TR.ISO-8859-9 is not going to be able to 
operate purely in ASCII, since the uppercase value of i is 221, which is 
extended ASCII.

So, yes I understood. It's just that as far as I can tell, locales don't 
matter if you're completely restricting yourself to ASCII like std.ctype does. 
And std.ctype is not going to try to deal with locales at this point (and 
likely not ever). I think that is far better left to Unicode. The Turkish 
locale is a great example of why you _want_ to be dealing with Unicode when 
dealing with locales. std.ctype is for when you're specifically restricting 
yourself to ASCII (which can sometimes be very useful - e.g. with formatting 
strings or regex strings where all of the special characters are ASCII; using 
Unicode functions would just make them slower for no benefit and would risk 
changing behavior based on locale if you brought locales into it). If you're 
not restricting yourself to ASCII, then std.uni is the way to go.
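
To make the distinction concrete, here is a rough sketch in C of what 
"restricting yourself to ASCII" amounts to. The names ascii_is_lower and 
ascii_to_upper are made up for illustration (they are not the API of 
std.ctype or ctype.h), and the sketch assumes an ASCII-compatible execution 
character set. Because it never consults the locale, 'i' always maps to 'I', 
and anything outside a-z, including 221, comes back unchanged:

```c
#include <stdbool.h>

/* Hypothetical helper: true only for the ASCII letters 'a'..'z'.
   Assumes an ASCII-compatible character set, where a-z are contiguous. */
static bool ascii_is_lower(int c)
{
    return c >= 'a' && c <= 'z';
}

/* Hypothetical helper: upper-cases ASCII letters and nothing else.
   It never calls setlocale() or toupper(), so the current locale
   cannot change its result, and it never returns a value above 127
   for an input of 127 or below. */
static int ascii_to_upper(int c)
{
    return ascii_is_lower(c) ? c - ('a' - 'A') : c;
}
```

With something like this, ascii_to_upper('i') is 'I' regardless of locale, 
which is the behavior you'd want when scanning a format string or a regex for 
its special characters.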

- Jonathan M Davis
