Re: RFC: Case-Insensitive Strings (And usually they really do have case)

Daniel Gibson Mon, 10 Jan 2011 13:35:18 -0800

On 10.01.2011 22:16, Michel Fortin wrote:

On 2011-01-10 13:46:55 -0500, "Nick Sabalausky" <[email protected]> said:

Not carrying any other data means not caching the lowercase version, which
means recreating the lowercase version more than necessary. So it's the
classic speed vs. space tradeoff. I would think there would be cases where
they get compared enough for that to make a difference, although I suppose
we'd really need benchmarks to see. OTOH, there are certainly cases (such as
my original motivating case) where the extra space is not an issue at all.


Comparing the lowercase version of two strings works well for ASCII, but I doubt
it works very well for Unicode. Case conversion is not bidirectional (for
instance both 'SS' and 'ß' become 'ss' in lowercase in German),

That's wrong, 'ß' is lowercase and no upper-case version is really used, though
one exists in Unicode (see: http://en.wikipedia.org/wiki/Capital_%C3%9F ).
Sometimes, when stuff is written in full caps, 'ß' (which is never the first
character of a word) is replaced by "SS", but I wouldn't expect that to be equal
on icmp() (e.g. "Strings vergleichen macht keinen Spaß!" vs "STRINGS
VERGLEICHEN MACHT KEINEN SPASS!").

Anyway, in this case comparing in lowercase would cause no trouble at all
(comparing in uppercase however would, if you don't use the
not-really-existing-but-defined-by-unicode-Capital-ß).
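
A rough sketch of the difference, in Python only because its built-in string
methods apply the Unicode case mappings. This is not D or Phobos code, and the
helper names equal_in_lowercase and equal_in_uppercase are made up for the
illustration:

```python
# Illustration only: Python's str.lower()/str.upper() apply the Unicode
# case mappings. The helper names below are hypothetical.

lower_text = "Strings vergleichen macht keinen Spaß!"
caps_text = "STRINGS VERGLEICHEN MACHT KEINEN SPASS!"


def equal_in_lowercase(a: str, b: str) -> bool:
    """Compare two strings after lowercasing both."""
    return a.lower() == b.lower()


def equal_in_uppercase(a: str, b: str) -> bool:
    """Compare two strings after uppercasing both."""
    return a.upper() == b.upper()


# 'ß' is already lowercase, so it stays distinct from "ss".
print(equal_in_lowercase(lower_text, caps_text))  # False

# Uppercasing maps 'ß' to "SS", so the two strings collapse together.
print(equal_in_uppercase(lower_text, caps_text))  # True

# The capital ẞ (U+1E9E) lowercases to 'ß', so it causes no trouble either.
print("\u1e9e".lower() == "ß")  # True
```

So with a lowercase comparison the two example sentences stay unequal, while an
uppercase comparison would treat them as the same string.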

I don't know if there may be problems with special characters in other
languages, though.

and what's equal and what is not sometimes depends on the language.

Checking for string equality is a special case of the Unicode collation
algorithm. I'm not sure if implementing this part of Unicode is in the scope of
Phobos (probably not), but short of having Unicode support it seems the utility
of having a special string type dedicated to ASCII case-insensitive strings is
quite limited.
