Hi all

(especially Konstantin!)

When reviewing a change, I noticed that QString::startsWith with 
CaseInsensitive compares things like this:

            if (foldCase(data[i]) != foldCase((ushort)latin[i]))
                return false;

with foldCase() being convertCase_helper<QUnicodeTables::CasefoldTraits>(ch), 
whereas toLower() uses QUnicodeTables::LowercaseTraits.

There's a slight but important difference for a few character pairs; see below. 
The code has been like that since forever. So I have to ask:

        => Is this intended?

If you write code like:

        qDebug() << a.startsWith(b, Qt::CaseInsensitive)
                << (a.toLower() == b.toLower());

You'll get a different result for the following pairs, for example (see
util/unicode/data/CaseFolding.txt for more):

µ U+00B5 MICRO SIGN
μ U+03BC GREEK SMALL LETTER MU

s U+0073 LATIN SMALL LETTER S
ſ U+017F LATIN SMALL LETTER LONG S
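
To make the difference concrete, here is a minimal standalone sketch in plain
C++ (not Qt's actual code). The names toyLower, toyFold, equalMapped and
checkPairs are hypothetical, and the two mapping functions cover only ASCII
plus the characters listed above, with values taken from the Unicode simple
lowercase and simple case folding data:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy mapping: simple lowercase for ASCII and Greek capital mu.
// U+00B5 and U+017F are already lowercase letters, so they map to themselves.
char32_t toyLower(char32_t c)
{
    if (c >= U'A' && c <= U'Z')
        return c + 0x20;
    if (c == 0x039C)            // GREEK CAPITAL LETTER MU
        return 0x03BC;          // GREEK SMALL LETTER MU
    return c;
}

// Hypothetical toy mapping: simple case folding, which additionally unifies
// the two "odd" lowercase letters with their ordinary counterparts.
char32_t toyFold(char32_t c)
{
    if (c == 0x00B5)            // MICRO SIGN
        return 0x03BC;          // folds to GREEK SMALL LETTER MU
    if (c == 0x017F)            // LATIN SMALL LETTER LONG S
        return 0x0073;          // folds to LATIN SMALL LETTER S
    return toyLower(c);
}

// Compare two strings code point by code point after applying `map`.
bool equalMapped(const std::u32string &a, const std::u32string &b,
                 char32_t (*map)(char32_t))
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (map(a[i]) != map(b[i]))
            return false;
    }
    return true;
}

// The two pairs above: different after lowercasing, equal after folding.
void checkPairs()
{
    assert(!equalMapped(U"\u00B5", U"\u03BC", toyLower));
    assert(equalMapped(U"\u00B5", U"\u03BC", toyFold));
    assert(!equalMapped(U"s", U"\u017F", toyLower));
    assert(equalMapped(U"s", U"\u017F", toyFold));
}
```

This only illustrates why a folding-based comparison and a toLower()-based
comparison can disagree; the real tables are of course much larger.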

And then there are the differences between toUpper and toLower. The following 
pairs compare false with toLower(), compare true with toUpper(), currently 
compare false with CaseInsensitive/toCaseFolded() but *should* compare 
true[1]:

ß U+00DF LATIN SMALL LETTER SHARP S
ẞ U+1E9E LATIN CAPITAL LETTER SHARP S
SS

ŉ U+0149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
ʼN

ﬀ U+FB00 LATIN SMALL LIGATURE FF
FF

[1] CaseFolding.txt says:
# The data supports both implementations that require simple case foldings
# (where string lengths don't change), and implementations that allow full
# case folding (where string lengths may grow). Note that where they can be
# supported, the full case foldings are superior: for example, they allow
# "MASSE" and "Maße" to match.
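
As a rough illustration of what full case folding would mean for the pairs
above, here is a small standalone C++ sketch. It is an assumption-laden toy,
not a proposal for the Qt implementation: toyFullFold and equalFullFolded are
hypothetical names, only ASCII and the three special characters from this mail
are handled, and it works on UTF-32 rather than QString's UTF-16. The
expansions follow the "F" entries in CaseFolding.txt:

```cpp
#include <string>

// Hypothetical toy full case folding: one input code point may expand to
// several output code points, so the result can be longer than the input.
std::u32string toyFullFold(const std::u32string &in)
{
    std::u32string out;
    for (char32_t c : in) {
        switch (c) {
        case 0x00DF:                // LATIN SMALL LETTER SHARP S
        case 0x1E9E:                // LATIN CAPITAL LETTER SHARP S
            out += U"ss";
            break;
        case 0x0149:                // LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
            out += U"\u02BCn";      // MODIFIER LETTER APOSTROPHE + n
            break;
        case 0xFB00:                // LATIN SMALL LIGATURE FF
            out += U"ff";
            break;
        default:
            // ASCII-only lowercasing; everything else is passed through.
            out += (c >= U'A' && c <= U'Z') ? static_cast<char32_t>(c + 0x20) : c;
            break;
        }
    }
    return out;
}

// Case-insensitive equality based on the folded (possibly longer) strings.
bool equalFullFolded(const std::u32string &a, const std::u32string &b)
{
    return toyFullFold(a) == toyFullFold(b);
}
```

With this, equalFullFolded(U"MASSE", U"Ma\u00DFe") is true even though the two
inputs have different lengths, which a character-by-character comparison like
the one in startsWith() cannot express.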

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center

