On 27.01.2016 13:29, Jim Jagielski wrote:
>> On Jan 27, 2016, at 4:44 AM, Branko Čibej <br...@apache.org> wrote:
>>
>>
>> Hmph, it's concise, not confusing. Subversion's APIs expect all strings
>> to be encoded in UTF-8, so the docstring can't just say
>> "case-insensitive" because that would be extremely misleading in that
>> context.
>>
>> APR makes no promises about the encoding, but mentioning that these
>> functions are designed to work with the ASCII subset (or EBCDIC
>> equivalent of same) would be quite important, I think?
> I have no idea how encoding matters at all to the meaning
> of case sensitivity... unless, somehow, 'A' and 'a' are
> encoded to the exact same value.

The important part is the bit about "unaccented Latin letters". Without
that clarification, "case-insensitive" in UTF-8 implies that the byte
sequence "\xC7\xBC" compares equal to "\xC7\xBD" (i.e., 'Ǽ' == 'ǽ'),
which is clearly not how that function works; and I'm ignoring fun
issues with Unicode normalization forms.

So yes, encoding does matter.
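
To make that concrete, here is a minimal sketch in Python. It is not APR's or Subversion's actual code, and the name ascii_casecmp is made up for this example. It assumes a comparison that folds only the bytes for ASCII 'A' to 'Z' and leaves every other byte alone, which is the behaviour the "unaccented Latin letters" wording describes.

```python
def ascii_casecmp(a: bytes, b: bytes) -> int:
    """Hypothetical helper: compare byte strings, folding only ASCII A-Z.

    Returns a negative, zero, or positive value, like strcmp().
    """
    def fold(byte: int) -> int:
        # Only 0x41..0x5A ('A'..'Z') are mapped to lowercase;
        # bytes >= 0x80 (UTF-8 lead and continuation bytes) pass through.
        return byte + 0x20 if 0x41 <= byte <= 0x5A else byte

    fa = bytes(fold(x) for x in a)
    fb = bytes(fold(x) for x in b)
    return (fa > fb) - (fa < fb)


upper = "Ǽ".encode("utf-8")  # the two bytes C7 BC
lower = "ǽ".encode("utf-8")  # the two bytes C7 BD

# Unaccented Latin letters compare equal regardless of case.
print(ascii_casecmp(b"Hello", b"hELLO") == 0)

# The accented pair does not, because no byte falls in the ASCII A-Z range.
print(ascii_casecmp(upper, lower) != 0)

# A Unicode-aware case fold, by contrast, treats the pair as equal.
print("Ǽ".casefold() == "ǽ".casefold())
```

A reader who sees only "case-insensitive" next to a UTF-8 API could reasonably expect the behaviour of the last line, while the function delivers the behaviour of the second.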

-- Brane
