Re: [PATCH] STDCXX-1073

Martin Sebor Wed, 24 Oct 2012 08:12:18 -0700

On 10/24/2012 07:12 AM, Liviu Nicoara wrote:

A few observations after spending a few hours working on the improved test.


On 10/21/12 19:08, Martin Sebor wrote:

...
There's no requirement that embedded NULs must be preserved
(that's just how it happens to be implemented). I find it best
to avoid relying on the knowledge of implementation details
when exercising the library so that tests don't start failing
after a conforming optimization or some such tweak is added
to the code.


I agree with the implementation details part. However, there is only one
way NULs get processed and that is by keeping them in the string,
verbatim. In this respect the previous test incarnation was a stronger
test.


Consider an encoding where UCHAR_MAX doesn't correspond
to a valid character, and a hypothetical implementation that,
before transforming the string, increments the value of each
character by 1. That will lose all the initial embedded NULs.
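
To make the hypothetical concrete, here is a minimal sketch of such a pre-processing step. It is not the stdcxx implementation; the name shift_by_one is made up for illustration, and the sketch assumes that no input character equals UCHAR_MAX, so the shift keeps the relative order of characters while turning every embedded NUL into '\1'.

```cpp
#include <string>

// Hypothetical pre-processing step (not stdcxx code): shift every
// character up by one before the string would be handed to the libc
// transformation function.
// Assumption: no character in the input equals UCHAR_MAX.
std::string shift_by_one(const std::string &s)
{
    std::string out(s);
    for (std::string::size_type i = 0; i != out.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(out[i]);
        out[i] = static_cast<char>(c + 1);   // NUL becomes '\1'
    }
    return out;
}
```

A test that looks for NULs in the result would fail against such an implementation even though the ordering of the transformed strings is unaffected.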


I modified the test according to the suggestions. The test fails all
corresponding wide-char cases and I am investigating other potential
defects as well. For example, I do not think that employing strcoll and
wcscoll in compare is correct as they stop at the first NUL, although
strings may contain characters after the NUL that alter the result of
the comparison.
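
A small sketch of the concern, assuming the default "C" locale. The names naive_compare and segmented_compare are illustrative and are not stdcxx functions; the segmented version is only one possible way to take characters after a NUL into account, not necessarily the right fix.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// strcoll only sees the characters before the first NUL, so anything
// after an embedded NUL is ignored.
int naive_compare(const std::string &a, const std::string &b)
{
    return std::strcoll(a.c_str(), b.c_str());
}

// Hypothetical alternative: compare the NUL-delimited segments one at
// a time, treating a string that runs out of segments first as less.
int segmented_compare(const std::string &a, const std::string &b)
{
    std::size_t i = 0, j = 0;
    for (;;) {
        const int r = std::strcoll(a.c_str() + i, b.c_str() + j);
        if (r)
            return r;
        // Advance to the NUL that ended each segment.
        i += std::strlen(a.c_str() + i);
        j += std::strlen(b.c_str() + j);
        const bool a_done = i == a.size();
        const bool b_done = j == b.size();
        if (a_done || b_done)
            return a_done ? (b_done ? 0 : -1) : 1;
        ++i;   // skip the embedded NUL in both strings
        ++j;
    }
}
```

With a = "abc\0x" and b = "abc\0y" (both five characters long), naive_compare reports the strings as equal while segmented_compare does not.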


I would expect the wchar_t specialization to be analogous
to the narrow one. In fact (without looking at the code),
I would even think both could be implemented in terms of
the same function template specialized on the character
type and on the libc string function. (Although I'm not
necessarily suggesting this as the solution to this issue.)
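
As a rough sketch of that idea (written without reference to the actual stdcxx sources), a single function template could pick the libc function through a small traits class. The names xfrm_traits and generic_transform are hypothetical, error handling is omitted, and this simple version still stops at the first NUL.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical traits class selecting the libc function per character type.
template <class CharT> struct xfrm_traits;

template <> struct xfrm_traits<char> {
    static std::size_t xfrm(char *dst, const char *src, std::size_t n)
    { return std::strxfrm(dst, src, n); }
};

template <> struct xfrm_traits<wchar_t> {
    static std::size_t xfrm(wchar_t *dst, const wchar_t *src, std::size_t n)
    { return std::wcsxfrm(dst, src, n); }
};

// One implementation shared by the narrow and wide cases.
// Note: only the characters before the first NUL are transformed.
template <class CharT>
std::basic_string<CharT>
generic_transform(const std::basic_string<CharT> &s)
{
    // A zero-length call only reports the required size.
    const std::size_t n = xfrm_traits<CharT>::xfrm(0, s.c_str(), 0);

    std::basic_string<CharT> out(n + 1, CharT());
    xfrm_traits<CharT>::xfrm(&out[0], s.c_str(), n + 1);
    out.resize(n);   // drop the terminating NUL written by libc
    return out;
}
```

Calling generic_transform on a std::string and on a std::wstring then goes through the same code path, which is what would make the two specializations behave analogously.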

Attached is a test case that reproduces the bug without relying
on implementation details. The test passes with your patch,
confirming that it's good. This test still makes the assumption
that lexicographically distinct strings compare unequal in the
en_US locale, but I think that's a safe assumption regardless
of the encoding (though only for the hardcoded strings).
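
The attachment is not reproduced in this message. Purely as an illustration of the kind of check described, here is a sketch that looks only at whether two distinct strings with embedded NULs transform to different keys, without inspecting the keys themselves. The helper name transforms_differ is made up, and the sketch defaults to the classic "C" locale rather than en_US, which may not be installed.

```cpp
#include <locale>
#include <string>

// Returns true if a and b transform to different collation keys.
// Only the relationship between the keys is checked, never their
// contents, so the check does not depend on how NULs are represented.
bool transforms_differ(const std::string &a, const std::string &b,
                       const std::locale &loc = std::locale::classic())
{
    const std::collate<char> &col =
        std::use_facet<std::collate<char> >(loc);

    const std::string ka = col.transform(a.data(), a.data() + a.size());
    const std::string kb = col.transform(b.data(), b.data() + b.size());
    return ka != kb;
}
```

For example, transforms_differ(std::string("a\0b", 3), std::string("a\0c", 3)) is expected to be true for an implementation that handles the characters after the NUL.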


All narrow test cases pass fine, but I gotta look into the wide-char
failures.


Good to hear! I wish I had more time to look into it with you.

Martin


Liviu
