At 2:48 AM -0700 4/22/04, Jeff Clites wrote:
On Apr 21, 2004, at 7:33 PM, Dan Sugalski wrote:

At 11:17 AM -0700 4/21/04, Jeff Clites wrote:
On Apr 21, 2004, at 10:20 AM, Dan Sugalski wrote:

Just to make sure... we're making sure the strings are always properly decomposed before comparing, right?

Nope, this is a literal "equal" comparison--you'd build a normalized compare on top of this.

I think this got caught on the list queue for a bit, and it's already been addressed, but just to be clear, Parrot's keeping decomposable characters decomposed, and generally normalizing, or at least pretending it's normalizing if it doesn't actually do so, when working with strings.

Yes, in order to define notions of normalized equivalence, you need a notion of strict equality on which to base them. string_equal() is the latter; the former are yet-to-be-coded.

I think, honestly, that for strings in a character set with multiple variants that are declared equal, I want strict equality to be based on the canonical form of the strings rather than the binary form. For Unicode the standard defines them as identical, and if we have a mix of "really identical" and "logically identical" tests we're going to get a lot of subtle and damned annoying bugs seeping in.
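
To make the distinction concrete, here is a rough sketch in Python rather than Parrot's C, so none of this is Parrot's actual API. The names strict_equal and canonical_equal are made up for illustration; the only real library call is unicodedata.normalize from the Python standard library. It shows a binary-form comparison and a canonical-form comparison disagreeing on strings that Unicode treats as canonically equivalent:

```python
import unicodedata


def strict_equal(a: str, b: str) -> bool:
    # Hypothetical helper: literal comparison, code point by code point.
    return a == b


def canonical_equal(a: str, b: str) -> bool:
    # Hypothetical helper: compare canonical decompositions (NFD),
    # so canonically equivalent sequences compare equal.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


composed = "\u00e9"      # e-acute as one precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(strict_equal(composed, decomposed))     # binary forms differ
print(canonical_equal(composed, decomposed))  # canonical forms match
```

If strings are always kept decomposed internally, the two tests collapse into one, which is the point: there is then no "really identical" versus "logically identical" split for callers to trip over.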
--
Dan


--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
