On Wed, Aug 26, 2009 at 1:21 PM, Ken Thomases<[email protected]> wrote:
> On Aug 26, 2009, at 10:43 AM, Michael Ash wrote:
>
>> On Wed, Aug 26, 2009 at 5:42 AM, Ken Thomases<[email protected]> wrote:
>>>
>>> On Aug 25, 2009, at 7:21 PM, Ross Carter wrote:
>>>
>>>>> I haven't tried it, but this should work:
>>>>>
>>>>>       NSAttributedString* original = whatever;
>>>>>       NSMutableAttributedString* normalized = [[original mutableCopy] autorelease];
>>>>>       // Normalize through the mutable copy's string proxy; the immutable
>>>>>       // original has no -mutableString.
>>>>>       CFMutableStringRef str = (CFMutableStringRef)[normalized mutableString];
>>>>>       CFStringNormalize(str, kCFStringNormalizationFormD);
>>>>>
>>>>> This works because -[NSMutableAttributedString mutableString] is a proxy
>>>>> that automatically fixes up the attribute runs held by its owner.
>>>
>>> Hmm, this seems dangerous in the sense that the conversion may be lossy.
>>> As far as I can see, there's no guarantee that CFStringNormalize will
>>> perform minimal replacements.  If it does not, then whole ranges of
>>> characters may have their attributes reset to that of the first replaced
>>> character.
>>>
>>> Even if testing reveals it to be non-lossy under one testing environment,
>>> without a guarantee that might differ under any other testing environment.
>>
>> http://en.wikipedia.org/wiki/Unicode_equivalence
>>
>> [... quote snipped ...]
>
> I'm well aware of what it means.  The question is which exact operations
> CFStringNormalize performs on the mutable string proxy.  If it performs
> the minimal replace operations needed to get the result, then it will
> preserve the attributes closely.  It's conceivable, though, that
> CFStringNormalize uses a side buffer to compute the normalized form and
> then does one big replace of the whole mutable string's range, or
> anything in between.  For example, it might replace a series of
> precomposed characters with their decompositions in a single replace
> operation.  In that case, the attributes of most of those characters will
> be lost (replaced with the attributes of the first character in the
> replace range).
>
> So, it's clear that the _strings_ will always have a deterministic value as
> a result of normalization.  That's the point of normalization.  But the
> _attributed strings_ may not.

Fair enough. However, as Douglas pointed out, you aren't guaranteed
consistent results if you have multiple attributes within a single
decomposed character range *anyway*, so you're going to have trouble
regardless. Better to avoid that situation altogether.
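
To make the difference between minimal and bulk replacement concrete, here
is a rough sketch in Python. It is a toy model, not Cocoa or CoreFoundation
code: the AttributedText class, its replace method, and the two normalize_*
helpers are hypothetical names invented for illustration. It assumes the
rule described in the quoted text, that replaced text takes on the
attributes of the first character in the replaced range. It says nothing
about what CFStringNormalize actually does internally.

```python
import unicodedata


class AttributedText:
    """Toy model: one attribute value per character (not a Cocoa API)."""

    def __init__(self, text, attrs):
        assert len(text) == len(attrs)
        self.chars = list(text)
        self.attrs = list(attrs)

    def replace(self, start, length, new_text):
        # Assumed rule from the thread: inserted text takes the attributes
        # of the first character in the replaced range.
        inherited = self.attrs[start]
        self.chars[start:start + length] = list(new_text)
        self.attrs[start:start + length] = [inherited] * len(new_text)

    def text(self):
        return "".join(self.chars)


def normalize_minimal(s):
    """Replace each character separately with its own NFD decomposition.

    This is enough for the precomposed characters used below; it does not
    handle reordering of combining marks across characters.
    """
    i = 0
    while i < len(s.chars):
        decomposed = unicodedata.normalize("NFD", s.chars[i])
        s.replace(i, 1, decomposed)
        i += len(decomposed)


def normalize_bulk(s):
    """Compute NFD in a side buffer, then replace the whole range at once."""
    s.replace(0, len(s.chars), unicodedata.normalize("NFD", s.text()))


source = "\u00e9a\u00fc"  # e-acute, a, u-diaeresis; two are precomposed
styles = ["bold", "plain", "italic"]

fine = AttributedText(source, styles)
normalize_minimal(fine)

coarse = AttributedText(source, styles)
normalize_bulk(coarse)

print(fine.text() == coarse.text())  # the plain strings agree
print(fine.attrs)    # per-character attributes survive
print(coarse.attrs)  # everything collapses to the first attribute
```

In this model both strategies produce the same normalized string, but only
the per-character strategy keeps "plain" and "italic"; the bulk strategy
leaves every character "bold".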

Mike
_______________________________________________

Cocoa-dev mailing list ([email protected])
