Stephan Bergmann wrote:
> Herbert Duerr wrote:
>> To support characters outside of the unicode base plane I'd like to add a
>> new sal_UCS4 type to OpenOffice.
>> [...]
> See the thread at
> <http://www.openoffice.org/servlets/ReadMsg?listName=dev&msgNo=18462>
> for ideas how to change interfaces (sometimes it is better to replace
> sal_Unicode with rtl::OUString etc.).

Of course the complete string context is preferable to the current
16-bit sal_Unicode interface, but the rework cost of changing all of this
is overkill in a lot of situations, e.g. to get a character's
directionality, to get the character attributes needed for vertical layout,
to get the mirrored character, to get the spacing attribute (needed for
"word underline"), to get a digit character's localized equivalent, etc.
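
A minimal sketch of the kind of single-character lookup meant here, written
against a 32-bit code point instead of a 16-bit code unit. The names
CodePoint and getMirroredChar are hypothetical stand-ins (not existing
OpenOffice API), and the table holds only four ASCII bracket pairs for
illustration; a real implementation would use the full Unicode mirroring data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the proposed sal_UCS4 type.
typedef std::uint32_t CodePoint;

// Return the mirrored counterpart of c, or c itself if it has none.
// Assumption: only a few ASCII bracket pairs are listed here.
inline CodePoint getMirroredChar(CodePoint c)
{
    static const CodePoint pairs[][2] = {
        { 0x0028, 0x0029 },  // ( )
        { 0x003C, 0x003E },  // < >
        { 0x005B, 0x005D },  // [ ]
        { 0x007B, 0x007D }   // { }
    };
    for (std::size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); ++i)
    {
        if (c == pairs[i][0]) return pairs[i][1];
        if (c == pairs[i][1]) return pairs[i][0];
    }
    return c;  // no mirrored form known
}
```

Because the parameter is 32 bits wide, the same signature keeps working for
characters outside the base plane without needing a string context.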

> UCS-2 and UCS-4 are not Unicode (<www.unicode.org>) terms.

Yup, they are ISO-10646 terms.

> sal_Unicode represents a UTF-16 code unit (without any ambiguity).

I hope we can agree that an interface taking a single UTF-16 code unit is a
broken design for characters outside the Unicode base plane.
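
To illustrate why: a character outside the base plane takes two UTF-16 code
units (a surrogate pair), so a one-unit parameter can only ever carry half of
it. The sketch below uses hypothetical typedefs (Utf16Unit in the role of
sal_Unicode, CodePoint in the role of the proposed sal_UCS4); the helper
names are assumptions for illustration, not existing API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the types discussed in this thread.
typedef std::uint16_t Utf16Unit;  // plays the role of sal_Unicode
typedef std::uint32_t CodePoint;  // plays the role of the proposed sal_UCS4

inline bool isHighSurrogate(Utf16Unit u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isLowSurrogate(Utf16Unit u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Combine a surrogate pair into the code point it encodes.
// Precondition: hi is a high surrogate and lo is a low surrogate.
inline CodePoint combineSurrogates(Utf16Unit hi, Utf16Unit lo)
{
    assert(isHighSurrogate(hi) && isLowSurrogate(lo));
    return 0x10000u
         + ((static_cast<CodePoint>(hi) - 0xD800u) << 10)
         + (static_cast<CodePoint>(lo) - 0xDC00u);
}
```

For example, the pair 0xD835 0xDD04 combines to U+1D504; neither unit alone
identifies that character.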

>> Of course the interfaces could be changed to something like sal_uInt32,
>> but then a lot of interesting type information would be lost.
> [...]
> Typedefs in C++ are, well, strange beasts.  As a client you often have
> to be aware of exactly what other type the typedef aliases (e.g., when
> declaring overloaded functions, when using varargs, printf, when
> determining whether there is an appropriate streaming operator <<, when
> building expressions on integer types).

At least the typedef makes the meaning clearer, and if there ever is a
problem with ambiguity, then changing the simple typedef to a more explicit
type with rigorously defined conversions from/to other types is possible.
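
A rough sketch of that fallback, assuming hypothetical names (UCS4Alias,
UCS4Char, kind) that exist nowhere in the code base: the alias is
indistinguishable from its underlying integer type, while a small class with
explicit conversions is a type of its own and can take part in overloading.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: the simple typedef. It is the same type as std::uint32_t.
typedef std::uint32_t UCS4Alias;

// Hypothetical: a distinct type with explicit conversions only.
class UCS4Char
{
public:
    explicit UCS4Char(std::uint32_t value) : m_value(value)
    {
        assert(value <= 0x10FFFFu);  // code points end at U+10FFFF
    }
    std::uint32_t toUInt32() const { return m_value; }
private:
    std::uint32_t m_value;
};

// With the class these are two separate overloads. Declaring a
// kind(UCS4Alias) next to kind(std::uint32_t) would instead just
// redeclare the same function.
inline char kind(std::uint32_t) { return 'i'; }  // integer
inline char kind(UCS4Char)      { return 'c'; }  // character
```

Client code that only names the typedef would mostly keep compiling after
such a switch; the places that break are the ones relying on implicit
integer conversions, which are the places worth reviewing anyway.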

>> - unicode values beyond 2^32 are not unthinkable
> 
> How do you come to think that?  ;)

If your pragmatic approach of using sal_uInt32 is taken, then finding all
the places that would need to be adjusted would be much more costly than
simply finding and checking uses of sal_UCS4.

>> Did I miss any important issues against adding a sal_UCS4 type?
> 
> Would that be "sal_UCS4" or "sal_Ucs4"?

I'd prefer sal_UCS4 because I've never seen the abbreviation for "Universal
Character Set" spelled in non-caps.

Anyway, I'm already more than overloaded with work and don't have much time
for discussions about different tastes. I'll use sal_uInt32... :-(

--
Herbert
