On Sat, Feb 13, 2016 at 6:17 AM, Khaled Hosny <khaledho...@eglug.org> wrote:
> I’m wondering if it is possible to rename sal_Unicode, which is actually
> a unsigned 16 bit integer and thus can’t fit any Unicode character

s/any/every/ ?

>, to
> some less confusing name like sal_Ucs2 or even just use sal_uInt16 (but
> please not sal_Utf16 which would give the illusion that surrogate pairs
> and handled in a special way which I don’t think is true).

Actually surrogate pairs are supported (or at least the intent is there :-) ).

include/rtl/character.hxx:inline bool isHighSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline bool isLowSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_Unicode getHighSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_Unicode getLowSurrogate(sal_uInt32 code) {
include/rtl/character.hxx:inline sal_uInt32 combineSurrogates(sal_uInt32 high, sal_uInt32 low) {
etc...

So yeah, sal_Unicode is UTF-16, and it is quite common for UTF-16 to be
loosely called 'unicode'. 'Unicode' itself does not denote any specific
encoding, hence the names UTF-8, UTF-16 and UTF-32, the latter two coming
in BE and LE flavours.

>
> I count only ~7000 usages across the code base, so that is not such a
> huge task.

Internally it is doable; externally it is more of a problem, since
sal_Unicode is part of the stable external API. The best you can do is to
have an internal 'alias' for it.

It may indeed be useful, for more clarity, to have typedefs that are
explicit about the encoding:
sal_utf8, sal_utf16, sal_utf16be, sal_utf16le, sal_utf32, sal_utf32be,
sal_utf32le

Norbert
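
P.S. To make the surrogate point concrete, here is a minimal stand-alone sketch of the standard UTF-16 split and recombine for code points above U+FFFF. It does not use rtl/character.hxx: the names isHigh, isLow, highOf, lowOf and combine are made up for illustration, and std::uint16_t stands in for sal_Unicode. Treat it as a sketch of the arithmetic, not as what the rtl helpers actually contain.

```cpp
#include <cstdint>

// Hypothetical stand-alone helpers, NOT the rtl/character.hxx code.
// They only mirror the standard UTF-16 surrogate arithmetic.

// High (lead) surrogates occupy U+D800..U+DBFF.
constexpr bool isHigh(std::uint32_t c) { return c >= 0xD800 && c <= 0xDBFF; }

// Low (trail) surrogates occupy U+DC00..U+DFFF.
constexpr bool isLow(std::uint32_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Split a supplementary code point (U+10000..U+10FFFF) into 16-bit units.
// The caller is assumed to pass a value in that range.
constexpr std::uint16_t highOf(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xD800 + ((cp - 0x10000) >> 10));
}
constexpr std::uint16_t lowOf(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF));
}

// Recombine a high/low pair into the code point it encodes.
// The caller is assumed to have checked isHigh(high) and isLow(low).
constexpr std::uint32_t combine(std::uint32_t high, std::uint32_t low) {
    return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
}

// U+1F600 does not fit in 16 bits, so it needs two units.
static_assert(highOf(0x1F600) == 0xD83D, "high surrogate of U+1F600");
static_assert(lowOf(0x1F600) == 0xDE00, "low surrogate of U+1F600");
static_assert(combine(0xD83D, 0xDE00) == 0x1F600, "round trip");
```

So a single 16-bit unit cannot hold every code point, but a sequence of them can, which is the sense in which the type is UTF-16 rather than UCS-2.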