Michael mentioned to me that he had seen cases where
"RTL_CONSTASCII_STRINGPARAM" was accidentally used instead of
"RTL_CONSTASCII_USTRINGPARAM". Here's the problem...
#define RTL_CONSTASCII_STRINGPARAM( constAsciiStr ) constAsciiStr,
((sal_Int32)sizeof(constAsciiStr)-1)
which turns into two arguments (the string and its length), and
#define RTL_CONSTASCII_USTRINGPARAM( constAsciiStr ) constAsciiStr,
((sal_Int32)(sizeof(constAsciiStr)-1)), RTL_TEXTENCODING_ASCII_US
which turns into three arguments, the third one being the encoding.
The problem arises when someone is using the old "String" class because
it has a load of constructors, two of which are...
UniString( const sal_Char* pByteStr, rtl_TextEncoding eTextEncoding,
sal_uInt32 nCvtFlags = BYTESTRING_TO_UNISTRING_CVTFLAGS );
and
UniString( const sal_Char* pByteStr, xub_StrLen nLen, rtl_TextEncoding
eTextEncoding, sal_uInt32 nCvtFlags =
BYTESTRING_TO_UNISTRING_CVTFLAGS );
So if someone uses RTL_CONSTASCII_USTRINGPARAM the right thing happens
and the better fitting second ctor is selected, char*, len, encoding all
filled in correctly.
On the other hand, if someone uses RTL_CONSTASCII_STRINGPARAM, only two
arguments are passed, so the first ctor is the one that fits and the
*ENCODING* argument is filled in with the length of the string. Puke.
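
To make the overload selection concrete, here is a minimal standalone
sketch. It does not use the real headers: MockString, Int32, StrLen,
TextEncoding, ENCODING_ASCII_US and the two shortened macro names are
hypothetical stand-ins I made up to mirror the shapes quoted above, and
the value 11 is only illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for the sal/rtl typedefs, not the real headers.
typedef std::int32_t  Int32;         // plays the role of sal_Int32
typedef std::uint16_t StrLen;        // plays the role of xub_StrLen
typedef std::uint16_t TextEncoding;  // plays the role of rtl_TextEncoding

const TextEncoding ENCODING_ASCII_US = 11;  // illustrative value only

// Same shape as the two macros quoted above.
#define STRINGPARAM(s)  s, ((Int32)sizeof(s) - 1)
#define USTRINGPARAM(s) s, ((Int32)(sizeof(s) - 1)), ENCODING_ASCII_US

struct MockString {
    std::size_t  length;
    TextEncoding encoding;
    bool         lengthWasPassed;

    // Mirrors the (char*, encoding, flags = default) ctor.
    MockString(const char* p, TextEncoding enc, std::uint32_t = 0)
        : length(std::strlen(p)), encoding(enc), lengthWasPassed(false) {}

    // Mirrors the (char*, len, encoding, flags = default) ctor.
    MockString(const char*, StrLen len, TextEncoding enc, std::uint32_t = 0)
        : length(len), encoding(enc), lengthWasPassed(true) {}
};

inline void demonstrate() {
    // Three arguments: the (char*, len, encoding) ctor is chosen.
    MockString right(USTRINGPARAM("hello"));
    assert(right.lengthWasPassed);
    assert(right.length == 5);
    assert(right.encoding == ENCODING_ASCII_US);

    // Two arguments: only the (char*, encoding) ctor is viable, so the
    // length 5 is silently converted and stored as the encoding.
    MockString wrong(STRINGPARAM("hello"));
    assert(!wrong.lengthWasPassed);
    assert(wrong.encoding == 5);
}
```

Calling demonstrate() from a main() should pass both sets of
assertions, and the "wrong" case compiles without any error, which is
exactly why the misuse went unnoticed.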
I should note that none of these errors are new or have been introduced
recently, but have been lurking in the code for quite a while.
Attached is a patch to the String class which adds a better-matching
constructor that exactly matches the output of
RTL_CONSTASCII_STRINGPARAM but marks it private, so this misuse is
detected at compile time. That should make it impossible for this to
happen again, while retaining any correct uses of the first constructor
where a real 16-bit rtl_TextEncoding is used, and not an implicitly
downcast sal_Int32.
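
A rough sketch of why the private constructor catches the misuse,
again with hypothetical stand-in names (GuardedString, Int32, StrLen,
TextEncoding) rather than the real class. It assumes a C++11 compiler
so that std::is_constructible can report which calls would compile
without actually breaking the build.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the sal/rtl typedefs, not the real headers.
typedef std::int32_t  Int32;         // plays the role of sal_Int32
typedef std::uint16_t StrLen;        // plays the role of xub_StrLen
typedef std::uint16_t TextEncoding;  // plays the role of rtl_TextEncoding

class GuardedString {
private:
    // Declared but never defined: an exact match for (char*, Int32), so
    // overload resolution picks it for the misuse and the access check
    // then rejects the call.
    GuardedString(const char*, Int32);

public:
    GuardedString(const char*, TextEncoding, std::uint32_t = 0) {}
    GuardedString(const char*, StrLen, TextEncoding, std::uint32_t = 0) {}
};

// (char*, Int32) is what the two-argument macro produces: rejected.
const bool misuseCompiles =
    std::is_constructible<GuardedString, const char*, Int32>::value;

// (char*, real 16-bit encoding): the public ctor is an exact match and
// beats the private one, so correct callers are unaffected.
const bool realEncodingCompiles =
    std::is_constructible<GuardedString, const char*, TextEncoding>::value;

// (char*, Int32, encoding) is what the three-argument macro produces.
const bool threeArgCompiles =
    std::is_constructible<GuardedString, const char*, Int32,
                          TextEncoding>::value;
```

Under those assumptions misuseCompiles is false while the other two
are true, which is the behaviour the patch relies on.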
I've applied this patch, and fixed a big pile of incorrect code that
falls out of it. What I will have missed is code which is only compiled
for non-Linux platforms, e.g. MacOSX or Windows specific code. So if you
get a compile failure about something being private in the String class
on those platforms the fix is probably trivially changing a STRINGPARAM
to USTRINGPARAM, otherwise let me know and I'll have a look.
C.
diff --git a/tools/inc/tools/string.hxx b/tools/inc/tools/string.hxx
index cb6b455..25cb2d4 100644
--- a/tools/inc/tools/string.hxx
+++ b/tools/inc/tools/string.hxx
@@ -474,6 +474,8 @@ private:
void operator +=(int); // not implemented; to detect misuses
// of operator +=(sal_Unicode)
+ //detect and reject use of RTL_CONSTASCII_STRINGPARAM instead of RTL_CONSTASCII_USTRINGPARAM
+ TOOLS_DLLPRIVATE UniString( const sal_Char*, sal_Int32 );
public:
UniString();
UniString( const ResId& rResId );