On 02/09/2012 06:16 PM, Lubos Lunak wrote:
Attached is a patch that removes the need for RTL_CONSTASCII_USTRINGPARAM (henceafter called 'macro', because it annoys me to write this even in its death warrant) when using the OUString ctor. Assuming this code compiles also with MSVC[*], the result should be at least as good as when using the macro, except that it is not needed to use the macro.
We tried this back in the day, like maybe ca. 2000, and it did not work back then. Can't remember the details---hopefully it was just old MSVC that failed to get it and the MSVCs we rely on today have become better.
- OString is not included in the patch, as that one does have "const char*" ctor. Since it is inline, it actually does not matter for binary compatibility, so I can change this, but I didn't want to do this now before I get some feedback. It should be later changed as well.
I would assume there's a large number of implicit and explicit uses of OString(char const *) with non-literal arguments.
- I added an explicit check that when converting from an 8-bit string there is no \0 inside the string (i.e. that the length really matches), but the check actually triggers a couple of times. One case is editeng's ImpEditEngine ctor, where I'd expect the problem is real and somebody thought "\0xff" was "\xff", but I don't know for sure. Other case is sal's test_oustring_endswith.cxx , which intentionally creates OStrings with \0's in them and even operates on them, which suggests that OString in fact does not stop at \0's when handling strings. I'm kind of baffled by this, I personally consider it a flaw that a string class does this, but the unittest is very explicit about this. Does somebody know?
I'm kind of baffled by string classes that treat NUL specially. :) I consider the K&R approach of misusing one of the legitimate member values as sentinel as a bad mistake.
I however do not see anything wrong with automatic string literal to OUString conversions. UI strings, AFAICS, are not included in .cxx files themselves, but come from elsewhere, so sources contain only technical strings. Therefore it is a very fair assumption that all string literals are ASCII, since anything outside of ASCII would presumably need translation anyway, and the few legal cases (math symbols not in ASCII maybe) can simply be explicit about it (there will be a runtime warning if this is not the case, and since it is a string literal, it should be noticed by the developer).
We could change the ctor from using _ASCII_US to _UTF8. That way, the occasional non-ASCII literal could be included as \xXX\xXX\xXX. This is IMO orthogonal to the question of explicit-ness.
Moreover in fact the explicit OUString doesn't really buy us anything, as nobody says foo's arguments or 'str' really are OUString, so this is about as good as Hungarian notation.
Don't understand the above paragraph.
So while I can be talked into adding the explicit, I can't find any good reason for doing that.
Apart from the argument given by Caolán (which suggests we'd probably need to go with explicit, anyway), my gut feeling is that it would be better to keep things explicit here. There's already too many function overloads and implicit conversions around wrt to strings to not get the bad feeling that each additional one will subtly change some glorious code somewhere.
+ /** + * This overload exists only to avoid creating instances directly from (non-const) char[], + * which would otherwise be picked up by the optimized const char[] constructor. + * Since the non-const array cannot be guaranteed to contain only ASCII characters, + * this needs to be prevented. + * + * It is an error to try to call this overload. + * + * @internal + */ + template< int N >
nit: having to pick among int, std::size_t, sal_Int32 for the type of N, I would probably use std::size_t
+ OUString( char (&value)[ N ] ) + { +#ifndef RTL_STRING_UNITTEST + error_non_const_char_to_OUString_conversion_is_not_automatic( value ); +#else + (void) value; // unused + pData = 0; // for the unittest create an empty string + rtl_uString_new( &pData ); +#endif + }
I would turn this into just a declaration, with missing definition, for the non-RTL_STRING_UNITTEST case. (Unless we make the other ctor explicit, in which case I would think this one can go away completely.)
Stephan _______________________________________________ LibreOffice mailing list LibreOffice@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/libreoffice