Michael Meeks wrote:
- rtl_uString_internConvert should signal an error if conversion fails (RTL_TEXTTOUNICODE_FLAGS_xxx_ERROR).

        I was following the pattern here from rtl_string2UString are you
certain you want (inconsistent?) error handling for these methods ? easy
enough to add if so.

The rtl_string2UString direction (OString -> OUString) is not that problematic, as usually there should not be translation problems. However, OUString -> OString can very well run into problems of missing characters in the target encoding. That's why I had to introduce error-signalling rtl_convertUStringToString.

sal/rtl/source/strimp.h:
- SAL_INTERLOCK_BITS (sizeof(oslInterlockedCount)*8) instead of (sizeof(sal_Int32)*8), the type of refCount, got me confused.

        We can have:

#define SAL_INTERLOCK_BITS (sizeof(((rtl_uString *)(NULL))->refCount) * 8)

        If that is clearer for you. Incidentally - why is the refCount not of
type oslInterlockedCount - when we assume it is all over the place ?
[ would you accept a correction to the [u]string structures to use the
oslInterlockedCount type instead ?

That jumbo is probably more confusing than anything else. ;) I do not know why we have the inconsistency in the first place (most probably because somebody was not coding carefully), and I only wanted to express that it confused me, but either oslInterlockedCount or sal_Int32 in SAL_INTERLOCK_BITS is probably just fine in the end. [And yes, a correction would not hurt, either.]

sal/rtl/source/ustring.c:1.24.38.3:
- sal_Bool might be better than int CAN[NOT]_RETURN.

        as you like - since it deals with lifecycle, I find this clearer to
understand where the ownership is being transferred - but there may be
better names for the defines ?

Or an enum?

- Why SAL_STRING_INTERN_FLAG? Why not change rtl_uString_intern_internal so that it works correctly on already interned strings.

        Apart from the freeing performance - it also makes 'interning' a very
cheap operation if it is already interned; since I want to do this for
eg. for every string in a calc cell, it's nice if we can avoid a hash
lookup wherever possible.

That argument alone would sound a bit like premature optimization. ;) But, as you already pointed out, it is necessary anyway.

Also, what just appeared to me, INTERN_FLAG helps avoid problems: While rtl::O[U]String are effectively non-mutable, tools' {Byte|Uni}String have mutating functions like SetChar which do COW. Fortunately, that COW is done based on refCount == 1, so that it cannot happen that an interned (but only once referenced, refCount == INTERN_FLAG|1) string is mutated and would thus end up in a potentially wrong hash bucket. (What is tricky is that refCount from rtl_[u]String is called mnRefCount in {Byte|Uni}StringData. Grep for "mnRefCount == 1" in tools strimp.cxx/strascii.cxx. Why did nobody clean up and remove {Byte|Uni}StringData long ago? Grr...)

        I'm just resynching the CWS to m202, and I'm have a few touches around
the source tree that drastically reduce string duplication :-)
[ those changes are of course rather small, mostly a load of '.intern()'
calls ]; hopefully non-controversial (some accidentally reverse diffs
FWIW ;-) http://go-oo.org/~michael/size-intern.diff  ( are you ok with
the 'Tools' intern call there too ? ).

(yes, sure)

        Of course, I'd like to avoid adding vcl, psprint, framework, configmgr,
sc to the CWS until the very last minute (to avoid resynching pain), so
I'll wait until you're happy :-)

I'm happy, go ahead.  :)

-Stephan

        Thanks,

                Michael.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to