Eike Rathke wrote:
With the current setup where moving past the beginning or end of the
string is undefined behavior, is there any use for
postIncrementCodePoints outside [-1 .. 1]?
There may be in scenarios like "next I'll be interested in the character
after the next", so postIncrementCodePoints would be 2.
My point was that you can only safely make that call if you know that
there are at least two more code points after the current index, which
in general you can only know if you inspect the "surrogate structure" of
the OUString at the sal_Unicode level (which iterateCodePoints should
shield you from).
True. So, then I assume we don't need other postincrement values.
Think so too. Anyway, having the more general case available (even if
probably not of much use) does not really hurt, so I will leave that in.
Or would there be legitimate
use cases for rtl_uString_iterateCodePoints to stop moving past the
beginning/end of the string when postIncrementCodePoints is too large?
I think it should stop if it is called with indexUtf16 being "outside"
the string, or resulting in such a value, so -1 and length would be the
min/max resulting values. Also,
Why -1 instead of 0?
I thought of -1 signalling an end condition in reverse iteration, as
does 'length' in forward iteration, both point "outside" the string and
would follow the general [...[ inclusive/exclusive approach.
But what should
sal_Int32 i = -1;
s.iterateCodePoints(&i, 1);
mean then? Pseudo-iterate forward to i == 0?
But you are right, reverse-iterating code does look more awkward. Would
it help if postIncrementCodePoints actually acted as
preIncrementCodePoints if it is negative? Is not that what we want? Or
is it too confusing?
sal_Int32 i = s.getLength();
while (i != 0) {
sal_uInt32 c = s.iterateCodePoints(&i, -1);
}
would then neatly reverse-iterate through any string, and we would get
rid of the ugly SAL_MAX_UINT32 special-case return value.
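To make the proposal concrete, here is a minimal standalone sketch of the "negative value acts as pre-decrement" idea. It is not the actual sal implementation: it uses std::u16string instead of OUString so that it compiles without the sal headers, and the free function iterateCodePoints, the helpers isHigh, isLow and codePointAt, and the driver reverseCodePoints are all hypothetical names chosen for illustration. Moving past either end is only guarded by assert, in keeping with the "undefined behavior" setup discussed above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Surrogate tests on single UTF-16 code units.
inline bool isHigh(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLow(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Decode the code point that starts at UTF-16 index i (precondition: i < size).
inline std::uint32_t codePointAt(const std::u16string& s, std::int32_t i) {
    const std::int32_t len = static_cast<std::int32_t>(s.size());
    assert(i >= 0 && i < len);
    const std::uint32_t hi = s[i];
    if (isHigh(s[i]) && i + 1 < len && isLow(s[i + 1])) {
        const std::uint32_t lo = s[i + 1];
        return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
    }
    return hi; // BMP character or unpaired surrogate, returned as is
}

// Assumed semantics (hypothetical, following the proposal in this mail):
//   n >= 0: read the code point at *index, then advance by n code points;
//   n <  0: first step back by -n code points, then read the code point there.
inline std::uint32_t iterateCodePoints(const std::u16string& s,
                                       std::int32_t* index, std::int32_t n) {
    const std::int32_t len = static_cast<std::int32_t>(s.size());
    std::int32_t i = *index;
    assert(i >= 0 && i <= len);
    std::uint32_t cp = 0;
    if (n < 0) {
        for (; n < 0; ++n) { // pre-decrement: move first
            assert(i > 0);
            i -= (i >= 2 && isLow(s[i - 1]) && isHigh(s[i - 2])) ? 2 : 1;
        }
        cp = codePointAt(s, i);
    } else {
        cp = codePointAt(s, i); // post-increment: read first
        for (; n > 0; --n) {
            assert(i < len);
            i += (isHigh(s[i]) && i + 1 < len && isLow(s[i + 1])) ? 2 : 1;
        }
    }
    *index = i;
    return cp;
}

// The reverse loop from the mail, collecting the code points it visits.
inline std::vector<std::uint32_t> reverseCodePoints(const std::u16string& s) {
    std::vector<std::uint32_t> out;
    std::int32_t i = static_cast<std::int32_t>(s.size());
    while (i != 0) {
        out.push_back(iterateCodePoints(s, &i, -1));
    }
    return out;
}
```

With these assumed semantics the loop needs no special return value and is simply skipped for an empty string, since i starts at 0.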
-Stephan
A forward loop would look like
for(i=0; i<s.getLength(); )
{
c = s.iterateCodePoints( &i, +1);
}
A similar reverse loop
for(i=s.getLength(), s.iterateCodePoints( &i, -1); i>=0; )
{
c = s.iterateCodePoints( &i, -1);
}
would not work if 0 was the smallest indexUtf16 value returned in i, one
would have to insert an if(i==0)break; condition at the end of the loop,
quite ugly. Furthermore, the length would have to be checked in advance as
well so as not to enter the loop with an empty string. Altogether nasty, I'd say.
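For comparison, a rough standalone sketch of the alternative described here: always post-increment, with the resulting index clamped to [-1, length]. Again this is not the sal code. It uses std::u16string, and the names iterateClamped, kNoCodePoint and reverseWithSentinel are made up for the example. Two points are assumptions on my side: a call with the index at -1 or length returns a "no code point" marker (standing in for the SAL_MAX_UINT32 special case), and stepping forward from -1 lands on index 0.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Assumed stand-in for the SAL_MAX_UINT32 special-case return value.
constexpr std::uint32_t kNoCodePoint = 0xFFFFFFFF;

inline bool isHigh(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLow(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical post-increment variant: read the code point at *index (or
// return kNoCodePoint if *index is -1 or length), then move by n code
// points, never leaving the range [-1, length].
inline std::uint32_t iterateClamped(const std::u16string& s,
                                    std::int32_t* index, std::int32_t n) {
    const std::int32_t len = static_cast<std::int32_t>(s.size());
    std::int32_t i = *index;
    std::uint32_t cp = kNoCodePoint;
    if (i >= 0 && i < len) {
        cp = s[i];
        if (isHigh(s[i]) && i + 1 < len && isLow(s[i + 1])) {
            const std::uint32_t lo = s[i + 1];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        }
    }
    for (; n > 0 && i < len; --n) { // forward, stops at length
        if (i < 0) {
            i = 0; // assumption: one step forward from -1 reaches index 0
        } else {
            i += (isHigh(s[i]) && i + 1 < len && isLow(s[i + 1])) ? 2 : 1;
        }
    }
    for (; n < 0 && i >= 0; ++n) { // backward, stops at -1
        if (i == 0) {
            i = -1;
        } else {
            i -= (i >= 2 && isLow(s[i - 1]) && isHigh(s[i - 2])) ? 2 : 1;
        }
    }
    *index = i;
    return cp;
}

// The reverse loop from above: one priming call to step from 'length' onto
// the last code point, then iterate until the index reaches -1.
inline std::vector<std::uint32_t> reverseWithSentinel(const std::u16string& s) {
    std::vector<std::uint32_t> out;
    std::int32_t i = static_cast<std::int32_t>(s.size());
    iterateClamped(s, &i, -1);
    while (i >= 0) {
        out.push_back(iterateClamped(s, &i, -1));
    }
    return out;
}
```

In this variant an empty string needs no separate length check: the priming call moves the index from 0 to -1 and the loop body never runs.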
Eike