Hi,

In OGo, I convert a UTF-8 string to lowercase using
-[NSString lowercaseString].

When the string contains umlauts, GNUstep simply omits those characters.
I have no idea whether this is actually right or wrong.

With the patch to GSString below, the characters are no longer omitted.


gcc -fgnu-runtime -fconstant-string-class=NSConstantString \
    -I/usr/local/include -L/usr/local/lib -lgnustep-base \
    lowercase.m -o lowercase

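cat lowercase.m
#import <Foundation/Foundation.h>

int main(int argc, char *argv[]) {
        /* A pool is needed for the autoreleased strings created below. */
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        NSLog(@"Lowercase: %@",
              [[NSString stringWithString:@"Töst"] lowercaseString]);
        [pool release];
        return 0;
}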



Does running the program above on a Mac output the ö, or omit it from the string?

Does the result change when running with LC_CTYPE="C" or LC_CTYPE='de_DE.UTF-8'?

I don't have a Mac, so I cannot test this myself. The approach used by OGo
may also be wrong.
The Apple docs at least say nothing about skipped characters, only that,
e.g., a ß may change to SS when using uppercaseString.
Since they mention the ß in the documentation, I'd expect lowercaseString
to handle umlauts too. Or is that just a plain wrong assumption?

Could someone hit me with a cluestick, please? ;)

cheers,
Sebastian

The patch to avoid omitting umlauts:
$OpenBSD$
--- Source/GSString.m.orig      Tue Jul 31 18:31:36 2012
+++ Source/GSString.m   Tue Jul 31 18:32:24 2012
@@ -3699,6 +3700,8 @@ agree, create a new GSCInlineString otherwise.
   while (i-- > 0)
     {
       o->_contents.c[i] = tolower(_contents.c[i]);
+      if (o->_contents.c[i] == 0)
+       o->_contents.c[i] = _contents.c[i];
     }
   o->_flags.wide = 0;
   o->_flags.owned = 1; // Ignored on dealloc, but means we own buffer
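
To show the idea of the patch outside of GNUstep, here is a minimal C sketch.
The helper name lower_bytes_keep_unmapped is hypothetical and not part of
GSString.m. It assumes, as the patch does, that a tolower() result of 0 means
the byte had no usable mapping and should be copied through unchanged.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not GNUstep code: lowercase a byte buffer with
 * tolower(), but keep the original byte whenever tolower() returns 0.
 * This mirrors the two lines added by the patch.
 */
static void
lower_bytes_keep_unmapped(const unsigned char *src, unsigned char *dst,
                          size_t len)
{
  size_t i = len;

  while (i-- > 0)
    {
      /* src[i] is an unsigned char, so it is a valid tolower() argument. */
      int lowered = tolower(src[i]);

      /* Assumption: 0 signals "no mapping", so fall back to the input. */
      dst[i] = (lowered == 0) ? src[i] : (unsigned char)lowered;
    }
}
```

Called on the UTF-8 bytes of "TöST" in the default C locale, this should
lowercase only the ASCII letters and leave the two bytes of the ö
(0xC3 0xB6) as they are. Note that this only preserves the bytes; it does
not lowercase non-ASCII characters such as Ö.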

