Re: Use the wctype builtins functions
On Thu, Mar 11, 2010 at 10:46:42AM +0100, Paolo Bonzini wrote:
> On 03/05/2010 05:03 PM, Joseph S. Myers wrote:
>> I don't know if there's an existing free software implementation of UAX#14 (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be the very heavyweight approach.
>
> Yes. You can get it from gnulib like gdb does, or you can link libunistring (http://savannah.gnu.org/projects/libunistring). libunistring only supports UTF-{8,16,32} encodings though.

I don't think GDB actually does today. But here's a prototype: http://sourceware.org/ml/gdb-patches/2006-10/msg0.html

-- 
Daniel Jacobowitz
CodeSourcery
Re: Use the wctype builtins functions
On 03/05/2010 05:03 PM, Joseph S. Myers wrote:
> I don't know if there's an existing free software implementation of UAX#14 (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be the very heavyweight approach.

Yes. You can get it from gnulib like gdb does, or you can link libunistring (http://savannah.gnu.org/projects/libunistring). libunistring only supports UTF-{8,16,32} encodings though.

Paolo
Re: Use the wctype builtins functions
On 03/06/2010 12:03 AM, Joseph S. Myers wrote:
> On Fri, 5 Mar 2010, Ian Lance Taylor wrote:
>> Dave Korn dave.korn.cyg...@googlemail.com writes:
>>> I think you'll probably have to use plain old iswalpha. Looking at opts.c, I'm guessing you're trying to extend the help string format to allow unicode?
>>
>> Note that it may be OK to use iswalpha strictly on command line options, but using it anywhere else gets you into a set of issues around -finput-charset and -fexec-charset.
>
> The present issue is help text, as produced by gettext (which produces output in the locale's LC_CTYPE, calling iconv internally as needed). See my discussion at http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01074.html of the issues with line breaking given a string of multibyte characters, whose display width may also vary.
>
> I don't know if there's an existing free software implementation of UAX#14 (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be the very heavyweight approach. I also don't know if that algorithm would actually work well for the peculiarities of option help strings, not having studied the details of it.
>
> Hence the suggestion that the existing algorithm in opts.c could be reworked to check for L'-', L'/', L' ' and use iswalpha.

Thank you to all who replied. In the end I included wctype.h in intl.c and used the isw* functions strictly to handle the wide-character help string. The updated patch for the issue that Joseph mentioned is at http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00364.html. Any advice will be appreciated.

Thanks
Pearly
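For readers following along: gettext returns a multibyte string in the locale's encoding, so it has to be turned into wide characters before any isw* function can be applied. The following is only a minimal sketch of that step, not what the patch does. The helper name `to_wide` is hypothetical, and it assumes the POSIX behaviour that `mbstowcs` with a null destination returns the required length.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from the patch): convert the multibyte
   string MB, encoded per the current LC_CTYPE locale, into a newly
   allocated wide string.  Returns NULL on an invalid sequence or on
   allocation failure.  The caller frees the result.  */
static wchar_t *
to_wide (const char *mb)
{
  size_t n;
  wchar_t *w;

  assert (mb != NULL);

  /* Assumption: POSIX semantics, where a null destination makes
     mbstowcs return the number of wide characters needed.  */
  n = mbstowcs (NULL, mb, 0);
  if (n == (size_t) -1)
    return NULL;

  w = malloc ((n + 1) * sizeof *w);
  if (w == NULL)
    return NULL;

  /* Second pass does the real conversion, including the terminator.  */
  mbstowcs (w, mb, n + 1);
  return w;
}
```

The program would need to have called setlocale (LC_CTYPE, "") beforehand for non-ASCII text to convert according to the user's locale.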
Re: Use the wctype builtins functions
On 05/03/2010 02:32, Shujing Zhao wrote:
> Hi,
>
> I want to use the wctype builtins ISWALPHA and the other ISW* functions to handle the wide character string, but I get the following error:
>
> /home/gcc/build/gcc/../../trunk/gcc/opts.c:1190: undefined reference to `ISWALPHA'
> collect2: ld returned 1 exit status
>
> I have tried to grep some examples that use the ISW* builtins, but didn't find any. Does anyone know how to use them?

The capitalised versions of the IS* functions are macros from safe-ctype.h, not builtins, and safe-ctype.h hasn't been extended with ISW* versions because it's based on an array and it would need rather a large array to cope with wchars!

I think you'll probably have to use plain old iswalpha. Looking at opts.c, I'm guessing you're trying to extend the help string format to allow unicode?

cheers,
DaveK
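To illustrate the suggestion above, here is a minimal sketch of calling the standard iswalpha from <wctype.h> directly instead of an ISW* macro. The helper name `leading_alpha_len` is made up for this example and does not come from opts.c or safe-ctype.h.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: return how many wide characters at the start
   of S are alphabetic according to iswalpha.  Unlike the safe-ctype.h
   macros, iswalpha depends on the current LC_CTYPE locale.  */
static size_t
leading_alpha_len (const wchar_t *s)
{
  size_t n = 0;

  assert (s != NULL);

  /* iswalpha takes a wint_t argument.  */
  while (s[n] != L'\0' && iswalpha ((wint_t) s[n]))
    n++;
  return n;
}
```

For example, leading_alpha_len (L"abc-def") stops at the hyphen and returns 3.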
Re: Use the wctype builtins functions
Dave Korn dave.korn.cyg...@googlemail.com writes:
> I think you'll probably have to use plain old iswalpha. Looking at opts.c, I'm guessing you're trying to extend the help string format to allow unicode?

Note that it may be OK to use iswalpha strictly on command line options, but using it anywhere else gets you into a set of issues around -finput-charset and -fexec-charset.

Ian
Re: Use the wctype builtins functions
On Fri, 5 Mar 2010, Ian Lance Taylor wrote:
> Dave Korn dave.korn.cyg...@googlemail.com writes:
>> I think you'll probably have to use plain old iswalpha. Looking at opts.c, I'm guessing you're trying to extend the help string format to allow unicode?
>
> Note that it may be OK to use iswalpha strictly on command line options, but using it anywhere else gets you into a set of issues around -finput-charset and -fexec-charset.

The present issue is help text, as produced by gettext (which produces output in the locale's LC_CTYPE, calling iconv internally as needed). See my discussion at http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01074.html of the issues with line breaking given a string of multibyte characters, whose display width may also vary.

I don't know if there's an existing free software implementation of UAX#14 (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be the very heavyweight approach. I also don't know if that algorithm would actually work well for the peculiarities of option help strings, not having studied the details of it.

Hence the suggestion that the existing algorithm in opts.c could be reworked to check for L'-', L'/', L' ' and use iswalpha.

-- 
Joseph S. Myers
jos...@codesourcery.com
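As a rough illustration of that last suggestion, a break-point search over a wide string might look like the sketch below. This is not the code in opts.c nor the submitted patch. The function name `find_break` and its exact break rules are assumptions made for the example, and it counts every wide character as one display column, which is precisely the simplification that varying display widths would invalidate.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical sketch: return how many wide characters of HELP to
   print on a line with ROOM columns.  A space is a break point (the
   line ends before it); a L'-' or L'/' that follows a letter and is
   not followed by a space is a break point too (the line ends after
   it).  Assumes one column per wide character.  */
static size_t
find_break (const wchar_t *help, size_t room)
{
  size_t total, len, i;

  assert (help != NULL);

  total = wcslen (help);
  if (total <= room)
    return total;               /* Everything fits on this line.  */

  len = total;                  /* Fallback: no break point found.  */
  for (i = 0; help[i] != L'\0'; i++)
    {
      /* Stop at the limit once some break point has been seen.  */
      if (i >= room && len != total)
        break;
      if (help[i] == L' ')
        len = i;
      else if ((help[i] == L'-' || help[i] == L'/')
               && help[i + 1] != L' '
               && i > 0 && iswalpha ((wint_t) help[i - 1]))
        len = i + 1;
    }
  return len;
}
```

With L"enable foo-bar" and 12 columns this picks the position just after the hyphen; with 8 columns it falls back to the space. If no break point lies within the limit, the first one after it is used, so the line may run over.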
Use the wctype builtins functions
Hi,

I want to use the wctype builtins ISWALPHA and the other ISW* functions to handle the wide character string, but I get the following error:

/home/gcc/build/gcc/../../trunk/gcc/opts.c:1190: undefined reference to `ISWALPHA'
collect2: ld returned 1 exit status

I have tried to grep some examples that use the ISW* builtins, but didn't find any. Does anyone know how to use them?

Thanks
Pearly