Re: Use the wctype builtins functions

2010-03-12 Thread Daniel Jacobowitz
On Thu, Mar 11, 2010 at 10:46:42AM +0100, Paolo Bonzini wrote:
> On 03/05/2010 05:03 PM, Joseph S. Myers wrote:
> > I don't know if there's an existing free software implementation of UAX#14
> > (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be
> > the very heavyweight approach.
>
> Yes.  You can get it from gnulib like gdb does, or you can link
> libunistring (http://savannah.gnu.org/projects/libunistring).
> libunistring only supports UTF-{8,16,32} encodings though.

I don't think GDB actually does today.  But here's a prototype:

http://sourceware.org/ml/gdb-patches/2006-10/msg0.html

-- 
Daniel Jacobowitz
CodeSourcery


Re: Use the wctype builtins functions

2010-03-11 Thread Paolo Bonzini

On 03/05/2010 05:03 PM, Joseph S. Myers wrote:

> I don't know if there's an existing free software implementation of UAX#14
> (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be
> the very heavyweight approach.


Yes.  You can get it from gnulib like gdb does, or you can link 
libunistring (http://savannah.gnu.org/projects/libunistring). 
libunistring only supports UTF-{8,16,32} encodings though.


Paolo


Re: Use the wctype builtins functions

2010-03-10 Thread Shujing Zhao

On 03/06/2010 12:03 AM, Joseph S. Myers wrote:

> On Fri, 5 Mar 2010, Ian Lance Taylor wrote:
>
> > Dave Korn dave.korn.cyg...@googlemail.com writes:
> >
> > >   I think you'll probably have to use plain old iswalpha.  Looking at opts.c,
> > > I'm guessing you're trying to extend the help string format to allow unicode?
> >
> > Note that it may be OK to use iswalpha strictly on command line
> > options, but using it anywhere else gets you into a set of issues
> > around -finput-charset and -fexec-charset.
>
> The present issue is help text, as produced by gettext (which produces
> output in the locale's LC_CTYPE, calling iconv internally as needed).  See
> my discussion at http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01074.html
> of the issues with line breaking given a string of multibyte characters,
> whose display width may also vary.
>
> I don't know if there's an existing free software implementation of UAX#14
> (Unicode Line Breaking Algorithm) suitable for use in GCC; that would be
> the very heavyweight approach.  I also don't know if that algorithm would
> actually work well for the peculiarities of option help strings, not
> having studied the details of it.  Hence the suggestion that the existing
> algorithm in opts.c could be reworked to check for L'-', L'/', L' ' and
> use iswalpha.


Thanks to everyone who replied. In the end I included wctype.h in intl.c and
use the isw* functions strictly to handle the wide-character help string.
The updated patch for the issue that Joseph mentioned is at
http://gcc.gnu.org/ml/gcc-patches/2010-03/msg00364.html. Any advice would be
appreciated.


Thanks
Pearly
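
For readers following along: the isw* functions take wide characters, so the multibyte help text that gettext returns has to be converted first. The thread does not say how the patch does this; the following is only a minimal sketch, assuming the standard mbstowcs() is used for the conversion. The helper name help_to_wide is hypothetical, and calling mbstowcs() with a NULL destination to get the length is POSIX behaviour rather than something every C library guarantees.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not taken from the patch): convert a multibyte
   string in the current locale's encoding into a newly allocated wide
   string.  Returns NULL on an invalid multibyte sequence or on
   allocation failure; the caller frees the result.  */
static wchar_t *
help_to_wide (const char *mb)
{
  assert (mb != NULL);

  /* With a NULL destination, POSIX mbstowcs() returns the number of
     wide characters needed, not counting the terminating L'\0'.  */
  size_t n = mbstowcs (NULL, mb, 0);
  if (n == (size_t) -1)
    return NULL;

  wchar_t *w = malloc ((n + 1) * sizeof *w);
  if (w == NULL)
    return NULL;

  /* Second pass does the actual conversion, including the terminator.  */
  mbstowcs (w, mb, n + 1);
  return w;
}
```

A real program would normally have called setlocale() beforehand so that the conversion follows the user's LC_CTYPE; in the default "C" locale only plain ASCII is guaranteed to convert.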



Re: Use the wctype builtins functions

2010-03-05 Thread Dave Korn
On 05/03/2010 02:32, Shujing Zhao wrote:
> Hi,
>
> I want to use the wctype builtins ISWALPHA and the other ISW*
> functions to handle the wide character string, but I get the following
> error:
>
> /home/gcc/build/gcc/../../trunk/gcc/opts.c:1190: undefined reference to
> `ISWALPHA'
> collect2: ld returned 1 exit status
>
> I have tried to grep some examples that use the ISW* builtins, but
> didn't find any. Does anyone know how to use them?

  The capitalised versions of the IS* functions are macros from safe-ctype.h,
not builtins, and it hasn't been extended with ISW* versions because it's
based on an array and it would need rather a large array to cope with wchars!

  I think you'll probably have to use plain old iswalpha.  Looking at opts.c,
I'm guessing you're trying to extend the help string format to allow unicode?

cheers,
  DaveK
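
A minimal sketch of what Dave suggests, for illustration only: iswalpha is an ordinary function declared in <wctype.h>, so it links without any macro support from safe-ctype.h. The helper name count_wide_alpha is hypothetical and is not from opts.c.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the alphabetic characters in a
   NUL-terminated wide string using the standard iswalpha().  */
static size_t
count_wide_alpha (const wchar_t *s)
{
  size_t n = 0;

  assert (s != NULL);
  for (; *s != L'\0'; s++)
    if (iswalpha ((wint_t) *s))   /* classification follows LC_CTYPE */
      n++;
  return n;
}
```

Unlike the table-driven ISALPHA macro, the result of iswalpha depends on the current locale, which is the source of the caveats raised elsewhere in this thread.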


Re: Use the wctype builtins functions

2010-03-05 Thread Ian Lance Taylor
Dave Korn dave.korn.cyg...@googlemail.com writes:

>   I think you'll probably have to use plain old iswalpha.  Looking at opts.c,
> I'm guessing you're trying to extend the help string format to allow unicode?

Note that it may be OK to use iswalpha strictly on command line
options, but using it anywhere else gets you into a set of issues
around -finput-charset and -fexec-charset.

Ian


Re: Use the wctype builtins functions

2010-03-05 Thread Joseph S. Myers
On Fri, 5 Mar 2010, Ian Lance Taylor wrote:

> Dave Korn dave.korn.cyg...@googlemail.com writes:
>
> >   I think you'll probably have to use plain old iswalpha.  Looking at opts.c,
> > I'm guessing you're trying to extend the help string format to allow unicode?
>
> Note that it may be OK to use iswalpha strictly on command line
> options, but using it anywhere else gets you into a set of issues
> around -finput-charset and -fexec-charset.

The present issue is help text, as produced by gettext (which produces 
output in the locale's LC_CTYPE, calling iconv internally as needed).  See 
my discussion at http://gcc.gnu.org/ml/gcc-patches/2010-02/msg01074.html 
of the issues with line breaking given a string of multibyte characters, 
whose display width may also vary.

I don't know if there's an existing free software implementation of UAX#14 
(Unicode Line Breaking Algorithm) suitable for use in GCC; that would be 
the very heavyweight approach.  I also don't know if that algorithm would 
actually work well for the peculiarities of option help strings, not 
having studied the details of it.  Hence the suggestion that the existing 
algorithm in opts.c could be reworked to check for L'-', L'/', L' ' and 
use iswalpha.

-- 
Joseph S. Myers
jos...@codesourcery.com
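
The reworking Joseph describes could look roughly like the sketch below. This is not the actual opts.c code or the submitted patch; the function names are hypothetical, and measuring display columns with the POSIX wcwidth() is an assumption made here to account for characters whose width varies. It breaks before a space, or after a '-' or '/' that follows an alphabetic character and is not followed by a space.

```c
#define _XOPEN_SOURCE 700   /* assumption: needed for wcwidth() on POSIX */
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: display columns used by C; characters for which
   wcwidth() reports no width (or -1) are counted as 0.  */
static int
char_columns (wchar_t c)
{
  int w = wcwidth (c);
  return w > 0 ? w : 0;
}

/* Hypothetical helper: return how many wide characters of S (LEN
   characters long) to put on the current line when ROOM display columns
   are available.  Returns LEN if the text fits or no break is found.  */
static size_t
find_wide_break (const wchar_t *s, size_t len, int room)
{
  size_t best = len;   /* break position chosen so far */
  int found = 0;       /* nonzero once a break opportunity was seen */
  int col = 0;         /* columns consumed before s[i] */
  int total = 0;

  assert (s != NULL);

  /* If everything fits, there is nothing to break.  */
  for (size_t i = 0; i < len; i++)
    total += char_columns (s[i]);
  if (total <= room)
    return len;

  for (size_t i = 0; i < len; i++)
    {
      /* Stop once the line is full and a break point is known.  */
      if (col >= room && found)
        break;
      if (s[i] == L' ')
        {
          best = i;          /* break before the space */
          found = 1;
        }
      else if ((s[i] == L'-' || s[i] == L'/')
               && i + 1 < len && s[i + 1] != L' '
               && i > 0 && iswalpha ((wint_t) s[i - 1]))
        {
          best = i + 1;      /* break just after the separator */
          found = 1;
        }
      col += char_columns (s[i]);
    }
  return best;
}
```

Whether these rules suit every translated help string is exactly the open question in the message above; a full UAX#14 implementation would handle many more cases.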


Use the wctype builtins functions

2010-03-04 Thread Shujing Zhao

Hi,

I want to use the wctype builtins ISWALPHA and the other ISW* functions to
handle the wide character string, but I get the following error:


/home/gcc/build/gcc/../../trunk/gcc/opts.c:1190: undefined reference to 
`ISWALPHA'
collect2: ld returned 1 exit status

I have tried to grep some examples that use the ISW* builtins, but didn't find
any. Does anyone know how to use them?


Thanks
Pearly