RE: [Mono-list] conversions

Polton, Richard (IT) Wed, 06 Oct 2004 01:00:09 -0700

Thanks for this, although it raises another question :-)

If the char to be converted is U+0661, say, then what will the value of
the subtraction be? Will it be 0661 - 0660 or will it be 0661 - 0030? I
assume that a literal '0' will always map to 0030 rather than cleverly
detecting the range of digits that the char belongs to.


Richard 
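
For concreteness, here is a minimal C# sketch of the two candidate
answers, assuming the conversion subtracts the literal '0' the way the
quoted Val() does. The class and method names are made up for
illustration; char.IsDigit and char.GetNumericValue are the standard
System.Char methods.

```csharp
using System;

static class DigitDemo
{
    // Hypothetical stand-in for the quoted Val(): always subtracts ASCII '0'.
    public static int NaiveVal(char c)
    {
        if (!char.IsDigit(c))
            throw new ArgumentException("not a decimal digit");
        return c - '0';
    }

    // Alternative that asks the runtime for the digit's numeric value.
    public static int NumericVal(char c)
    {
        if (!char.IsDigit(c))
            throw new ArgumentException("not a decimal digit");
        return (int)char.GetNumericValue(c);
    }

    static void Main()
    {
        char arabicIndicOne = '\u0661';
        // 0x0661 - 0x0030 = 0x0631, i.e. 1585 rather than 1.
        Console.WriteLine(NaiveVal(arabicIndicOne));
        // The numeric value of ARABIC-INDIC DIGIT ONE is 1.
        Console.WriteLine(NumericVal(arabicIndicOne));
    }
}
```

So the literal '0' is simply U+0030, and nothing in the subtraction
itself looks at which digit range the char came from.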

-----Original Message-----
From: Jonathan Pryor [mailto:[EMAIL PROTECTED] 
Sent: 05 October 2004 23:53
To: Polton, Richard (IT)
Cc: Jambunathan Jambunathan; [EMAIL PROTECTED]
Subject: RE: [Mono-list] conversions

A quick perusal through Perl's "Category.pl" shows this:

(1) Decimal digits are categorized as "Nd" (Number, Decimal Digit)
(2) The only ranges that are "Nd" seem to be:

        0030 - 0039     '0' - '9'
        0660 - 0669     ARABIC-INDIC DIGIT 0 - 9 (same order as ASCII)
        06F0 - 06F9     EXTENDED ARABIC-INDIC DIGIT 0-9 ("")
        0966 - 096F     DEVANAGARI DIGIT 0-9
        09E6 - 09EF     BENGALI DIGIT 0-9
        0A66 - 0A6F
        0AE6 - 0AEF
        0B66 - 0B6F
        0BE7 - 0BEF
        0C66 - 0C6F
        0CE6 - 0CEF
        0D66 - 0D6F
        0E50 - 0E59
        0ED0 - 0ED9
        0F20 - 0F29
        ... Plus 8 more...

I'm too lazy to look at all of these ranges, but the ones I did look at
all had digits in the order 0..9.  The subtraction should be legal for
all of these glyphs.  (Which is probably by design; it would be very odd
-- broken? -- to have so many digits in the "right" order, and then have
a few in a different order...)
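
The ordering claim can be checked mechanically instead of by eye. The
following is a rough C# sketch, not taken from Mono; the helper name
IsInOrder is hypothetical. It assumes that "in order" means each
non-zero digit sits directly after the next-lower digit. What it reports
depends on the Unicode tables shipped with the runtime it runs on.

```csharp
using System;

static class DigitOrderCheck
{
    // True when c is a decimal digit and, unless it is a zero, the
    // character just before it is the next-lower digit.
    public static bool IsInOrder(char c)
    {
        if (!char.IsDigit(c))
            return false;
        int value = (int)char.GetNumericValue(c);
        if (value == 0)
            return true; // start of a run, nothing to compare against
        char prev = (char)(c - 1);
        return char.IsDigit(prev)
            && (int)char.GetNumericValue(prev) == value - 1;
    }

    static void Main()
    {
        // Report every 16-bit decimal digit that breaks the 0..9 ordering.
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            if (char.IsDigit(c) && !IsInOrder(c))
                Console.WriteLine("out of order: U+" + i.ToString("X4"));
        }
    }
}
```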

Gnome's Character Map program (gucharmap) is very handy for looking up
the Unicode Category a character belongs to.  Too bad the opposite
direction (Unicode Category -> characters) tends to be more difficult
(hence consulting Perl's internal tables).
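
The category -> characters direction can also be brute-forced from C#
by walking every 16-bit char value and asking for its category. This is
only a sketch: the class and method names are invented here, it covers
the 16-bit range only, and the ranges it finds depend on the Unicode
tables of the runtime in use.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class NdRanges
{
    // Collects contiguous runs of "Nd" characters as "XXXX - XXXX" strings.
    public static List<string> Find()
    {
        var runs = new List<string>();
        int start = -1; // -1 means "not currently inside a run"
        // Go one past char.MaxValue so a run ending at U+FFFF is closed.
        for (int i = 0; i <= char.MaxValue + 1; i++)
        {
            bool isNd = i <= char.MaxValue
                && char.GetUnicodeCategory((char)i)
                   == UnicodeCategory.DecimalDigitNumber;
            if (isNd && start < 0)
            {
                start = i;
            }
            else if (!isNd && start >= 0)
            {
                runs.Add(start.ToString("X4") + " - " + (i - 1).ToString("X4"));
                start = -1;
            }
        }
        return runs;
    }

    static void Main()
    {
        foreach (string run in Find())
            Console.WriteLine(run);
    }
}
```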

 - Jon

On Tue, 2004-10-05 at 07:31, Polton, Richard (IT) wrote:
> Thanks for this. Is it fair to say, then, that only Arabic numerals 
> are counted as digits?  Even though other numeric characters have 
> integer values?
> 
> -----Original Message-----
> From: Jonathan Pryor [mailto:[EMAIL PROTECTED]
> Sent: 05 October 2004 11:32
> To: Polton, Richard (IT)
> Cc: Jambunathan Jambunathan; [EMAIL PROTECTED]
> Subject: RE: [Mono-list] conversions
> 
> On Tue, 2004-10-05 at 04:34, Polton, Richard (IT) wrote:
> > In fact, having given it further thought, I have a couple of
> > questions:
> > 
> > i) if I sit at a Japanese terminal (for example) and enter '-', i.e.
> > ichi or 'one', is this a valid Unicode character?
> 
> Yes.
> 
> > ii) how wide is the 'char' datatype? I assume it contains Unicode 
> > rather than single-byte ASCII.
> 
> 16-bit unsigned value holding one UTF-16 code unit.  It supports Unicode.
> 
> > iii) if entering 'ichi' is valid, and char contains Unicode, then I 
> > suspect that the below subtraction will return a number substantially
> > greater than one.
> 
> No.  At least, not if it's remotely like CVS HEAD:
> 
>       public static int Val (char Expression) {
>               if (char.IsDigit(Expression)) {
>                       return Expression - '0';
>               }
>               else {
>                       throw new ArgumentException();
>               }
>       }
> 
> Ichi isn't a digit, so it will generate an ArgumentException.
> 
> (Assuming that Ichi is Unicode U+4E00, which certainly looks like '-'.
> It's in the Unicode category "Letter, Other".)
> 
> The subtraction should be safe, as (1) it's only done on digits, and 
> (2) Unicode follows the ASCII character ordering (for code points 
> 0-127), which permits this subtraction.
> 
>  - Jon
> --------------------------------------------------------
>  
> NOTICE: If received in error, please destroy and notify sender.
> Sender does not waive confidentiality or privilege, and use is
> prohibited. 
>  
> _______________________________________________
> Mono-list maillist  -  [EMAIL PROTECTED] 
> http://lists.ximian.com/mailman/listinfo/mono-list 
