For the purpose specified, isLatin1 should just test for <= 0xFF. After all,
one would not want to exclude TAB, CR or LF ☺
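
A minimal sketch of that test, with the stricter reading John describes below shown for contrast. It is written in Python for brevity; the function names and the use of a Python str in place of a UCS-2 buffer are assumptions for illustration, not anything from Paul's code.

```python
def is_latin1(text: str) -> bool:
    """Permissive test: every code point is <= U+00FF.

    C0 controls (TAB, CR, LF, ...) and C1 controls all pass.
    """
    return all(ord(ch) <= 0xFF for ch in text)


def is_strict_latin1(text: str) -> bool:
    """Stricter variant (hypothetical name): graphic characters of
    ISO 8859-1 only, i.e. 0x20-0x7E and 0xA0-0xFF, so the C0 and C1
    control ranges and DEL are rejected.
    """
    return all(
        0x20 <= ord(ch) <= 0x7E or 0xA0 <= ord(ch) <= 0xFF
        for ch in text
    )
```

With these definitions, a string containing a TAB or a line break passes is_latin1() but fails is_strict_latin1(), and anything from the Latin Extended blocks (U+0100 and up) fails both.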

Mark
----- Original Message -----
From: "John Cowan" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Thursday, October 05, 2000 10:33
Subject: Re: Correct definition for an "isLatin1()" function


> "Rogers, Paul" wrote:
>
> > We're whipping up a little function named isLatin1() that returns true if
> > the (UCS-2) string in question is "all Latin1".
>
> [snip]
>
> > In other words, should we exclude the C0, C1, and Latin Extended code
> > values?
>
> Including or excluding C0 and C1 is a matter of taste.  If you mean
> "strictly containing characters in ISO 8859-1", then they're out.
> If you mean "representable in typical Latin-1 text files", then at least
> C0 is in, and C1 will do no great harm.  (Provided your Unicode
> characters don't originate from incorrect transcoding from CP 1252.)
>
> The Latin Extended blocks are definitely out.
>
> --
> There is / one art                   || John Cowan <[EMAIL PROTECTED]>
> no more / no less                    || http://www.reutershealth.com
> to do / all things                   || http://www.ccil.org/~cowan
> with art- / lessness                 \\ -- Piet Hein
