On Wed, Mar 28, 2007 at 10:46:01PM -0400, Daniel B. wrote:
> > For example, the unix "cut" program works automatically with UTF-8
> > text as long as the delimiter is a single byte, 
> 
> By "single byte," do you mean a character whose UTF-8 representation
> is a single byte?  (If you gave it the byte 0xBF, would it reject it
> as an invalid UTF-8 sequence, or would it then possibly cut in the middle
> of the byte sequence for a character (e.g., 0xEF 0xBF 0x00)?)

Apologies, I left out the word “character”: I should have written
“single-byte character”. Yes, I meant an ASCII delimiter.
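
To illustrate why an ASCII delimiter is safe, here is a minimal Python
sketch. It is not how cut itself is implemented, and the helper name
cut_field is made up for this example. It relies on one property of
UTF-8: every byte of a multibyte sequence is 0x80 or above, so a byte
below 0x80 can only ever be a complete ASCII character. A byte such as
0xBF is rejected here because splitting on it could land in the middle
of a character.

```python
def cut_field(line: bytes, delim: bytes, field: int) -> bytes:
    """Return the 1-based field of `line`, split on one ASCII byte.

    Hypothetical helper for illustration only; it works on raw bytes
    and never decodes the input.
    """
    # Only bytes below 0x80 are guaranteed never to occur inside a
    # multibyte UTF-8 sequence, so anything else is refused.
    if len(delim) != 1 or delim[0] >= 0x80:
        raise ValueError("delimiter must be a single ASCII byte")
    return line.split(delim)[field - 1]


line = "naïve:日本語:café".encode("utf-8")
second = cut_field(line, b":", 2)
print(second.decode("utf-8"))  # prints 日本語
```

Each field returned this way is still valid UTF-8 whenever the input
line was, because the split points always fall between characters.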

Rich

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
