Hi Doug,

Yes, the UNICODE we are using is UTF16 (VC++). Yes, the control
characters are entirely below 0x20 in ASCII.

Thanks a lot for the information. So, I believe we are safe using the
conversion without breaking the hardware.

Thanks and Regards,
Abdij Bhat
Kshema Technologies
mailto:[EMAIL PROTECTED]
www.kshema.com
Phone: +91 80 860 3600 (Extension 2102)
Fax: +91 80 860 3372

-----Original Message-----
From: Doug Ewell [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 05, 2003 11:45 AM
To: Unicode Mailing List
Cc: Abdij Bhat
Subject: Re: UTF8 and COntrol Characters

Abdij Bhat <Abdij dot Bhat at kshema dot com> wrote:

> If a UNICODE string is converted to UTF8, will the UTF8 encoded
> string contain any control character or escape sequences? If so, is it
> possible to eliminate the same?

By "UNICODE" I assume you mean UTF-16, which is one encoding form of
Unicode (as is UTF-8).

By "control character[s] or escape sequences" I assume you mean
characters below 0x20 in ASCII, or below U+0020 in Unicode (any encoding
form). (That is, I assume we are not talking about the so-called C1
control characters from U+0080 to U+009F.)

A UTF-8 string will contain control characters if and only if the
corresponding UTF-16 string contains characters in the control range
(below U+0020). For strings that consist entirely of ASCII characters,
the UTF-8 representation is identical to the ASCII representation.

For the specification of UTF-8, including some examples that should help
answer your questions, see
http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf (pp. 24-25).

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/
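
The "if and only if" statement in the quoted reply can be checked with a
small sketch. It relies on one property of UTF-8: code points below
U+0080 are stored as a single byte equal to the code point, and every
byte of a multi-byte sequence is 0x80 or higher, so a byte below 0x20 can
only come from a source character below U+0020. The function names
utf16_to_utf8 and has_c0_control are hypothetical, and replacing unpaired
surrogates with U+FFFD is an assumption made for this example, not
something taken from the thread or from any VC++ API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a UTF-16 string as UTF-8.
// Assumption for this sketch: unpaired surrogates become U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with a following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // Low surrogate with no preceding high surrogate.
        }

        if (cp < 0x80) {
            // ASCII range, including controls: one byte, value unchanged.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: true if any byte is a C0 control (below 0x20).
bool has_c0_control(const std::string& utf8) {
    for (unsigned char b : utf8) {
        if (b < 0x20) return true;
    }
    return false;
}
```

For instance, has_c0_control(utf16_to_utf8(u"A\tB")) is true because the
source already holds U+0009, while u"\u0100" encodes to the bytes C4 80,
neither of which is below 0x20, even though the UTF-16 code unit 0x0100
contains the raw bytes 0x01 and 0x00.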

