Re: Exporting Unicode UTF-8 from Word

2004-11-23 Thread Otto Stolz
Hello, Cristian Secar wrote: how do you export UTF-8 from MS Word? AFAIK, the only way to do that is to copy from Word, paste into Notepad, then save as UTF-8. AFAIK, this works only under Windows XP. It definitely does not work under Win 98/NT. Alternatively, copy from Word, paste into e-mail

Re: Ezra

2004-11-23 Thread Peter Kirk
On 22/11/2004 22:29, Christopher Fynn wrote: ... It doesn't really make it less of a hack since Windows just maps the glyphs encoded from F020 to F0FF in the cmap of Windows Symbol fonts to characters x20-xFF in the Windows code page for your locale (normally Windows ANSI if you are in the

My Querry

2004-11-23 Thread Harshal Trivedi
How can I make sure that a UTF-8 format string has terminated while encoding it, as compared to a C program string, which ends with the '\0' (NULL) character? Is there any special symbol or procedure to determine the end of a UTF-8 string, or is ASCII NULL '\0' used as is to indicate that? -- Harshal

Re: utf-8 and unicode fonts on LINUX

2004-11-23 Thread Otto Stolz
Kefas, you have written: I tried UTF-8 export to send an e-mail that contained several scattered unicode codepoints from the full 16-bit range from to from XP+Word to the university's Linux/Mozilla/OpenOffice/Kmail, enabled UTF-8 support. With very disappointing results. For UTF-8

RE: My Querry

2004-11-23 Thread Addison Phillips [wM]
Internationalization is an architecture. It is not a feature. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Harshal Trivedi Sent: 20041123 3:42 To: [EMAIL PROTECTED] Subject: My Querry How can I make sure that UTF-8 format string has terminated while

RE: My Querry

2004-11-23 Thread Mike Ayers
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Addison Phillips [wM] Sent: Tuesday, November 23, 2004 9:14 AM One of the nice things about UTF-8 is that the ASCII bytes from 0 to 7F hex (including the C0 control characters from \x00 through
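Editorial note on the property quoted above: in UTF-8, bytes 00 through 7F only ever stand for the corresponding ASCII characters, and every byte of a multi-byte sequence has its high bit set. A minimal sketch in C, assuming the four-byte form of UTF-8; the function name utf8_encode is hypothetical and does not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode scalar value as UTF-8.
   Writes up to 4 bytes into out and returns the number written,
   or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {
        /* ASCII range: one byte, identical to the ASCII value. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Every lead byte in the multi-byte branches is at least C0 and every trailing byte is at least 80, so a byte below 80 in a UTF-8 stream can only be an ASCII character, including '\0'.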

Re: My Querry

2004-11-23 Thread Kenneth Whistler
Harshal Trivedi asked: How can I make sure that UTF-8 format string has terminated while encoding it, as compared to C program string which ends with '\0' (NULL) character? You don't need to do anything special at all when using UTF-8 in C programs, as far as string termination goes. UTF-8
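Editorial note on the string-termination point above: a short sketch in C, assuming valid, '\0'-terminated UTF-8 input. The function names utf8_byte_length and utf8_code_point_count are hypothetical and do not come from the thread. Since no multi-byte UTF-8 sequence contains a zero byte, the standard strlen finds the end of the string unchanged; only counting characters needs extra care.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes before the terminating '\0'.
   Plain strlen is enough, because a zero byte in UTF-8 can
   only be the encoding of U+0000 itself. */
size_t utf8_byte_length(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Hypothetical helper: count code points by skipping
   continuation bytes (bit pattern 10xxxxxx).
   Assumes the input is valid UTF-8. */
size_t utf8_code_point_count(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the two-byte sequence C3 A9 (U+00E9), utf8_byte_length reports 2 while utf8_code_point_count reports 1; the terminator is the same single '\0' byte in both cases.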

RE: My Querry

2004-11-23 Thread Mike Ayers
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Harshal Trivedi Sent: Tuesday, November 23, 2004 3:42 AM How can I make sure that UTF-8 format string has terminated while encoding it, as compared to C program string which ends with '\0' (NULL)

RE: My Querry

2004-11-23 Thread Addison Phillips [wM]
Hi Mike, You misread my sentence, I think. I did NOT say that C language strings are compatible with UTF-8, but rather that the UTF-8 was designed with compatibility with C language "strings" (char*) in mind. The point of UTF-8 was actually to be

Re: My Querry

2004-11-23 Thread Philippe Verdy
From: Antoine Leca [EMAIL PROTECTED] I do not know what does mean fully compatible in such a context. For example, ASCII as designed allowed (please note I did not write was designed to allow) the use of the 8th bit as parity bit when transmitted as octet on a telecommunication line; I doubt such

Re: My Querry

2004-11-23 Thread Antoine Leca
Philippe Verdy écrivit: From: Antoine Leca [EMAIL PROTECTED] For example, ASCII as designed allowed (please note I did not write was designed to allow) the use of the 8th bit as parity bit when transmitted as octet on a telecommunication line; I doubt such use is compatible with UTF-8. The

Re: My Querry

2004-11-23 Thread Chris Jacobs
- Original Message - From: Addison Phillips [wM] To: Mike Ayers Cc: [EMAIL PROTECTED] Sent: Tuesday, November 23, 2004 8:15 PM Subject: RE: My Querry Hi Mike, You misread my sentence, I think. I did NOT say that C language strings are

RE: My Querry

2004-11-23 Thread D. Starner
Mike Ayers [EMAIL PROTECTED] writes: What is wrong? That UTF-8 (born FSS-UTF) was designed to be compatible with C language strings? Yes. A character encoding can be compatible with ASCII or C language strings, but not both, as those two were not compatible to begin with.

Another Querry

2004-11-23 Thread Harshal Trivedi
How can I determine the end of a UCS-2/UCS-4 string while encoding it in a C program? A normal C string ends with '\0' (ASCII NULL) as its terminating character. What symbol, pattern, or character in UCS-2 or UCS-4 substitutes for that ASCII NULL as the termination symbol? -- Harshal P. Trivedi Software Engineer

Re: My Querry

2004-11-23 Thread Mark E. Shoulson
Why is it that even simple questions asked about straightforward aspects of Unicode somehow mutate into hairsplitting arguments about who exactly meant what and which version does which...? I'm glad I didn't ask this question here! ~mark

Re: My Querry

2004-11-23 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: By saying UTF-8 is fully compatible with ASCII, it says that any ASCII-only encoded file needs no reencoding of its bytes to make it UTF-8. Note that this is only true for the US version of ASCII (well, ASCII is normally designating

Re: My Querry

2004-11-23 Thread John Cowan
Antoine Leca scripsit: Sorry, no: there is no requirement to clear it. You are assuming something about the way data are handled. When you handle ASCII data using octets, you can perfectly, and conformantly, keep some other data (being parity or whatever) inside the 8th bit; so with even

Re: Another Querry

2004-11-23 Thread Doug Ewell
Harshal Trivedi harshal dot trivedi at gmail dot com wrote: How can I determine end of UCS-2/UCS-4 string while encoding it in C program? Normal C string ends with '\0' - ASCII NULL as terminating character. What symbol, pattern or character in UCS-2 or UCS-4 substitutes that ASCII NULL as
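Editorial note on the UCS-2/UCS-4 question above: the reply is truncated here, so what follows is not taken from it. A common convention in C is to end such strings with one all-zero code unit, 16 bits wide for UCS-2 and 32 bits wide for UCS-4. A minimal sketch under that assumption; the function names ucs2_length and ucs4_length are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of 16-bit code units before the
   terminating 0x0000 code unit. */
size_t ucs2_length(const uint16_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}

/* Hypothetical helper: number of 32-bit code units before the
   terminating 0x00000000 code unit. */
size_t ucs4_length(const uint32_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}
```

Byte-oriented functions such as strlen are not suitable for these buffers: the code unit 0x0048 already contains a zero byte, so the terminator has to be tested at the full width of the code unit.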