Re: Hex-byte pictures

2003-11-11 Thread Anto'nio Martins-Tuva'lkin
On 2003.11.10, 10:46, Philippe Verdy [EMAIL PROTECTED] wrote: However, some symbols used as function indicators are now quite omnipresent, and easily recognized with a well-defined meaning or function. Some of them are encoded in Wingdings or Webdings, but some others may merit their

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators in Unicode.

RE: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Simon Butcher
Hi Philippe! When dealing with protocol specifications, there's often a need for characters like these, too, since hex byte pictures are unambiguous. I have a DEC dumb terminal around here somewhere which also uses them when debugging control characters. I suppose you could argue it's

RE: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Simon Butcher
BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators in Unicode. Else it would mean that

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: - the attachment symbol (trombone in French, Brobriefklammer in German, I don't know the term in English), Paper clip. -Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/

Byte order mark (?) mars Unicode homepage

2003-02-14 Thread Michael Everson
Under Mac OS X, Explorer 5.2.2 displays a euro sign above the red title bar, creating a white bar which pushes the red bar down. This doesn't occur on other pages. Safari doesn't display the euro sign but the white bar is there. Same for OmniWeb. I tried to use UnicodeChecker in the OS X

Double Byte Character Set (DBCS)

2002-11-13 Thread Magda Danish (Unicode)
device but was developed primarily for English speaking countries. We are now looking to expand the market for this product into countries such as China. To achieve this I have been informed we need to enable our application for Double Byte Character Set (DBCS). This confuses me when I

Re: Double Byte Character Set (DBCS)

2002-11-13 Thread Markus Scherer
-Original Message- We are now looking to expand the market for this product into countries such as China. To achieve this I have been informed we need to enable our application for Double Byte Character Set (DBCS). DBCS is an old, pre-Unicode term for character sets with Chinese

RE: Euro in Windows double-byte code pages

2002-03-27 Thread Cathy Wissink
Doug Ewell [mailto:[EMAIL PROTECTED]] Ken Krugler [EMAIL PROTECTED] is wondering: I'm wondering if anybody (say, for example, Kenneth) has information on the addition of the Euro to the various Windows double-byte code pages (CP950, CP932, CP936, CP949). In particular, was the Euro added

Euro in Windows double-byte code pages

2002-03-26 Thread Ken Krugler
on the addition of the Euro to the various Windows double-byte code pages (CP950, CP932, CP936, CP949). In particular, was the Euro added to CP932 for Windows 2000-J (at location 0x80), or is this just a wild rumor? Thanks, -- Ken -- Ken Krugler TransPac Software, Inc. http://www.transpac.com +1 530

Re: Euro in Windows double-byte code pages

2002-03-26 Thread Doug Ewell
Ken Krugler [EMAIL PROTECTED] is wondering: I'm wondering if anybody (say, for example, Kenneth) has information on the addition of the Euro to the various Windows double-byte code pages (CP950, CP932, CP936, CP949). In particular, was the Euro added to CP932 for Windows 2000-J

Re: Euro in Windows double-byte code pages

2002-03-26 Thread Jungshik Shin
On Tue, 26 Mar 2002, Ken Krugler wrote: I'm wondering if anybody (say, for example, Kenneth) has information on the addition of the Euro to the various Windows double-byte code pages (CP950, CP932, CP936, CP949). In particular, was the Euro I'm not sure of CP950/932/936, but CP949

RE: How to print the byte representation of a wchar_t string with non-ASCII ...

2001-11-02 Thread Tay, William
])); printf("Byte rep: "); for (i = 0; i < strlen(argv[1]); i++) printf("%02X ", argv[1][i]); mbstowcs(wstr, argv[1], 20); printf("stdin in WC: %ls, wcslen: %d\n", wstr, wcslen(wstr)); // Guess this is the only way to see the byte rep of wstr string wcstombs(mstr, wstr, 20); printf("Byte

RE: How to print the byte representation of a wchar_t string with non-ASCII ...

2001-11-02 Thread Addison Phillips [wM]
, William Sent: Friday, November 02, 2001 9:38 AM To: Unicode Mailing List Subject: RE: How to print the byte representation of a wchar_t string with non-ASCII ... Dear Unicoders & C gurus, Thank you for your comments on my previous posting. They help. Have a question while digesting them on machine

RE: How to print the byte representation of a wchar_t string with non-ASCII ...

2001-11-02 Thread Jungshik Shin
Hi, Since Addison already wrote about one aspect of your problem, I'll raise another issue. mbstowcs(wstr, argv[1], 20); printf("stdin in WC: %ls, wcslen: %d\n", wstr, wcslen(wstr)); // Guess this is the only way to see the byte rep of wstr string This is not the case. You may

Re: How to print the byte representation of a wchar_t string with non-ASCII ...

2001-11-01 Thread Jungshik Shin
are converted to a byte stream according to the currently selected locale. Eventually it has But won't this approach fail as soon as we hit a 0x00 byte (i.e. the high 8 bits of any Latin-1 character)? I'm not sure what you're alluding to here. As long as all characters in wstr belong

Re: How to print the byte representation of a wchar_t string with non-ASCII ...

2001-11-01 Thread DougEwell2
In a message dated 2001-11-01 12:23:58 Pacific Standard Time, [EMAIL PROTECTED] writes: But won't this approach fail as soon as we hit a 0x00 byte (i.e. the high 8 bits of any Latin-1 character)? I'm not sure what you're alluding to here. As long as all characters in wstr belong

How to print the byte representation of a wchar_t string with non-ASCII chars?

2001-10-31 Thread Tay, William
Hi, For debugging purposes, I'd like to find out how I can print the byte representation of a wchar_t string. Say in C, I have wchar_t wstr[10] = Lfran; Is there any printf or wchar equivalent function (using appropriate format template) that prints out the string as 66 72 C3 A1 6E
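One way to get output of the kind asked for above is to convert the wide string to the locale's multibyte encoding and print each resulting byte as unsigned hex. The following is only a minimal sketch: the helper names hex_bytes and wcs_to_hex and the 128-byte buffer are assumptions made for illustration, not anything proposed in the thread. The bytes it produces depend on the current locale; C3 A1 for U+00E1 would only appear if a UTF-8 locale has been selected with setlocale() beforehand.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not from the thread): format len bytes of buf as
   space-separated uppercase hex, e.g. "66 72", into out. */
static void hex_bytes(const unsigned char *buf, size_t len,
                      char *out, size_t outsz)
{
    size_t pos = 0;
    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    /* Each byte needs at most 3 characters plus the terminating NUL. */
    for (size_t i = 0; i < len && pos + 4 <= outsz; i++) {
        pos += (size_t)snprintf(out + pos, outsz - pos,
                                i ? " %02X" : "%02X", (unsigned)buf[i]);
    }
}

/* Convert wstr to the current locale's multibyte encoding, then hex-dump
   the converted bytes. Returns 0 on success, or -1 if some character
   cannot be represented in the current locale. */
static int wcs_to_hex(const wchar_t *wstr, char *out, size_t outsz)
{
    char mbs[128];                        /* assumed large enough for a demo */
    size_t n = wcstombs(mbs, wstr, sizeof mbs);
    if (n == (size_t)-1)
        return -1;
    hex_bytes((const unsigned char *)mbs, n, out, outsz);
    return 0;
}
```

Treating the bytes as unsigned char matters here: passing a plain char above 0x7F to %02X can sign-extend on platforms where char is signed.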

Re: The byte

2001-09-15 Thread てんどうりゅうじ
To: [EMAIL PROTECTED]; Cc: Date: 01/09/15 4:39 Subject: Re: The byte [EMAIL PROTECTED] wrote: The existence of the byte sucks. Well, I suggest therefore that you do Civilization a favor and incidentally leave your indelible Mark on History by devoting every waking moment of the

The byte

2001-09-14 Thread てんどうりゅうじ
The existence of the byte sucks. じゅういっちゃん (Juuitchan) Well, I guess what you say is true, I could never be the right kind of girl for you, I could never be your woman - White Town --- Original Message --- From: Ma

Re: The byte

2001-09-14 Thread Rick McGowan
[EMAIL PROTECTED] wrote: The existence of the byte sucks. Well, I suggest therefore that you do Civilization a favor and incidentally leave your indelible Mark on History by devoting every waking moment of the rest of your life to stamping out the accursed byte. Rick

RE: Byte Order Marks

2001-04-20 Thread Yves Arrouye
Then why is ICU mapping UTF-16 to UTF16_PlatformEndian and not UTF16_BigEndian? ICU does not do Unicode-signature or other encoding detection as part of a converter. When you get text from some protocol, you need to instantiate a converter according to what you know about the

RE: Byte Order Marks

2001-04-20 Thread Yves Arrouye
On Thu, Apr 19, 2001 at 06:24:47PM -0700, Markus Scherer wrote: On the other hand, if you get a file from your platform and it is in 16-bit Unicode, then you would appreciate the convenience of the auto-endian alias. But nothing should be spitting out platform-endian UTF-16! In the

Re: Byte Order Marks

2001-04-20 Thread Markus Scherer
, there is more work necessary. It seems to me that a converter API with its ability to take one byte at a time, and no other way to pass additional information ("I know the language of the text..."), is not the best way to implement this. On output, writing a BOM/signature is easy: if you kno

Byte Order Marks

2001-04-19 Thread Tomas McGuinness
Hi, A quick question relating to the Byte Order Mark of UCS-2. If it's absent, is it safe to assume any particular order (i.e. big- or little-endian)? I am writing a function to rearrange from big- to little-endian, but without a byte order mark I'm not sure what the order is. Is there any
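The rearranging step mentioned above does not itself depend on knowing the order; only the decision whether to apply it does. As a rough sketch, assuming the text is already held as an array of 16-bit code units (the function name swap_ucs2 is made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit code unit in place, turning
   big-endian units into little-endian ones or vice versa. */
static void swap_ucs2(uint16_t *units, size_t count)
{
    assert(units != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        units[i] = (uint16_t)((units[i] << 8) | (units[i] >> 8));
}
```

If the first unit reads as 0xFFFE after loading, the data is in the opposite order from the one assumed, and one pass of the swap turns that unit into the byte order mark 0xFEFF.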

Re: Byte Order Marks

2001-04-19 Thread Markus Scherer
There is an RFC about UTF-16 that explains this. If the text is labeled by the protocol as: charset=UTF-16, then the first two bytes are the byte order mark; charset=UTF-16BE, then it is big-endian and the first two bytes are just text; charset=UTF-16LE, then it is little-endian and the first two
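The labeling rules quoted above could be sketched roughly as follows. This is an illustration under assumptions, not code from the RFC or from ICU: the function name utf16_order, the enum, and the exact-match label comparison are all invented here, and the fallback to big-endian for an unmarked "UTF-16" stream follows the advice given elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum byte_order { ORDER_BE, ORDER_LE };

/* Decide the byte order of UTF-16 text from its charset label.
   *skip receives the number of leading bytes to drop before decoding
   (2 when a byte order mark was consumed, otherwise 0).
   Returns 0 on success, -1 for a label this sketch does not know. */
static int utf16_order(const char *charset, const unsigned char *data,
                       size_t len, enum byte_order *order, size_t *skip)
{
    assert(charset != NULL && order != NULL && skip != NULL);
    *skip = 0;
    /* Explicit labels: no BOM is expected; the first two bytes are text. */
    if (strcmp(charset, "UTF-16BE") == 0) { *order = ORDER_BE; return 0; }
    if (strcmp(charset, "UTF-16LE") == 0) { *order = ORDER_LE; return 0; }
    if (strcmp(charset, "UTF-16") != 0)
        return -1;
    /* Plain "UTF-16": look for a BOM and skip it if present. */
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF) {
        *order = ORDER_BE;
        *skip = 2;
    } else if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        *order = ORDER_LE;
        *skip = 2;
    } else {
        *order = ORDER_BE;   /* assumption: no BOM means big-endian */
    }
    return 0;
}
```

Real charset labels are matched case-insensitively; the exact strcmp here only keeps the sketch short.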

RE: Byte Order Marks

2001-04-19 Thread Yves Arrouye
If you don't have any clue about the byte order, but you know it is UTF-16, then assume BE. Then why is ICU mapping UTF-16 to UTF16_PlatformEndian and not UTF16_BigEndian? I know that was a difference between ICU and my library, and when I asked this question a while ago I was told that despite

Fwd: Re: Byte Order Marks

2001-04-19 Thread Asmus Freytag
Date: Thu, 19 Apr 2001 12:59:43 -0700 To: Tomas McGuinness [EMAIL PROTECTED] From: Asmus Freytag [EMAIL PROTECTED] Subject: Re: Byte Order Marks At 02:58 PM 4/19/01 +0200, you wrote: If it's absent, is it safe to assume any particular order (i.e. Big or Little Endian?) The default order is Big

Re: Byte Order Marks

2001-04-19 Thread Markus Scherer
Yves Arrouye wrote: If you don't have any clue about the byte order, but you know it is UTF-16, then assume BE. Then why is ICU mapping UTF-16 to UTF16_PlatformEndian and not UTF16_BigEndian? ICU does not do Unicode-signature or other encoding detection as part of a converter. When you

Re: Byte Order Marks

2001-04-19 Thread David Starner
On Thu, Apr 19, 2001 at 06:24:47PM -0700, Markus Scherer wrote: On the other hand, if you get a file from your platform and it is in 16-bit Unicode, then you would appreciate the convenience of the auto-endian alias. But nothing should be spitting out platform-endian UTF-16! In the case that

Byte Order Marks

2001-04-10 Thread Tomas McGuinness
Hi, When looking at a document would it be safe to assume that if you found any of the following Byte Order Marks * 0xFFFE (UCS-2 Little Endian) * 0xFEFE (UCS-2 Big Endian) * 0xEFBBBF (UTF-8) That the document is encoded with that encoding format. That means that if I found

Re: Byte Order Marks

2001-04-10 Thread DougEwell2
In a message dated 2001-04-10 3:04:09 Pacific Daylight Time, [EMAIL PROTECTED] writes: When looking at a document would it be safe to assume that if you found any of the following Byte Order Marks *0xFFFE (UCS-2 Little Endian) *0xFEFE (UCS-2 Big Endian) should be 0xFEFF
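With the big-endian value corrected to 0xFEFF, checking for the three signatures discussed above might look like the sketch below. The names detect_bom and enum bom are hypothetical, and a match should be read as a strong hint rather than proof of the encoding: the bytes FF FE, for instance, also begin the UTF-32 little-endian signature FF FE 00 00, which this sketch does not distinguish.

```c
#include <assert.h>
#include <stddef.h>

enum bom { BOM_NONE, BOM_UTF16_BE, BOM_UTF16_LE, BOM_UTF8 };

/* Report which signature, if any, the buffer starts with.
   The byte values are listed in the order they appear in the file. */
static enum bom detect_bom(const unsigned char *p, size_t len)
{
    assert(p != NULL || len == 0);
    if (len >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return BOM_UTF8;        /* U+FEFF encoded in UTF-8 */
    if (len >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return BOM_UTF16_BE;    /* U+FEFF stored big-endian */
    if (len >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return BOM_UTF16_LE;    /* U+FEFF stored little-endian */
    return BOM_NONE;
}
```

A caller would typically fall back to protocol labels or other outside knowledge when the result is BOM_NONE.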

Octet vs byte (was Unicode FAQ Addendum)

2000-07-22 Thread Patrick Andries
- Original Message - From: "Doug Ewell" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: 22 July 2000 21:24 Subject: Re: Unicode FAQ addendum John G. Otto, alias "jgo" [EMAIL PROTECTED], wrote: Addison wrote: 1. 1 byte != 1 ch