Nick: You might consider using Microsoft's fallback mappings when converting to and from Microsoft encodings - for compatibility with Windows.
The Microsoft mapping tables on ftp.unicode.org do not show their fallback mappings, but there are mapping tables in the ICU tree that show them. These tables have been extracted from Windows using a tool that's in the ICU source tree as well. The files are not distributed in the normal ICU download but are provided as additional tables one can incorporate into the ICU 'dat' file if one chooses to do so.

The mappings show both fallbacks (many Unicode -> one code page character) as well as 'reverse fallbacks' (many code page characters -> one Unicode). The tables on ftp.unicode.org show neither.

You can find the tables under [ICU]/charset/data/ucm, I believe.

=Ed

-----Original Message-----
From: Nick Ing-Simmons [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 05, 2002 12:51 PM
To: [EMAIL PROTECTED]
Subject: Tossing another can of worms into the minefield...

What does the list think of the idea of fallbacks for "common" approximations
e.g. have Unicode->iso8859-1 map Microsoft cp1250

<U2018> \x91 |0 # LEFT SINGLE QUOTATION MARK
<U2019> \x92 |0 # RIGHT SINGLE QUOTATION MARK

Fallback map those to "'"

<U201C> \x93 |0 # LEFT DOUBLE QUOTATION MARK
<U201D> \x94 |0 # RIGHT DOUBLE QUOTATION MARK

And those to '"'

Likewise perhaps map iso8859-15's

<U20AC> \xA4 |0 # EURO SIGN

To

<U00A4> \xA4 |0 # CURRENCY SIGN

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/
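
For illustration only, here is a minimal Python sketch of the one-way (Unicode -> code page) fallback idea discussed above. It is not how Encode or ICU implement fallbacks. The FALLBACKS table, the handler name "approx_fallback" and the helper encode_with_fallback are all hypothetical names made up for this example; the table holds just the approximations proposed in the quoted message. Characters the target encoding can represent directly are left alone, and anything without a fallback entry still raises an error.

```python
import codecs

# Hypothetical fallback table (an assumption for this sketch): only the
# approximations proposed in the quoted message, nothing taken from Windows.
FALLBACKS = {
    "\u2018": "'",       # LEFT SINGLE QUOTATION MARK  -> apostrophe
    "\u2019": "'",       # RIGHT SINGLE QUOTATION MARK -> apostrophe
    "\u201C": '"',       # LEFT DOUBLE QUOTATION MARK  -> quotation mark
    "\u201D": '"',       # RIGHT DOUBLE QUOTATION MARK -> quotation mark
    "\u20AC": "\u00A4",  # EURO SIGN                   -> CURRENCY SIGN
}


def approx_fallback(error):
    """Error handler: substitute a fallback for each unencodable character."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    bad = error.object[error.start:error.end]
    try:
        replacement = "".join(FALLBACKS[ch] for ch in bad)
    except KeyError:
        # No fallback known for at least one character: keep strict behaviour.
        raise error
    # The codec encodes the returned string itself and resumes at error.end.
    return replacement, error.end


codecs.register_error("approx_fallback", approx_fallback)


def encode_with_fallback(text, encoding="iso8859-1"):
    """Encode text, applying FALLBACKS only where the encoding has no mapping."""
    return text.encode(encoding, errors="approx_fallback")
```

With this in place, encode_with_fallback("\u2018hi\u2019") yields the bytes for 'hi' in plain apostrophes. Note that decoding those bytes gives back U+0027, not U+2018, so the mapping does not round-trip; that is what distinguishes a fallback from the |0 entries shown in the tables.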