[moving from evolution-patches to evolution-hackers]

Giving the user a choice could work. We can't *just* autodetect based on the UTF8 data. In a string like "The character for the word 'one' is 一" (U+4E00), the last character could be Japanese, Simplified Chinese, or Traditional Chinese (or even Korean sometimes?).
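To make the ambiguity concrete, here is a minimal Python sketch (not Evolution code) that checks which legacy charsets can encode a given string. The candidate list and the helper name `charsets_that_fit` are assumptions made for illustration; the codec names are Python's standard ones.

```python
# Hypothetical candidate list for illustration; not Evolution's actual table.
CANDIDATES = ["iso2022_jp", "shift_jis", "euc_jp", "gb2312", "big5", "euc_kr"]


def charsets_that_fit(text, candidates=CANDIDATES):
    """Return every candidate charset that can encode all of `text`."""
    fits = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no mapping in this charset.
            continue
        fits.append(name)
    return fits


sample = "The character for the word 'one' is \u4e00"
# Japanese and both Chinese charsets all accept the string, so
# "which charsets fit" alone cannot pick a language.
print(charsets_that_fit(sample))
```

Since more than one charset fits, some extra signal (locale, input method, or asking the user) is needed to choose between them.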
Is there any way for the composer to know whether the user is using a Japanese or Chinese input method? (And are there separate Traditional and Simplified Chinese input methods?) And what about cut+paste? If you paste characters from a Big5 web page, does the composer know that or does it only get UTF8?

-- Dan

On Wed, 2002-05-01 at 21:28, Not Zed wrote:
> Yes we need this code, as we needed it when it was written.
>
> If nothing else, we could potentially use it to offer the user a choice
> (as emacs does), or use it to determine if the user's locale charset is a
> valid option, or even for things like autodetecting unknown data (using
> locale as a hint).
>
> The code is priority-based at least. So you just order the super-meta
> charsets last, so they won't be chosen for normal text, and maybe even
> special case them based on locale so utf8 is usually preferred.
>
> On Wed, 2002-05-01 at 21:42, Dan Winship wrote:
> > > Order of preference seems to be iso-2022-jp, Shift-JIS, and then euc-jp
> > > but neither Shift-JIS nor euc-jp are liked very much. They seem to only
> > > be common in the US for example.
> > >
> > > Korean users tend to prefer euc-kr over iso-2022-kr.
> >
> > Do the character sets actually contain vastly different data? Will
> > Shift-JIS, euc-jp, or iso-2022-kr ever get chosen?
> > For that matter, will the Chinese charsets ever get autodetected or will
> > it always use the Japanese ones instead (at least for messages
> > containing only reasonably common characters)?
> >
> > Also, does this patch address the issue that a message containing both
> > Greek and Russian *can* be encoded in iso-2022, but *should* be encoded
> > in UTF8?
> >
> > What problem exactly is this supposed to be solving? If you want to
> > autodetect Asian charsets for people who aren't replying to an
> > Asian-language message and don't have an Asian locale, I don't think
> > this will work.
> >
> > Heuristics that might work are "if it contains Korean characters (which
> > are all in a certain range in Unicode), try EUC-KR", "if it contains
> > Japanese hiragana/katakana (likewise), try iso-2022-jp", and "if it
> > contains unihan characters but not kana, it's probably Chinese". I don't
> > think you can autoselect between traditional and simplified Chinese
> > charsets based on a UTF8 input stream though.
> >
> > -- Dan
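The range-based heuristics quoted above could be sketched roughly as follows in Python. This is an illustration only, not the patch under discussion: the function name `guess_charset`, the return labels, and the order in which the checks are applied are assumptions. The code point ranges are the standard Unicode blocks for Hangul, kana, and the basic CJK Unified Ideographs.

```python
def guess_charset(text):
    """Return a charset hint for `text`, or None if no CJK script is seen.

    Hypothetical helper. "chinese-unknown" means "probably Chinese, but
    Traditional vs. Simplified cannot be decided from the text alone".
    """
    has_hangul = has_kana = has_han = False
    for ch in text:
        cp = ord(ch)
        if (0xAC00 <= cp <= 0xD7A3        # Hangul Syllables
                or 0x1100 <= cp <= 0x11FF  # Hangul Jamo
                or 0x3130 <= cp <= 0x318F):  # Hangul Compatibility Jamo
            has_hangul = True
        elif 0x3040 <= cp <= 0x30FF:      # Hiragana and Katakana
            has_kana = True
        elif 0x4E00 <= cp <= 0x9FFF:      # CJK Unified Ideographs (unihan)
            has_han = True

    if has_hangul:
        return "EUC-KR"
    if has_kana:
        return "ISO-2022-JP"
    if has_han:
        # Unihan without kana: probably Chinese, variant undetermined.
        return "chinese-unknown"
    return None


print(guess_charset("\u3072\u3089\u304c\u306a"))  # hiragana only
print(guess_charset("\u4e00"))                    # a lone unihan character
```

A Japanese message written entirely in kanji would still be labelled as Chinese by this sketch, which is why a locale hint or a user choice remains useful.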
