On Fri, 2002-05-03 at 00:01, Dan Winship wrote:
> [moving from evolution-patches to evolution-hackers]
>
> Giving the user a choice could work. We can't *just* autodetect based on
> the UTF8. In a string like "The character for the word 'one' is
> <<U+4E00>>", the last character could be Japanese, Simplified Chinese,
> or Traditional Chinese (or even Korean sometimes?).

Duh, yeah no shit. The letter 'a' can be in just about everything too.

> Is there any way for the composer to know whether the user is using a
> Japanese or Chinese input method? (And are there separate Traditional
> and Simplified Chinese input methods?)
>
> And what about cut+paste? If you paste characters from a Big5 web page,
> does the composer know that or does it only get UTF8?

You use utf8.

> -- Dan
>
> On Wed, 2002-05-01 at 21:28, Not Zed wrote:
> > Yes we need this code, as we needed it when it was written.
> >
> > If nothing else, we could potentially use it to offer the user a choice
> > (as emacs does), or use it to determine if the user's locale charset is a
> > valid option, or even for things like autodetecting unknown data (using
> > locale as a hint).
> >
> > The code is priority based at least. So you just order the super-meta
> > charsets last, so they won't be chosen for normal text, and maybe even
> > special case them based on locale so utf8 is usually preferred.
> >
> > On Wed, 2002-05-01 at 21:42, Dan Winship wrote:
> > > > Order of preference seems to be iso-2022-jp, Shift-JIS, and then euc-jp
> > > > but neither Shift-JIS nor euc-jp are liked very much. They seem to only
> > > > be common in the US for example.
> > > >
> > > > Korean users tend to prefer euc-kr over iso-2022-kr.
> > >
> > > Do the character sets actually contain vastly different data? Will
> > > Shift-JIS, euc-jp, or iso-2022-kr ever get chosen?
> > > For that matter, will the Chinese charsets ever get autodetected or will
> > > it always use the Japanese ones instead (at least for messages
> > > containing only reasonably common characters)?
> > >
> > > Also, does this patch address the issue that a message containing both
> > > Greek and Russian *can* be encoded in iso-2022, but *should* be encoded
> > > in UTF8?
> > >
> > > What problem exactly is this supposed to be solving? If you want to
> > > autodetect Asian charsets for people who aren't replying to an
> > > Asian-language message and don't have an Asian locale, I don't think
> > > this will work.
> > >
> > > Heuristics that might work are "if it contains Korean characters (which
> > > are all in a certain range in Unicode), try EUC-KR", "if it contains
> > > Japanese hiragana/katakana (likewise), try iso-2022-jp", and "if it
> > > contains unihan characters but not kana, it's probably Chinese". I don't
> > > think you can autoselect between traditional and simplified Chinese
> > > charsets based on a UTF8 input stream though.
> > >
> > > -- Dan
> > >
> > >
> > > _______________________________________________
> > > Evolution-patches maillist - [EMAIL PROTECTED]
> > > http://lists.ximian.com/mailman/listinfo/evolution-patches
> > >
>
> _______________________________________________
> evolution-hackers maillist - [EMAIL PROTECTED]
> http://lists.ximian.com/mailman/listinfo/evolution-hackers
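
The script-range heuristics and the priority ordering discussed in this thread can be sketched roughly as below. This is an illustrative Python sketch only, not Evolution's code: the names guess_script, can_encode, pick_charset and CANDIDATES are made up for the example, the Unicode ranges are a simplified approximation, and the Chinese candidates (and their order) are an arbitrary assumption, since UTF8 input alone cannot tell traditional from simplified.

```python
# Hypothetical sketch; none of these names come from Evolution.

# Per-script candidates in priority order, following the preferences quoted
# in the thread. The "chinese" entry is an assumption: the thread names no
# Chinese charsets and says the two variants cannot be told apart.
CANDIDATES = {
    "korean": ["euc-kr", "iso-2022-kr"],
    "japanese": ["iso-2022-jp", "shift_jis", "euc-jp"],
    "chinese": ["gb2312", "big5"],
}


def is_hangul(ch):
    cp = ord(ch)
    # Hangul syllables, Hangul jamo, Hangul compatibility jamo
    return (0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF
            or 0x3130 <= cp <= 0x318F)


def is_kana(ch):
    # Hiragana and katakana blocks
    return 0x3040 <= ord(ch) <= 0x30FF


def is_unihan(ch):
    # Main CJK Unified Ideographs block only
    return 0x4E00 <= ord(ch) <= 0x9FFF


def guess_script(text):
    """Return a script hint for text, or None if no heuristic matches."""
    if any(is_hangul(c) for c in text):
        return "korean"
    if any(is_kana(c) for c in text):
        return "japanese"
    if any(is_unihan(c) for c in text):
        return "chinese"  # unihan without kana: "probably Chinese"
    return None


def can_encode(text, charset):
    try:
        text.encode(charset)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


def pick_charset(text, locale_charset=None):
    """Pick the first charset, in priority order, that can encode text."""
    if can_encode(text, "ascii"):
        return "us-ascii"
    candidates = []
    if locale_charset:
        candidates.append(locale_charset)  # locale used as a hint
    candidates += CANDIDATES.get(guess_script(text), [])
    candidates.append("utf-8")  # the catch-all charset is ordered last
    for charset in candidates:
        if can_encode(text, charset):
            return charset
    return "utf-8"
```

With this ordering, text mixing Greek and Russian matches no script hint and falls through to utf-8, while a lone U+4E00 simply gets the first Chinese candidate, which is exactly the ambiguity raised above and why a user choice or locale hint would still be needed.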
