charset table and how to use them
How can I utilize c-client charset table to convert characters? I know that utf8_text() can convert the characters, but I'm having mixed results.

Thanks,
Shawn

--
For information about this mailing list, and its archives, see:
http://www.washington.edu/imap/c-client-list.html
Re: charset table and how to use them
On Thu, 24 Jun 2004, Shawn Walker wrote:
> How can I utilize c-client charset table to convert characters? I know that utf8_text() can convert the characters, but I'm having mixed results.

What, exactly, are you trying to do?

utf8_text() is the routine to convert from arbitrary character sets into UTF-8 (normalized with pre-composed characters).

The new utf8_cstext() routine will convert normalized pre-composed UTF-8 into most character sets (as best it can; Greek text doesn't convert well into Chinese...).

To do conversion from one non-UTF-8 character set into another non-UTF-8 character set, you can use the new utf8_cstocstext() routine (I forget if this made it into imap-2004, but it's in imap-2004a).

You can do things faster and with less memory if you set up the conversion tables yourself using utf8_rmap() -- Pine does this; look at the routines in strings.c and filter.c in the Pine sources.

-- Mark --
http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.
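To make the "arbitrary character set into UTF-8" step concrete, here is a minimal sketch of the simplest case, ISO-8859-1 to UTF-8, written by hand in plain C. It does not call c-client and is not how utf8_text() is implemented; latin1_to_utf8() and its parameters are hypothetical names chosen for illustration. It relies only on the fact that every ISO-8859-1 byte value equals its Unicode code point, so bytes below 0x80 are copied and bytes from 0x80 to 0xFF become a two-byte UTF-8 sequence.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustration only: hand-rolled ISO-8859-1 -> UTF-8 conversion.
 * This is NOT c-client code; latin1_to_utf8() is a hypothetical helper.
 *
 * Converts srclen bytes from src into dst (capacity dstcap bytes) and
 * NUL-terminates the result.  Returns the number of UTF-8 bytes written,
 * not counting the NUL, or (size_t)-1 if dst is too small.
 */
size_t latin1_to_utf8(const unsigned char *src, size_t srclen,
                      unsigned char *dst, size_t dstcap)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < srclen; i++) {
        unsigned char c = src[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Keep one byte in reserve for the terminating NUL. */
        if (out + need >= dstcap)
            return (size_t)-1;

        if (c < 0x80) {
            /* ASCII range: copied unchanged. */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF: two-byte sequence 110xxxxx 10xxxxxx. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }

    /* Only reachable with out >= dstcap when srclen == 0 and dstcap == 0. */
    if (out >= dstcap)
        return (size_t)-1;

    dst[out] = '\0';
    return out;
}
```

An output buffer of 2 * srclen + 1 bytes is always large enough for this particular conversion. Other source character sets need real mapping tables, which is the part the c-client routines named above take care of.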
Re: checking for new mail in all mailboxes
On Thu, 24 Jun 2004, David Feldman wrote:
> How does one (at both the IMAP command and c-client level) check for new mail efficiently in all a user's mailboxes at once?

The short answer is: you can't.

The longer answer is: identify a set of mailboxes which merit further probing, and focus your checking on them.

In a strictly check-all-mailboxes environment, do a LIST and note which mailboxes come back with \Marked status (if you're paranoid, then choose the mailboxes which don't have \Unmarked status). Then do a STATUS on each of these to check them further, or just SELECT them if the user wants them opened.

Alternatively, discriminate between incoming mailboxes and mailboxes which are strictly archive. Don't even consider the archive mailboxes, which for most users greatly outnumber the incoming mailboxes.

If you have a reasonable number of incoming mailboxes, then just have all of these mailboxes SELECTed in separate IMAP sessions; this is the most efficient, best real-time, and least-costly way to monitor a set of mailboxes.

Put another way: 5 IMAP sessions monitoring 5 mailboxes is less costly (often *MUCH* less costly) than repeatedly probing those 5 mailboxes in one IMAP session. Sessions are cheap, especially if you use the IDLE command. Polls are not cheap, especially with mail stores that oblige the server to parse the entire mailbox to satisfy a STATUS.

-- Mark --
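The LIST-then-STATUS filtering described above can be sketched as a small line filter in plain C. This is an illustration under stated assumptions, not c-client code: it assumes you already have the raw untagged LIST response lines as NUL-terminated strings, and the helper names (is_list_line, list_has_flag, should_probe) are hypothetical. The `paranoid` argument selects between the two policies in the message: probe only mailboxes flagged \Marked, or probe every mailbox not flagged \Unmarked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes of a and b. */
static int ci_equal(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Return 1 if line looks like an untagged LIST response, else 0. */
static int is_list_line(const char *line)
{
    return strlen(line) >= 7 && ci_equal(line, "* LIST ", 7);
}

/*
 * Return 1 if the parenthesized attribute list of a LIST response line
 * contains flag (for example "\\Marked"), compared case-insensitively.
 */
int list_has_flag(const char *line, const char *flag)
{
    assert(line != NULL && flag != NULL);

    if (!is_list_line(line))
        return 0;

    const char *p = strchr(line, '(');
    const char *end = p ? strchr(p, ')') : NULL;
    size_t flen = strlen(flag);

    if (end == NULL)
        return 0;

    p++;                                   /* step past '(' */
    while (p < end) {
        while (p < end && *p == ' ')       /* skip separators */
            p++;
        const char *tok = p;
        while (p < end && *p != ' ')       /* scan one attribute */
            p++;
        if ((size_t)(p - tok) == flen && ci_equal(tok, flag, flen))
            return 1;
    }
    return 0;
}

/*
 * Decide whether a mailbox merits a follow-up STATUS (or SELECT).
 * paranoid == 0: probe only mailboxes flagged \Marked.
 * paranoid != 0: probe every mailbox that is not flagged \Unmarked.
 */
int should_probe(const char *line, int paranoid)
{
    if (!is_list_line(line))
        return 0;
    if (paranoid)
        return !list_has_flag(line, "\\Unmarked");
    return list_has_flag(line, "\\Marked");
}
```

For example, should_probe() accepts `* LIST (\Marked \HasNoChildren) "/" INBOX` under either policy, while `* LIST (\HasNoChildren) "/" Archive/2003` is accepted only under the paranoid policy. Whether a given server sets \Marked and \Unmarked reliably is server-dependent, which is the reason the paranoid variant exists.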
Re: charset table and how to use them
On Thu, 24 Jun 2004 11:42:26 -0700 (PDT), Mark Crispin wrote:
> On Thu, 24 Jun 2004, Shawn Walker wrote:
>>> To do conversion from one non-UTF-8 character set into another non-UTF-8 character set, you can use the new utf8_cstocstext() routine (I forget if this made it into imap-2004, but it's in imap-2004a).
>>>
>>> You can do things faster and with less memory if you set up the conversion tables yourself using utf8_rmap() -- Pine does this; look at the routines in strings.c and filter.c in the Pine sources.
>>
>> Basically convert ISO-8859-1, UTF-8, ISO-8859-15, etc. characters to whatever I need in order to display the characters.
>
> Unless you are writing a text-based client for UNIX, you should convert everything into UTF-8 and use exclusively Unicode for display. Even if you are writing a text-based client for UNIX, you should still consider using Unicode (UTF-8 is just a means of representing Unicode) as newer versions of UNIX now support UTF-8.
>
> The only purpose for any other character set is to accept data in the other character set in incoming mail and files (and possibly from the user's keyboard -- although Unicode is preferred here too), and if necessary to send mail in a non-Unicode character set (although this is doomed to deprecation).
>
> Put another way, most programs should only need utf8_text() and utf8_cstext().
>
> Or, if you feel that you need to be able to convert ISO-8859-15 to KOI8-R or ISO-2022-JP or BIG5, you are probably doing something wrong.

The program isn't running on UNIX. It's running on Windows with Outlook (I know, bear with me. ;)

I have a string Iñtërnâtiônàlizætiøn that I need to encode before putting it in the body contents of BODY. I don't have utf8_cstocstext(), but would that function do what I need to do? I tried utf8_cstext(), but it didn't do anything (I passed UTF-8 for the charset).

Thanks,
Shawn