Kiyokazu SUTO writes:

> I don't think that SqWebMail can handle ISO-2022-JP (character
> encoding scheme (CES) used in Japanese e-mail) even if it would
> include any mapping table between coded character sets (CCSs) in the
> scheme and Unicode. 
> 
> ISO-2022-JP is a 7-bit CES (i.e., it uses one or two octets in the range
> 0x21..0x7E to represent a character), and switches among 4 CCSs (US-ASCII,
> JIS X 0201 Roman, JIS C 6226, and JIS X 0208) using the following escape
> sequences:
> 
>   US-ASCII        : 0x1B 0x28 0x42
>   JIS X 0201 Roman: 0x1B 0x28 0x4A
>   JIS C 6226      : 0x1B 0x24 0x40
>   JIS X 0208      : 0x1B 0x24 0x42 
> 
> On the other hand, SqWebMail assumes that the range 0x21..0x7E is only
> used by US-ASCII, and that any Non US-ASCII character is represented
> by one or more octets in the range 0x80..0xFF.

No, not really.  SqWebMail's only assumption is that a character set can be
mapped to or from Unicode.  Non-US-ASCII charsets can generally use
0x21..0x7E, except for the HTML defanging issue, which I'll mention shortly.

Someone else mailed me some links to look over.  It appears that the major
stumbling block is that the Unicode mapper currently does not carry state
between successive mappings to or from Unicode.  SqWebMail first maps the
message's text/plain content to Unicode, according to its MIME charset, then
from Unicode to the browser client's MIME charset.  To do this correctly
with iso-2022-jp, the mapper must remember which character set the last
escape sequence selected, and that is exactly the state that is lost
between successive calls to the Unicode functions.
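
To make the state problem concrete, here is a minimal sketch in Python.  It
is only an illustration, not SqWebMail's code: it assumes Python's standard
iso2022_jp codec, and the two-chunk split merely stands in for successive
calls to a mapper.  The sample string is an arbitrary choice.

```python
import codecs

text = "日本語"
# Encodes as ESC $ B, three two-octet characters, then ESC ( B.
data = text.encode("iso2022_jp")

# Split the stream on a character boundary, as successive mapper calls might.
first, second = data[:5], data[5:]

# Stateless: every call starts over in US-ASCII, so the JIS X 0208 octets
# in the second chunk are misread as ordinary ASCII characters.
stateless = first.decode("iso2022_jp") + second.decode("iso2022_jp")

# Stateful: the incremental decoder remembers the active character set
# between calls.
decoder = codecs.getincrementaldecoder("iso2022_jp")()
stateful = decoder.decode(first) + decoder.decode(second, final=True)

print(stateless == text, stateful == text)
```

Only the decoder that keeps state across calls recovers the original text;
the stateless one silently produces ASCII garbage for the second chunk.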

The other potential issue is text/html content encoded in iso-2022-jp.  The
JIS X 0208 octets fall in the same 0x21..0x7E range as US-ASCII, so they
definitely overlap with the octets used by HTML markup: < > & and others.
I suppose that text/html iso-2022-jp always shifts back to US-ASCII before
introducing each < > markup tag.  Even with that, this is going to cause
problems for SqWebMail's HTML defanger, which eats HTML markup tags in
their raw form.
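
The overlap can be sketched the same way.  The Python below is a
hypothetical illustration, not the defanger's actual logic: markup_offsets
and DESIGNATIONS are names made up for this example, and the only escape
sequences it knows are the four listed in the quoted message.  It reports a
0x3C octet as a possible tag start only while a single-octet set is active.

```python
ESC = 0x1B

# Octets per character for each escape sequence from the quoted message,
# keyed by the two octets that follow ESC.
DESIGNATIONS = {
    b"(B": 1,  # US-ASCII
    b"(J": 1,  # JIS X 0201 Roman
    b"$@": 2,  # JIS C 6226
    b"$B": 2,  # JIS X 0208
}

def markup_offsets(data):
    """Return offsets of '<' octets seen while a single-octet set is active."""
    width = 1  # ISO-2022-JP text starts in US-ASCII
    offsets = []
    i = 0
    while i < len(data):
        if data[i] == ESC and data[i + 1:i + 3] in DESIGNATIONS:
            width = DESIGNATIONS[data[i + 1:i + 3]]
            i += 3
            continue
        if width == 1 and data[i] == 0x3C:
            offsets.append(i)
        i += width
    return offsets

# "<b>", one JIS X 0208 character whose two octets are both 0x3C, "</b>".
sample = b"<b>" + b"\x1b$B" + b"<<" + b"\x1b(B" + b"</b>"
naive = sample.count(b"<")        # counts the kanji octets as well
aware = markup_offsets(sample)    # only the two real tag openers
```

A scan of the raw octets sees four < characters in this sample, while the
state-aware scan sees only the two that actually begin tags.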


-- 
Sam 
