Yuan HOng wrote:
> Unfortunately they are not completely equivalent :-(
>
> I have to use the encoding 'gbk' for the output, which doesn't have a
> corresponding character for \xa9. Trying to convert will raise an
> exception. Using encoding 'gbk' in the serialize function will
> truncate the simple and all Chinese characters after '2007'.

Ok, I am starting to understand. The copyright symbol is not part of your character set gbk (not used that much in China, it seems ;-). It is only part of latin-1 (and thus Unicode). But if you write &copy;, the browser will display it as a Unicode character anyway. So it is a way for you to use Unicode characters although the document itself is not using a Unicode charset. This will probably work with all modern browsers; the browser will try to switch between fonts automatically.

The problem with this is that you are exploiting a "convenience feature" of modern browsers that clashes with Kid's implicit assumption that one output page has only one encoding. I need to think about this a little more; maybe in the next version I will add an optional feature to Kid's HTML serializer to use HTML entities for all Unicode characters that are not part of the output encoding, instead of simply raising an exception.

> I tried using encoding='utf8' and then convert the result to gbk. But
> with 'utf8' encoding the format='named' argument doesn't seem to be
> working and I got \xa0 for &nbsp;, which also is not convertible to
> 'gbk'

Yes, since if you are using utf8 as output encoding you don't need entities such as &copy; or &nbsp;; you can simply output the characters themselves (\xa9, \xa0). Of course, you can't then convert the result to gbk, because gbk has no such characters. You just need to output utf8 to the browser, and all these problems disappear. That would be a workaround. Are there any reasons you are using gbk instead of utf8? (I assume there are, since I just checked that not even Google.cn is using utf8.)

> And another question is why are &copy; and &amp; handled differently?

They are different in that & is ASCII while © is not, and also in that the former is a special character in HTML/XML that needs to be escaped, while the latter has no special meaning in XML.

-- Christoph
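
To make the entity-fallback idea concrete, here is a minimal sketch in modern Python 3 using only the standard library. It is not Kid's serializer and says nothing about how Kid would implement such a feature. The built-in `xmlcharrefreplace` error handler writes numeric character references for anything the target encoding lacks; the handler name `htmlentityreplace` and the function `named_entity_replace` are hypothetical names made up for this illustration, and the sample string is invented.

```python
import codecs
from html.entities import codepoint2name


def named_entity_replace(error):
    """Codec error handler: use &name; where HTML defines one, else &#NNN;."""
    if not isinstance(error, UnicodeEncodeError):
        raise error
    # The slice holds the character(s) the target encoding cannot represent.
    bad = error.object[error.start:error.end]
    replacement = "".join(
        "&%s;" % codepoint2name[ord(ch)] if ord(ch) in codepoint2name
        else "&#%d;" % ord(ch)
        for ch in bad
    )
    # Return the replacement text and the position at which to resume encoding.
    return replacement, error.end


# "htmlentityreplace" is a hypothetical handler name chosen for this sketch.
codecs.register_error("htmlentityreplace", named_entity_replace)

# Invented sample: Chinese text, a copyright sign (\xa9) and a no-break space (\xa0).
text = "\u7248\u6743\u6240\u6709 \xa9 2007\xa0\u516c\u53f8"

# Built-in handler: numeric character references for unencodable characters.
numeric = text.encode("gbk", errors="xmlcharrefreplace")

# Custom handler: named entities where available.
named = text.encode("gbk", errors="htmlentityreplace")

print(numeric.decode("gbk"))
print(named.decode("gbk"))
```

Both results are valid gbk byte strings, so the Chinese text stays in the page's declared encoding and only the characters gbk lacks are turned into entities. With the default strict handler, the same `encode` call raises `UnicodeEncodeError`.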
_______________________________________________
kid-template-discuss mailing list
kid-template-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kid-template-discuss