I'm having a problem with character set encoding in .NET. I have an app
that gets XML from a vendor, with CDATA sections that can contain text in
several languages (Japanese, Korean, Chinese). Their XML is encoded as
ISO-8859-1. In the COM world I would just load this document with MSXML,
parse it, and display a formatted HTML page. The user could then change the
encoding in their browser to correctly render whichever language they chose.
In .NET, however, I have to specify an encoding when I read the HTTP response:
    StreamReader sr = new StreamReader(httpresp.GetResponseStream(),
        Encoding.GetEncoding("Shift_Jis"));
With the code above, Japanese text displays correctly. If the document
contains Korean text, however, the CDATA sections get mangled (the
terminator loses a closing bracket) and the .NET XML parser can't load the
document. I have tried all kinds of encoding options when reading the HTTP
response to duplicate the way it behaves with COM/ASP, to no avail.

What I would like is for the user to continue to be able to change the
encoding in the browser as they see fit. I can't make that work: no matter
which encoding I choose when reading the response, the characters don't
display correctly when the encoding is changed in the browser. Failing that,
I would like the Japanese text to display correctly using Shift_Jis, as in
the line of code above, while still handling the case where Korean text
munges the CDATA terminator.
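
To make the goal concrete, here is a minimal sketch of the byte-preserving
behavior I'm after. It rests on an assumption I have not confirmed with the
vendor: that the feed carries raw multi-byte text that is merely labelled
ISO-8859-1. The names Latin1RoundTrip and RoundTrip are made up for
illustration, and the sample bytes are illustrative rather than taken from the
real feed. Decoding with ISO-8859-1 maps every byte to the code point of the
same value, so nothing is merged or dropped. By contrast, 0xEE falls in the
Shift_JIS lead-byte range, so a Shift_JIS decoder can pair it with the
following 0x5D (']'), which I suspect is how the bracket gets lost.

```csharp
using System;
using System.Text;

static class Latin1RoundTrip
{
    // Hypothetical helper: decode and re-encode using ISO-8859-1 only.
    public static byte[] RoundTrip(byte[] raw)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string text = latin1.GetString(raw); // each byte becomes U+0000..U+00FF
        return latin1.GetBytes(text);        // and maps back to the same byte
    }

    static void Main()
    {
        // "<![CDATA[" + two high bytes standing in for Korean text + "]]>"
        byte[] raw =
        {
            0x3C, 0x21, 0x5B, 0x43, 0x44, 0x41, 0x54, 0x41, 0x5B,
            0xBE, 0xEE, // 0xEE is a Shift_JIS lead byte; 0x5D is a valid trail byte
            0x5D, 0x5D, 0x3E
        };

        byte[] back = RoundTrip(raw);

        // Compare the hex dumps to confirm the bytes survived unchanged.
        Console.WriteLine(BitConverter.ToString(raw) == BitConverter.ToString(back));
    }
}
```

If this holds, the remaining piece would be writing the page back out as
ISO-8859-1 as well, so that the browser receives the original bytes and the
user can pick the encoding there.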
Any help would be greatly appreciated.
===================================
This list is hosted by DevelopMentor http://www.develop.com