Re: Unicode in the DOM ?

James Ross Fri, 11 Nov 2005 18:41:30 -0800

Boris Zbarsky wrote:

Emile Kroeger wrote:
I'd like to access text from the webpage in unicode format. However,
when I parse the DOM tree of the browser content
I'm not sure what you mean. Parsing is the process that turns a stream of characters into a DOM tree. You don't parse the DOM tree.
All data in the DOM in Mozilla is stored either in UTF16, UTF8, or ASCII (depending on the exact piece of data).
All data in JavaScript is in UCS-2.
So what are you doing exactly, to get something in a different encoding than one of those? ;)

Sounds like XMLHttpRequest to me, which does some *really* dodgy character encoding stuff.

Emile, if you are using XMLHttpRequest, and you absolutely know the character encoding the data is in (and its MIME type!), you can do this just after calling open():


request.overrideMimeType("mime/type; charset=utf-8");

(assuming the data is in UTF-8 - use whatever charset is appropriate)
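
For context, here is a minimal sketch of where that call sits in a complete request. The function name fetchTextWithCharset, the optional createRequest hook, and the "data.txt" URL are made up for illustration; only open(), overrideMimeType(), onreadystatechange and send() come from XMLHttpRequest itself, and overrideMimeType() is assumed to be available (it is in Mozilla).

```javascript
// Sketch only: fetchTextWithCharset and createRequest are hypothetical names.
// createRequest is an optional factory so a stand-in object can be used
// where XMLHttpRequest itself is not available.
function fetchTextWithCharset(url, mimeType, charset, onDone, createRequest) {
  var request = createRequest ? createRequest() : new XMLHttpRequest();
  request.open("GET", url, true);

  // Must come after open() and before send(): tells the browser how to
  // decode the body, regardless of what the server declared.
  request.overrideMimeType(mimeType + "; charset=" + charset);

  request.onreadystatechange = function () {
    if (request.readyState === 4) {
      // responseText has now been decoded using the overridden charset.
      onDone(request.status, request.responseText);
    }
  };
  request.send(null);
  return request;
}

// Possible use in a browser (URL is a placeholder):
// fetchTextWithCharset("data.txt", "text/plain", "utf-8",
//   function (status, text) { alert(status + ": " + text); });
```

The override only changes how the browser decodes the bytes it receives, so if the charset you pass is wrong, responseText will still come out garbled.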

Of course, the *best* way to fix it is to make the server send the character encoding with the content...
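
To check whether the server is already doing that, one option is to look at the Content-Type response header. A rough sketch follows; charsetFromContentType is a hypothetical helper and the pattern is a simplification, not a full header parser.

```javascript
// Sketch only: charsetFromContentType is a hypothetical helper name.
// Returns the charset parameter of a Content-Type header value in
// lowercase, or null if the header is missing or declares no charset.
function charsetFromContentType(headerValue) {
  if (!headerValue) {
    return null;
  }
  // Matches charset=utf-8 as well as charset="utf-8".
  var match = /charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1].toLowerCase() : null;
}
```

Once the request has completed, charsetFromContentType(request.getResponseHeader("Content-Type")) returning null would suggest the server sent no charset, which is the case where overrideMimeType() is needed.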


--
James Ross <[EMAIL PROTECTED]>
ChatZilla Developer
_______________________________________________
mozilla-layout mailing list
[email protected]
http://mail.mozilla.org/listinfo/mozilla-layout
