Hello, I'm working on a Firefox extension, and I'm not entirely sure this is the best place to post, but:
I'd like to access the text of a web page as Unicode. However, when I walk the DOM tree of the browser content, the text I get is in some other encoding. Which one seems to depend on the Character Encoding I select in the browser, but it never appears to be Unicode.

There are some XPCOM charset conversion utilities that would, in theory, do the job, but they require knowing the charset the page uses (which can probably be retrieved, but sounds like a pain). I have read that Gecko's internal representation is Unicode, and accessing that directly would be neat.

Does Gecko then create the "content" DOM tree in a more specific encoding for cleaner display? Is there a way to access the Unicode representation Gecko creates?

(Please forgive my approximate terminology; I'm still very much a newbie to all this. Feel free to correct me.)

Thank you,
Emile
