Is the relevant part of HTML sufficient to refer to? http://dev.w3.org/html5/spec/Overview.html#utf-8
That section covers UTF-8 octets -> Unicode code points. UTF-16 -> UTF-8 is a different conversion. You want the algorithm in Web IDL that takes a DOMString and gives you Unicode, and then from Unicode you go to UTF-8. That is, if you want the conversion to never fail and never generate "broken" UTF-8.
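
A rough sketch of those two steps, in Python for illustration only. The function names are made up, and replacing unpaired surrogates with U+FFFD is my reading of what the Web IDL conversion does, so check the spec text before relying on it. A DOMString is modelled here as a plain list of 16-bit code units.

```python
def utf16_units_to_scalar_values(units):
    """Step 1: 16-bit code units -> Unicode scalar values.

    Assumption: an unpaired surrogate becomes U+FFFD, so the result
    never contains a surrogate and can always be encoded as UTF-8.
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        c = units[i]
        is_high = 0xD800 <= c <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # Well-formed surrogate pair: combine into one code point.
            out.append(0x10000 + ((c - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif 0xD800 <= c <= 0xDFFF:
            # Lone high or low surrogate: substitute U+FFFD.
            out.append(0xFFFD)
            i += 1
        else:
            # Ordinary BMP code unit maps to itself.
            out.append(c)
            i += 1
    return out


def scalar_values_to_utf8(scalars):
    """Step 2: Unicode scalar values -> UTF-8 octets."""
    return "".join(chr(s) for s in scalars).encode("utf-8")


# "A", a valid surrogate pair (U+1F600), a lone high surrogate, "B"
units = [0x0041, 0xD83D, 0xDE00, 0xD800, 0x0042]
octets = scalar_values_to_utf8(utf16_units_to_scalar_values(units))
```

Because step 1 removes every surrogate, step 2 has nothing it can fail on, which is the "never fail, never broken" property mentioned above.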
-- Anne van Kesteren http://annevankesteren.nl/
