Yes, this is smart, especially for its exact mapping to Base64, where it will even be superior to UTF-8 in many more cases (there should be a comparison table of sizes between UTF-8, UTF-16 and UTF-12 in the Base64 transport encoding).
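To illustrate the exact mapping mentioned above: a 12-bit unit splits into two 6-bit halves, and each half selects one character of the 64-character Base64 alphabet. The following is only a minimal sketch under stated assumptions. It takes arbitrary 12-bit integers as input and does not implement the UTF-12 encoding itself, which is defined on the linked page. The function names `units12_to_base64` and `base64_to_units12` are hypothetical, and the alphabet is the standard one from RFC 4648.

```python
# Standard Base64 alphabet (RFC 4648); a URL-safe variant would swap the last two.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def units12_to_base64(units):
    """Map each 12-bit unit to exactly two Base64 characters (hypothetical helper)."""
    out = []
    for u in units:
        if not 0 <= u < (1 << 12):
            raise ValueError(f"not a 12-bit unit: {u!r}")
        out.append(B64_ALPHABET[u >> 6])    # high 6 bits
        out.append(B64_ALPHABET[u & 0x3F])  # low 6 bits
    return "".join(out)


def base64_to_units12(text):
    """Inverse mapping: every pair of Base64 characters gives one 12-bit unit."""
    if len(text) % 2 != 0:
        raise ValueError("expected an even number of Base64 characters")
    units = []
    for i in range(0, len(text), 2):
        hi = B64_ALPHABET.index(text[i])
        lo = B64_ALPHABET.index(text[i + 1])
        units.append((hi << 6) | lo)
    return units


if __name__ == "__main__":
    sample = [0x000, 0xFFF, 0x041]  # arbitrary 12-bit values, not real UTF-12 output
    encoded = units12_to_base64(sample)
    print(encoded)
    print(base64_to_units12(encoded) == sample)
```

Because every unit yields exactly two characters, this character-level mapping never emits the '=' padding sign. How a strict byte-oriented Base64 decoder treats an odd number of units is outside the scope of this sketch.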
You should also add, somewhere in the last section of your web document, that Base64 is well suited not only to 7-bit-only environments but also to many 7-bit and 8-bit environments that require MIME compatibility for controls and spaces (notably in emails). After all, Base64 was first designed and standardized exactly for that purpose.

All the Base64 variants described in the summary table at http://en.wikipedia.org/wiki/Base64#Variants will also be usable in the query string appended to URLs, even though HTML form data submitted in Base64 with the equivalent (default) GET method (or with the specified POST method) should only use one of these two variants:

- 'Base64' encoding as standardized in RFC 3548 or RFC 4648 (with the explicit HTML form element attributes encoding="base64" and method="post")
- Modified Base64 encoding for URL applications (with the explicit HTML form element attributes encoding="base64url" and the default method="get")

This applies to:

- all URL query parameters in a query string, which are enumerated and separated by ampersands (&) and then represented as name=value pairs or just as unnamed values (there will be no conflict with the Base64 variants that use the equal sign for padding, given that no Base64 padding is necessary when transporting UTF-12 encoded texts)
- as well as the other Base64 variants for filenames, XML Names, XML NmTokens, or program identifiers.

One more question: your page is copyrighted and signed by you (with your email address as the contact). This is absolutely not a problem (in fact it is good practice for all publications on the web), but there does not seem to be any proposed licence on your page, so the only way to get one would be to contact you via your displayed email address.
Can this specification page be licenced by you in an open or free way on this page, possibly dual-licenced under Creative Commons (CC-BY-SA: author's attribution required, share-alike) or LGPL (because it describes an algorithm, comparable to library source code that will then be freely modifiable and implementable)?

-- Philippe.

On 2010-06-21 at 19:00 CEST, "Andrey V. Lukyanov" <[email protected]> wrote:
> As you might guess, UTF-12 is a system for representing Unicode
> characters with a stream of 12-bit units. It was invented recently by
> me.
>
> Full description is here:
>
> http://tapemark.narod.ru/comp/utf12en.html
>
> UTF-12 may be of little use in practice, but it is very nice from the
> theoretical point of view.

