A bughunter via Unicode <[email protected]> writes:

Generally we put the standard into a computer language such as C.
Therefore the Unicode V.16 standard of UTF-8 should also be the
source code of the implementation; these converge, making them
synonymous at the convergence.

UTF-8 is an _encoding_ of Unicode: a specification of how to represent Unicode code points as sequences of bytes. An _encoding_ is something different from _source code_. Source code is programming-language text that gets interpreted, or compiled into machine code. UTF-8 is not a programming language. It's a way of saying "This Unicode code point is encoded in UTF-8 with the following bit pattern." If you'd like an introduction to how Unicode code points, like code point 65 (U+0041) for 'A', are encoded by UTF-8, you might find this section of the relevant Wikipedia page helpful:

 https://en.wikipedia.org/wiki/UTF-8#Description
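
To make the "bit pattern" idea concrete, here's a minimal sketch in Python. The language and the names `utf8_encode` and `bit_pattern` are my own choices for illustration; this is just one of many possible implementations of the encoding rules, not anything official:

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative sketch)."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point <= 0x7F:
        # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def bit_pattern(data: bytes) -> str:
    """Show each byte as eight binary digits."""
    return " ".join(f"{b:08b}" for b in data)


for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(f"U+{cp:04X} -> {bit_pattern(utf8_encode(cp))}")
```

The first line printed is `U+0041 -> 01000001`: code point 65 fits in a single byte. The same rules could equally be written in C or any other language, and the resulting bytes would be identical.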

There is no piece of software that's the 'reference implementation' of UTF-8, because UTF-8 is not a specification of e.g. a software library providing certain functionality. Again, UTF-8 is an algorithm for representing Unicode code points at the bit level. Any number of programs can implement that algorithm, and programming languages typically provide functionality for converting to and from UTF-8.
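
As one example of such functionality, here's a short sketch using Python's built-in `str.encode` and `bytes.decode`. Python is just the language I picked for illustration, and the variable names are mine; other languages have their own equivalents:

```python
text = "A\u00e9\u20ac"             # 'A', 'é', '€'

encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(encoded.hex(" "))            # the UTF-8 bytes, in hexadecimal
print(decoded == text)             # the round trip gives back the same text

# A byte sequence that doesn't follow the UTF-8 rules is rejected.
try:
    b"\xc0\x80".decode("utf-8")
except UnicodeDecodeError as err:
    print("invalid UTF-8:", err.reason)
```

The language's implementation and the encoding are separate things: the implementation is source code, while UTF-8 is the set of rules it follows.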

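It's _Unicode_ that has versions; UTF-8 basically does not. A new Unicode version such as 16.0 mainly adds characters; the way a code point is turned into UTF-8 bytes stays the same.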


Alexis.
