Re: get the sourcecode [of UTF-8]

Alexis via Unicode Tue, 05 Nov 2024 05:30:19 -0800

A bughunter via Unicode <[email protected]> writes:

Generally we put
the standard into a computer language such as C. Therefore the Unicode
V.16 standard of UTF-8 should also be the sourcecode of the
implementation these converge making them synonymous at the
convergence.

UTF-8 is an _encoding_ of Unicode, a specification of how to represent Unicode at the bit level. An _encoding_ is something different from _source code_. Source code is programming language text that gets translated - interpreted or compiled - into machine language. UTF-8 is not a programming language. It's a way of saying "This Unicode code point is encoded in UTF-8 with the following bit pattern." If you'd like an introduction to how Unicode code points - like code point 65 for 'A' - are encoded by UTF-8, you might find this section of the relevant Wikipedia page helpful:


 https://en.wikipedia.org/wiki/UTF-8#Description
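
To make the "code point in, bit pattern out" idea concrete, here is a minimal sketch in Python of the byte layout that section describes. The function name encode_utf8 is made up for this example and is not part of any library or of the Unicode Standard; it's one possible way to write the rules down, not a reference implementation.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative sketch)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are not encodable in UTF-8")

    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),
            0x80 | (code_point & 0x3F),
        ])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (code_point >> 18),
        0x80 | ((code_point >> 12) & 0x3F),
        0x80 | ((code_point >> 6) & 0x3F),
        0x80 | (code_point & 0x3F),
    ])


# Code point 65 ('A') fits in one byte; U+20AC (euro sign) needs three.
print(encode_utf8(65).hex())      # 41
print(encode_utf8(0x20AC).hex())  # e282ac
```

Note that the sketch is source code and the byte layout in its comments is the encoding: many different programs could produce the same bytes.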

There is no piece of software that's the 'reference implementation' of UTF-8, because UTF-8 is not a specification for e.g. a software library providing certain functionality: again, UTF-8 is an algorithm for representing Unicode code points at the bit level. Programming languages provide functionality for converting to and from UTF-8.
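
As one example of such built-in functionality, Python's str.encode and bytes.decode convert to and from UTF-8. This is just a small sketch using Python; other languages offer their own equivalents, and the variable names here are arbitrary.

```python
# A string containing 'A' (code point 65) and the euro sign (U+20AC).
text = "A\u20ac"

# str.encode turns code points into UTF-8 bytes.
encoded = text.encode("utf-8")

# bytes.decode turns UTF-8 bytes back into code points.
decoded = encoded.decode("utf-8")

print(encoded.hex())    # 41e282ac
print(decoded == text)  # True
```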


It's _Unicode_ that has versions; UTF-8 basically does not.


Alexis.
