Re: A neat description of encoding characters

James Tauber via Unicode Mon, 02 Dec 2019 08:03:22 -0800

Indeed.

Unicode separates: (1) selecting a character repertoire; (2) assigning each
character a numerical character code; (3) choosing an encoding form to
represent those character codes as code units (made up of bytes).
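
To make the separation concrete, here is a minimal Python sketch (my own
illustration, not anything from the book or the standard's text). The sample
character and the variable names are arbitrary choices; only the built-in
ord() and str.encode() are used.

```python
# Illustrative sketch: one character traced through the three steps.
ch = "海"                        # (1) a character in the repertoire

code_point = ord(ch)             # (2) its character code, an integer
# The integer is the same whatever base it is written in.
assert code_point == 0x6D77 == 28023 == 0b110110101110111

# (3) encoding forms turn that integer into code units
utf8 = ch.encode("utf-8")        # three 8-bit code units
utf16 = ch.encode("utf-16-be")   # one 16-bit code unit (two bytes)
utf32 = ch.encode("utf-32-be")   # one 32-bit code unit (four bytes)

print(f"U+{code_point:04X}", utf8.hex(), utf16.hex(), utf32.hex())
```

The code point stays fixed while the byte sequences differ, which is why
steps (2) and (3) are worth keeping apart.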


(2) and (3) are not conflated.

James


On Mon, Dec 2, 2019 at 9:54 AM 梁海 Liang Hai via Unicode <unicode@unicode.org>
wrote:

> Grrr… It’s an okayish analog for binary numbers, but not really relevant
> to character encoding. Encoded characters are simply assigned integers,
> which could in turn be represented in any base.
>
> The binary way computers store numbers does not have much to do with how
> character encoding works, unless you really want to start explaining
> character encoding from ideas as basic as “What is electricity?”,
> “What is a computer?”, …
>
> Best,
> 梁海 Liang Hai
> https://lianghai.github.io
>
> > On Dec 2, 2019, at 20:01, Costello, Roger L. via Unicode <
> unicode@unicode.org> wrote:
> >
> > From the book titled "Computer Power and Human Reason" by Joseph
> Weizenbaum, p.74-75
> >
> > Suppose that the alphabet with which we wish to concern ourselves
> consists of 256 distinct symbols. Imagine that we have a deck of 256 cards,
> each of which has a distinct symbol of our alphabet printed on it, and, of
> course, such that there corresponds one card to each symbol. How many
> questions that can be answered "yes" or "no" would one have to ask, given
> one card randomly selected from the deck, in order to be able to decide
> which character is printed on the card? We can certainly make the decision
> by asking at most 256 questions. We can somehow order the symbols and begin
> by asking if it is the first in our ordering, e.g., "Is it an uppercase A?"
> If the answer is "no," then we ask if it is the second, and so on. But if
> our ordering is known both to ourselves and to our respondent, there is a
> much more economical way of organizing our questioning. We ask whether the
> character we are seeking is in the first half of the set. Whatever the
> answer, we will have isolated a set of 128 characters among which the
> character we seek resides. We again ask
> whether it is in the first half of that smaller set, and so on. Proceeding
> in this way, we are bound to discover what character is printed on the
> selected card by asking exactly eight questions. We could have recorded the
> answers we received to our questions by writing "1" whenever the answer was
> "yes" and "0" whenever it was "no." That record would then consist of eight
> so-called bits each of which is either "1" or "0". This eight-bit string is
> then an unambiguous representation of the character we are seeking.
> Moreover, each character of the whole set has a unique eight-bit
> representation within the same ordering.
> >
>
>
>
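
For anyone who wants to try the quoted halving procedure, a rough Python
sketch follows. The function names ask_questions and decode are hypothetical,
and the ordering (the first 256 code points) is just an assumption made for
the demonstration; the passage itself leaves the ordering open.

```python
def ask_questions(symbols, target):
    """Halve the candidate range until one symbol is left.

    Records "1" for each "yes, it is in the first half" and "0" for "no".
    """
    position = symbols.index(target)   # known only to the respondent
    lo, hi = 0, len(symbols)
    answers = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if position < mid:             # "Is it in the first half?" -> yes
            answers.append("1")
            hi = mid
        else:                          # -> no
            answers.append("0")
            lo = mid
    return "".join(answers)


def decode(symbols, answers):
    """Replay the recorded answers to recover the symbol."""
    lo, hi = 0, len(symbols)
    for bit in answers:
        mid = (lo + hi) // 2
        if bit == "1":
            hi = mid
        else:
            lo = mid
    return symbols[lo]


# Assumed ordering for the demo: code points 0..255.
symbols = [chr(i) for i in range(256)]
record = ask_questions(symbols, "A")
print(record, decode(symbols, record))
```

Each symbol gets its own eight-bit record, but the record depends entirely
on the agreed ordering. The sketch picks one ordering; it says nothing about
which characters belong in the set or which integers they are assigned.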

-- 
*James Tauber*
Eldarion <https://eldarion.com/> | Scaife Viewer
<https://scaife-viewer.org/> | jktauber.com (Greek Linguistics)
<https://jktauber.com/> | Modelling Music
<https://modelling-music.com/> | Digital
Tolkien <https://digitaltolkien.com/>
Subscribe to my email newsletter <https://buttondown.email/jtauber>!
