Indeed. Unicode separates: (1) selecting a character repertoire; (2) assigning each character a numerical character code; (3) choosing an encoding form to represent those character codes as code units (made up of bytes).
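A minimal sketch of that separation, in Python. The helper names (`code_units`, `describe`) and the sample characters are illustrative assumptions, not something from the thread. It shows that step (2) yields a single integer per character, while step (3) turns that same integer into different sequences of code units depending on the encoding form chosen.

```python
def code_units(ch: str, encoding: str, width: int) -> list[int]:
    """Split the encoded bytes of `ch` into code units of `width` bytes each."""
    raw = ch.encode(encoding)
    return [int.from_bytes(raw[i:i + width], "big") for i in range(0, len(raw), width)]


def describe(ch: str) -> dict:
    """Return the character code and its code units in three encoding forms."""
    return {
        # (2) the character code: just an integer, independent of any base
        "code_point": ord(ch),
        # (3) encoding forms: the same integer expressed as code units
        "utf8": code_units(ch, "utf-8", 1),       # 8-bit code units
        "utf16": code_units(ch, "utf-16-be", 2),  # 16-bit code units
        "utf32": code_units(ch, "utf-32-be", 4),  # 32-bit code units
    }


if __name__ == "__main__":
    # Sample characters chosen for illustration only.
    for ch in ("A", "\u6881", "\U0001F600"):
        info = describe(ch)
        print(f"U+{info['code_point']:04X}")
        for form in ("utf8", "utf16", "utf32"):
            print("  ", form, [hex(u) for u in info[form]])
```

The code point stays the same throughout; only the code units change with the encoding form, which is the sense in which the two steps are kept apart.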
(2) and (3) are not conflated.

James

On Mon, Dec 2, 2019 at 9:54 AM 梁海 Liang Hai via Unicode <unicode@unicode.org> wrote:

> Grrr… It’s an okayish analog for binary numbers, but not really relevant
> to character encoding. Encoded characters are just assigned integers,
> which could in turn be represented in any base.
>
> The binary nature of computers’ way of storing numbers does not have much
> to do with how character encoding works, unless you really want to start
> explaining character encoding with such basic ideas as “What is
> electricity?”, “What is a computer?”, …
>
> Best,
> 梁海 Liang Hai
> https://lianghai.github.io
>
> > On Dec 2, 2019, at 20:01, Costello, Roger L. via Unicode
> > <unicode@unicode.org> wrote:
> >
> > From the book titled "Computer Power and Human Reason" by Joseph
> > Weizenbaum, p.74-75
> >
> > Suppose that the alphabet with which we wish to concern ourselves
> > consists of 256 distinct symbols. Imagine that we have a deck of 256
> > cards, each of which has a distinct symbol of our alphabet printed on
> > it, and, of course, such that there corresponds one card to each
> > symbol. How many questions that can be answered "yes" or "no" would one
> > have to ask, given one card randomly selected from the deck, in order
> > to be able to decide which character is printed on the card? We can
> > certainly make the decision by asking at most 256 questions. We can
> > somehow order the symbols and begin by asking if it is the first in our
> > ordering, e.g., "Is it an uppercase A?" If the answer is "no," then we
> > ask if it is the second, and so on. But if our ordering is known both
> > to ourselves and to our respondent, there is a much more economical way
> > of organizing our questioning. We ask whether the character we are
> > seeking is in the first half of the set. Whatever the answer, we will
> > have isolated a set of 128 characters among which the character we seek
> > resides. We again ask whether it is in the first half of that smaller
> > set, and so on. Proceeding in this way, we are bound to discover what
> > character is printed on the selected card by asking exactly eight
> > questions. We could have recorded the answers we received to our
> > questions by writing "1" whenever the answer was "yes" and "0" whenever
> > it was "no." That record would then consist of eight so-called bits
> > each of which is either "1" or "0". This eight-bit string is then an
> > unambiguous representation of the character we are seeking. Moreover,
> > each character of the whole set has a unique eight-bit representation
> > within the same ordering.

--
*James Tauber*
Eldarion <https://eldarion.com/> | Scaife Viewer <https://scaife-viewer.org/> | jktauber.com (Greek Linguistics) <https://jktauber.com/> | Modelling Music <https://modelling-music.com/> | Digital Tolkien <https://digitaltolkien.com/>

Subscribe to my email newsletter <https://buttondown.email/jtauber>!