On 5/26/2013 3:15 PM, David Starner wrote:
On Sun, May 26, 2013 at 12:40 PM, Andreas Stötzner <[email protected]> wrote:
One of the bodies in the world still ignorant of this fact to the very day
is Unicode. Which I feel is a mess.
Problems from Unicode generally come from one of two places: compatibility
with non-Unicode data sets, and people with different goals working on
it.

Excellent insight.

However, both come with the territory of designing a "universal" character encoding.

With a mandate like that, it's difficult to leave any significant user population behind, which forces you both to include the superset of what went before and to encompass people with overlapping, but partially divergent, goals.

Unicode has some characteristics that emerged and took on added importance over time. These include a desire for longevity and stability, which, among other things, require that characters, once admitted, be carried along forever - and that implies that one must be leery of anything that hasn't "stood the test of time".

Characters fall out of use in the real world all the time, but the ideal for Unicode is to include primarily those that have an ongoing use in archiving and historical study, which in the digital universe might include anything used on a wide enough scale.

I sympathize with Andreas' take that the nature and development of modern pictographic writing are rather less well understood than they deserve, and that decisions about encoding are therefore made in partial ignorance of the facts.

Solid scholarly study of the use of signs, symbols and pictographs might help - except that there seem to be no scholars who tackle these from an angle that would ultimately be useful for encoding. I don't believe that is merely a funding problem, but something more fundamental.

A./

PS: German uses the same term "wissenschaftlich" for both scientific and scholarly approaches to knowledge. There are prefixes you can use to narrow things down, but in context, they are often dropped. This, in turn, can lead to confusion because the wrong choice can be made in translation. I don't think there's a natural science of character encoding, and I don't believe that Andreas was really claiming that. Still, there are ways of rigorously studying the phenomenon, an activity that would be considered scholarship.

