Re: Encoding italic

2019-01-19 Thread Richard Wordingham via Unicode
On Sun, 20 Jan 2019 03:14:21 +
James Kass via Unicode  wrote:

> (In the event that a persuasive proposal presentation prompts the 
> possibility of italics encoding...)

The use of italics isn't restricted to the Latin script, and even the
Latin script includes base characters not supported by the mathematical
sets for variables.  It isn't hard to find their sober use in Thai - I
found it in the first Thai magazine I flipped through, where italics
were being used for quotations and for the names of publications, both
Thai and English-language titles.

> Possible approaches include:
> 
> 1 - Liberating the italics from the Members Only Math Club

Doesn't help with Thai.

> 2 - Character level

Works with Thai.

> 3 - Open/Close punctuation treatment

Works with Thai.

> 4 - Leave it alone

No change.

Richard.


Re: Encoding italic

2019-01-19 Thread Richard Wordingham via Unicode
On Fri, 18 Jan 2019 10:51:18 -0500
"Mark E. Shoulson via Unicode"  wrote:

> On 1/16/19 6:23 AM, Victor Gaultney via Unicode wrote:
> >
> > Encoding 'begin italic' and 'end italic' would introduce
> > difficulties when partial strings are moved, etc. But that's no
> > different than with current punctuation. If you select the second
> > half of a string that includes an end quote character you end up
> > with a mismatched pair, with the same problems of interpretation as
> > selecting the second half of a string including an 'end italic'
> > character. Apps have to deal with it, and do, as in code editors.
> >  
> It kinda IS different.  If you paste in half a string, you get a 
> mismatched or unmatched paren or quote or something.  A typo, but a 
> transient one.  It looks bad where it is, but everything else is 
> unaffected.  It's no worse than hitting an extra key by mistake. If
> you paste in a "begin italic" and miss the "end italic", though, then
> *all* your text from that point on is affected!  (Or maybe "all until
> a newline" or some other stopgap ending, but that's just
> damage-control, not damage-prevention.)  Suddenly, letters and
> symbols five words/lines/paragraphs/pages away look different, the
> pagination is all altered (by far more than merely a single extra
> punctuation mark, since italic fonts generally are narrower than
> roman).  It's a disaster.

The problem is worst when you have a small amount of italicisable text
scattered within unitalicisable text.  Unlike the case with bidi
controls, the text usually remains intelligible with some work, and
one can generally see where the missing italic control should go.
However, damage limitation is desirable - I would suggest cancelling
effects at the end of a paragraph, as with bidi controls.  On the other
hand, the corresponding stateful ISCII character settings (for font
effects and script) end at the end of a line, which may be a
finer-grained boundary.
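
A minimal sketch of that damage-limitation rule, with stand-in code
points, since no such italic controls are actually encoded:

BEGIN_ITALIC = "\uE000"  # stand-in; no such character exists
END_ITALIC   = "\uE001"  # stand-in

def italic_runs(paragraph):
    # Yield (text, is_italic) runs for a single paragraph.
    italic, run = False, []
    for ch in paragraph:
        if ch in (BEGIN_ITALIC, END_ITALIC):
            if run:
                yield "".join(run), italic
            run, italic = [], ch == BEGIN_ITALIC
        else:
            run.append(ch)
    if run:
        yield "".join(run), italic

def resolve(text):
    # Resolving paragraph by paragraph means a missing END ITALIC
    # can never leak past the end of its own paragraph.
    return [list(italic_runs(p)) for p in text.split("\n")]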

There are several stateful control characters for Arabic, mostly
affecting numbers.  However, as far as I can see, their effect is
limited to one word (typically a string of digits).  That seems too
limited for italics, though it would be reasonable for switching
between Antiqua and black letter.

One minor problem with the stateful encoding, which seems to be in the
original spirit of ISO 10646, is that redundant instances of the
italic controls would build up in heavily edited text.  I see the same
effect with ZWSP when I don't have a display mode that shows it.  One
solution would be tricks such as giving "start italic" a visible glyph
when it occurs while italic mode is already in effect, wherever the
contrast between italic and non-italic mode is displayed.  I don't
believe italicity should be nested.  However, such a build-up is a
very minor problem.

Richard.



Re: Encoding italic (was: A last missing link)

2019-01-19 Thread James Kass via Unicode



(In the event that a persuasive proposal presentation prompts the 
possibility of italics encoding...)

Possible approaches include:

1 - Liberating the italics from the Members Only Math Club
...which has been an ongoing practice since they were encoded.  It 
already works, but the set is incomplete and the (mal)practice is 
frowned upon.  Many of the older "shortcomings" of the set can now be 
overcome with combining diacritics.  These italics decompose to ASCII.
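
A minimal sketch of the mapping (the alphabet sits at U+1D434 for
capital A and U+1D44E for small a, with U+210E PLANCK CONSTANT standing
in for the reserved small-h slot):

def to_math_italic(s):
    # Map ASCII letters onto the Mathematical Italic alphabet.
    out = []
    for ch in s:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D434 + ord(ch) - ord("A")))
        elif ch == "h":
            out.append("\u210E")  # the small-h code point is reserved
        elif "a" <= ch <= "z":
            out.append(chr(0x1D44E + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits, spaces, punctuation pass through
    return "".join(out)

print(to_math_italic("ne plus ultra"))  # -> 𝑛𝑒 𝑝𝑙𝑢𝑠 𝑢𝑙𝑡𝑟𝑎

NFKC normalization reverses the trick, which is what "decompose to
ASCII" means in practice.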


2 - Character level
Variation selectors work with today's tech.  Default ignorable property 
suggests that apps that don't want to deal with them won't.  Many see VS 
as pseudo-encoding.  Stripping VS leaves ASCII behind.
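
No italic variation sequences are defined today, so the following is
purely a sketch of the mechanics, assuming VS1 (U+FE00) were designated
for the purpose:

VS1 = "\uFE00"  # hypothetical italic selector; not a real designation

def italicize(s):
    # Append the (hypothetical) selector to each base letter.
    return "".join(ch + VS1 if ch.isalpha() else ch for ch in s)

def strip_selectors(s):
    # A nonsupporting process that drops variation selectors
    # recovers the plain ASCII.
    return "".join(ch for ch in s if not 0xFE00 <= ord(ch) <= 0xFE0F)

assert strip_selectors(italicize("emphasis")) == "emphasis"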


3 - Open/Close punctuation treatment
Stateful.  Works on ranges.  Not currently supported in plain-text. 
Could be supported in applications which can take a text string URL and 
make it a clickable link.  Default appearance in nonsupporting apps may 
resemble existing plain-text italic kludges such as slashes.  The ASCII 
is already in the character string.
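
A sketch of both behaviors, again with stand-in code points since no
open/close italic controls exist:

OPEN_ITALIC, CLOSE_ITALIC = "\uE000", "\uE001"  # stand-ins, not encoded

def to_markup(s):
    # A supporting app maps the range to rich text, closing any
    # unmatched OPEN at the end of the string.
    out, depth = [], 0
    for ch in s:
        if ch == OPEN_ITALIC:
            out.append("<i>")
            depth += 1
        elif ch == CLOSE_ITALIC:
            if depth:
                out.append("</i>")
                depth -= 1
        else:
            out.append(ch)
    return "".join(out) + "</i>" * depth

def fallback(s):
    # A nonsupporting app might show the familiar slash kludge.
    return s.replace(OPEN_ITALIC, "/").replace(CLOSE_ITALIC, "/")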


4 - Leave it alone
This approach requires no new characters and represents the default 
condition.  ASCII.


-

Number 1 would require that anything not already covered eventually be 
proposed and accepted, 2 would require no new characters at all, and 3 
would require two control characters for starters.


As "food for thought" questions, if a persuasive case is presented for 
encoding italics, and excluding 4, which approach would have the least 
impact on the rich-text world?  Which would have the least impact on 
existing plain-text technology?  Which would be least likely to conflict 
with Unicode principles/encoding model?




Re: Encoding italic (was: A last missing link)

2019-01-19 Thread Kent Karlsson via Unicode
(I have skipped some messages in this thread, so maybe the following
has been pointed out already. Apologies for this message if so.)

You will not like this... But...

There is already a standardised, "character level" (well, it is from
a character standard, though a more modern view would be that it is
a higher level protocol) way of specifying italics (and bold, and
underline, and more):

\u001b[3mbla bla bla\u001b[0m

Terminal emulators implement some such escape sequences. The terminal
emulators I use support bold (1 after the [) but not italic (3). Every
time you use the "man" command in a Linux/Unix/similar terminal you
"use" the escape sequences for bold and underline... Other
terminal-based programs often use bold as well as colour esc-sequences
for emphasis, as well as for warning/error messages and other "hints"
of various kinds. For xterm,
see: https://www.xfree86.org/4.8.0/ctlseqs.html.
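
Those bracketed sequences are ECMA-48 SGR ("Select Graphic Rendition")
codes; a quick way to test what your emulator renders:

ESC = "\x1b"  # ECMA-48 SGR: ESC [ n m; 1 = bold, 3 = italic, 0 = reset

def sgr(n, text):
    return f"{ESC}[{n}m{text}{ESC}[0m"

print(sgr(1, "bold"), sgr(3, "italic"), sgr(4, "underline"))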

So I don't see these esc-sequences becoming obsolete any time soon.
But I don't foresee them being supported outside of terminal emulators
either... (Though for style esc-sequences it would certainly be possible.
And a "smart" cut-and-paste operation could auto-insert an esc-sequence
that sets the the style after the paste to the one before the paste...)

Had HTML (somehow, magically) been invented before terminals, maybe
terminals (terminal emulators) would have used some kind of "mini-HTML"
instead. But things are like they are on that point.

/Kent Karlsson

PS
The cut-and-paste I used here converted (imperfectly: bold is lost and
a spurious ! was inserted) to HTML, surely going through some internal
attribute-based representation, the HTML being generated when I press
send:

man(1)                                                              man(1)

NAME
       man - format and display the on-line manual pages

SYNOPSIS
       man [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file]
           [-M pathlist] [-P pager] [-B browser] [-H htmlpager]
           [-S section_list] [section] name ...






On 2019-01-18 20:18, "Asmus Freytag via Unicode" wrote:

> I would fully agree, and I think Mark puts it really well in the
> message below why some of the proposals brandished here are no longer
> plain text but "not-so-plain" text.
>
> I think we are better served with a solution that provides some form
> of "light" rich text, for basic emphasis in short messages. The proper
> way for this would be some form of MarkDown standard shared across
> vendors, and perhaps implemented in a way that users don't necessarily
> need to type anything special, but that, if exported to "true" plain
> text, it turns into the source format for the "light" rich text.
>
> This is an effort that's out of scope for Unicode to implement, or, I
> should say, if the Consortium were to take it on, it would be a
> separate technical standard from The Unicode Standard.
>
> A./
>
> PS: I really hate the creeping expansion of pseudo-encoding via VS
> characters. The only worse thing is adding novel control functions.
>
> On 1/18/2019 7:51 AM, Mark E. Shoulson via Unicode wrote:
>
>> On 1/16/19 6:23 AM, Victor Gaultney via Unicode wrote:
>>
>>> Encoding 'begin italic' and 'end italic' would introduce
>>> difficulties when partial strings are moved, etc. But that's no
>>> different than with current punctuation. If you select the second
>>> half of a string that includes an end quote character you end up
>>> with a mismatched pair, with the same problems of interpretation as
>>> selecting the second half of a string including an 'end italic'
>>> character. Apps have to deal with it, and do, as in code editors.
>>
>> It kinda IS different.  If you paste in half a string, you get a
>> mismatched or unmatched paren or quote or something.  A typo, but a
>> transient one.  It looks bad where it is, but everything else is
>> unaffected.  It's no worse than hitting an extra key by mistake. If
>> you paste in a "begin italic" and miss the "end italic", though, then
>> *all* your text from that point on is affected!  (Or maybe "all until
>> a newline" or some other stopgap ending, but that's just
>> damage-control, not damage-prevention.)  Suddenly, letters and
>> symbols five words/lines/paragraphs/pages away look different, the
>> pagination is all altered (by far more than merely a single extra
>> punctuation mark, since italic fonts generally are narrower than
>> roman).  It's a disaster.
>>
>> No.  This kind of statefulness really is beyond what Unicode is
>> designed to cope with.  Bidi controls are (almost?) the sole
>> exception, and even they cause their share of headaches.  Encoding
>> separate _text_ italics/bold is IMO also a disastrous idea, but I'm
>> not putting out reasons for that now.  The only really feasible
>> suggestion I've heard is using a VS in some fashion.  (Maybe let it
>> affect whole words instead of individual characters?  Makes for
>> fewer noisy VSs, but introduces a whole other host of limitations (how

Re: Encoding italic (was: A last missing link)

2019-01-19 Thread James Kass via Unicode



Victor Gaultney wrote,

> If however, we say that this "does not adequately consider the harm
> done to the text-processing model that underlies Unicode", then that
> exposes a weakness in that model. That may be a weakness that we have
> to accept for a variety of reasons (technical difficulty, burden on
> developers, UI impact, cost, maturity).

Unicode's character encoding principles and underlying text-processing 
model remain robust.  They are the foundation of modern computer text 
processing.  The goal of 푛푒 푝푙푢푠 푢푙푡푟푎¹ needs to accommodate 
the best expectations of the end users and the fact that the model's 
consistent approach eases the software people's burden: effective 
programming solutions that support one subset or range of characters 
can be applied to the other subsets of the Unicode repertoire, and 
those solutions can be shared with other developers in a standard 
fashion.


Assigning properties to characters gives any conformant application 
clear instructions as to what exactly is expected as the app encounters 
each character in a string.  In simpler times, the only expectation was 
that the application would splat a glyph onto a screen (and/or sheet of 
paper) and store a binary string for later retrieval.  We've moved forward.
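
A small illustration with the properties Python happens to expose: the
math-italic a from the Members Only Math Club carries its instructions
with it, including the compatibility decomposition back to ASCII.

import unicodedata

a_italic = "\U0001D44E"  # MATHEMATICAL ITALIC SMALL A
print(unicodedata.category(a_italic))            # Ll - lowercase letter
print(unicodedata.decomposition(a_italic))       # <font> 0061
print(unicodedata.normalize("NFKC", a_italic))   # a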


'Unicode encodes characters, not glyphs' is a core principle. There's a 
legitimate concern whenever anyone is perceived as heading in the 
general direction of turning the character encoding into a glyph 
registry, as it suggests a possible step backwards and might lead to a 
slippery slope.  For example, if italics are encoded, why not fraktur 
and Gaelic?²


The notion that any given system can't be improved is static.³ ("System" 
refers to Unicode's repertoire and coverage rather than its core 
principles.  Core principles are rock solid by nature.)


¹ /ne plus ultra/
² "Conversely, significant differences in writing style for the same 
script may be reflected in the bibliographical classification—for 
example, Fraktur or Gaelic styles for the Latin script. Such stylistic 
distinctions are ignored in the Unicode Standard, which treats them as 
presentation styles of the Latin script."  Ken Whistler, 
http://unicode.org/reports/tr24/
³ "Static" can be interpreted as either virtually catatonic or radio 
noise.  Either is applicable here.




Re: NNBSP

2019-01-19 Thread Asmus Freytag via Unicode

  
  
On 1/19/2019 3:53 AM, James Kass via Unicode wrote:

> Marcel Schneider wrote,
>
>> When you ask for knowing the foundations and that knowledge is
>> persistently refused, you end up believing that those foundations
>> just can’t be told.
>>
>> Note, too, that I readily ceased blaming UTC, and shifted the blame
>> elsewhere, where it actually belongs.
>
> Why not think of it as a learning curve?  Early concepts and
> priorities were made from a lower position on that curve.  We can
> learn from the past and apply those lessons to the future, but a
> post-mortem seldom benefits the cadaver.

+1. Well put about the cadaver.

> Minutiae about decisions made long ago probably exist, but may be
> presently poorly indexed/organized and difficult to search/access.
> As the collection of encoding history becomes more sophisticated
> and the searching technology becomes more civilized, it may become
> easier to glean information from the archives.
>
> (OT - A little humor, perhaps...
>
> On the topic of Francophobia, it is true that some of us do not
> like dead generalissimos.  But most of us adore the French for
> reasons beyond Brigitte Bardot and bon-bons.  Cuisine, fries, dip,
> toast, curls, culture, kissing, and tarts, for instance.  Not to
> mention cognac and champagne!)

It is time for this discussion to be moved to a small group of people
interested in hashing out actual proposals for submission. Is there
anyone here who would like to collaborate with Marcel to find a
solution for European number formatting that

(1) fully supports the typographic best practice
(2) identifies acceptable fallbacks
(3) is compatible with existing legacy practice, even if that does not
    conform to (1) or (2)
(4) includes necessary adjustments to CLDR

If nobody here is interested in working on that, discussing this
further on this list will not serve a useful purpose, as nothing will
change in Unicode without a well-formulated proposal that covers the
four parameters laid out here.

A./
  
  



Re: Encoding italic

2019-01-19 Thread Asmus Freytag via Unicode

  
  
On 1/19/2019 12:34 PM, James Kass via Unicode wrote:

> On 2019-01-19 6:19 PM, wjgo_10...@btinternet.com wrote:
>
>> It seems to me that it would be useful to have some codes that are
>> ordinary characters in some contexts yet are control codes in
>> others, ...
>
> Italics aren't a novel concept.  The approach for encoding new
> characters is that conventions for them exist and that people *are*
> exchanging them, people have exchanged them in the past, or that
> people demonstrably *need* to exchange them.
>
> Excluding emoji, any suggestion or proposal whose premise is "It
> seems to me that <…> it would be useful if characters supporting
> <…>" is doomed to be deemed out of scope for the standard.

+1. It's the worst kind of "leading standardization".


  



Re: Encoding italic

2019-01-19 Thread James Kass via Unicode



On 2019-01-19 6:19 PM, wjgo_10...@btinternet.com wrote:

> It seems to me that it would be useful to have some codes that are
> ordinary characters in some contexts yet are control codes in others, ...

Italics aren't a novel concept.  The approach for encoding new 
characters is that conventions for them exist and that people *are* 
exchanging them, people have exchanged them in the past, or that people 
demonstrably *need* to exchange them.


Excluding emoji, any suggestion or proposal whose premise is "It seems 
to me that <…> it would be useful if characters supporting <…>" is 
doomed to be deemed out of scope for the standard.




Re: Encoding italic (was: A last missing link)

2019-01-19 Thread wjgo_10...@btinternet.com via Unicode

Asmus Freytag wrote:

> This is an effort that's out of scope for Unicode to implement, or, I
> should say, if the Consortium were to take it on, it would be a
> separate technical standard from The Unicode Standard.


I note what you say, but what concerns me is that there seems to be an 
increasing number of matters where things are being done and neither 
The Unicode Standard nor ISO/IEC 10646 includes them; they appear only 
in side documents on the Unicode website.


My understanding is that in some countries they will only use ISO/IEC 
10646 and will not refer to Unicode.


There are already issues over emoji ZWJ sequences that produce new 
meanings, such as man ZWJ rocket producing the new meaning of 
astronaut, and the 'base character plus tag characters' sequences that 
indicate a Welsh flag and a Scottish flag. If something is now done for 
italics (depending upon what it is that is done), the divergence 
between the two 'groups of documents' widens, even if at a precise 
'definition of scope' level ISO/IEC 10646 and The Unicode Standard do 
not diverge.
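
For reference, both kinds of sequence are assembled from ordinary code
points; a sketch:

# ZWJ sequence: MAN + ZERO WIDTH JOINER + ROCKET displays as
# "man astronaut" on supporting systems.
astronaut = "\U0001F468\u200D\U0001F680"

# Tag sequence for the flag of Scotland: WAVING BLACK FLAG, tag
# characters spelling "gbsct", then CANCEL TAG.
scotland = ("\U0001F3F4"
            + "".join(chr(0xE0000 + ord(c)) for c in "gbsct")
            + "\U000E007F")

print(astronaut, scotland)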


> PS: I really hate the creeping expansion of pseudo-encoding via VS
> characters.


Well, a variation sequence character is already being used for 
requesting emoji display (is that a control code?), so it seems there 
is no lack of precedent for using one for italics. It seems that 
someone only has to say 'out of scope' and that then serves as the veto 
for any consideration of a new idea for ISO/IEC 10646 or The Unicode 
Standard. There seems to be no way for a request that the committee 
consider a widening of the scope to even be put before the committee 
if such a request comes from someone outside the inner circle.



> The only worse thing is adding novel control functions.


For example? Would you be including things like changing the colour of 
the jacket that an emoji person is wearing?


It seems to me that it would be useful to have some codes that are 
ordinary characters in some contexts yet are control codes in others, 
for example for drawing simple line-graphic diagrams within a document, 
such that they are just ordinary characters in a text document but, 
say, draw an image when included within a PDF (Portable Document 
Format) document. Their use would be optional, so that people who did 
not want to use them could just ignore them, and applications that did 
not use them as control codes could just display a glyph for each 
character. Yet there could be great possibilities for them if they 
could be got into ISO/IEC 10646 and The Unicode Standard.


William Overington
Saturday 19 January 2019




Re: NNBSP

2019-01-19 Thread James Kass via Unicode



Marcel Schneider wrote,

> When you ask for knowing the foundations and that knowledge is
> persistently refused, you end up believing that those foundations
> just can’t be told.
>
> Note, too, that I readily ceased blaming UTC, and shifted the blame
> elsewhere, where it actually belongs.

Why not think of it as a learning curve?  Early concepts and priorities 
were made from a lower position on that curve.  We can learn from the 
past and apply those lessons to the future, but a post-mortem seldom 
benefits the cadaver.


Minutiae about decisions made long ago probably exist, but may be 
presently poorly indexed/organized and difficult to search/access. As 
the collection of encoding history becomes more sophisticated and the 
searching technology becomes more civilized, it may become easier to 
glean information from the archives.


(OT - A little humor, perhaps...
On the topic of Francophobia, it is true that some of us do not like 
dead generalissimos.  But most of us adore the French for reasons beyond 
Brigitte Bardot and bon-bons.  Cuisine, fries, dip, toast, curls, 
culture, kissing, and tarts, for instance.  Not to mention cognac and 
champagne!)




Re: NNBSP

2019-01-19 Thread Marcel Schneider via Unicode

On 19/01/2019 09:42, Asmus Freytag via Unicode wrote:

> […]
>
> For one, many worthwhile additions / changes to Unicode depend on
> getting written up in proposal form and then championed by dedicated
> people willing to see through the process. Usually, Unicode has so
> many proposals to pick from that at each point there are more than
> can be immediately accommodated. There's no automatic response to
> even issues that are "known" to many people.
>
> "Demands" don't mean a thing; formal proposals, presented and then
> refined based on feedback from the committee, are what put issues on
> the track of being resolved.

That is also what I suspected: that the French were not eager enough to
get French supported, as opposed to the Vietnamese, who lobbied long
before the era of proposals and UTC meetings.

Please, /where can we find the proposals for FIGURE SPACE to become
non-breakable, and for PUNCTUATION SPACE to stay or become breakable?/

(That is not a rhetorical question. The ideal answer is a URL. Also, it
is not about pre-Unicode documentation, but about the action that
Unicode took in that era.)

> […]
>
> Yes, I definitely used an IBM Selectric for many years with
> interchangeable type wheels, but I don't remember using proportional
> spacing, although I've seen it in the kinds of "typescript" books I
> mentioned. Some had that crude approximation of typesetting.

Thanks for reporting.

> When Unicode came out, that was no longer the state of the art, as
> TeX and laser printers weren't limited that way.
>
> However, the character sets from which Unicode was assembled (or
> which it had to match, effectively) were designed earlier - during
> those times. And we inherited some things (that needed to be
> supported so round-trip mapping of data was possible) but that
> weren't as well documented in their particulars.
>
> I'm sure we'll eventually deprecate some and clean up others, like
> the Mongolian encoding (which also included some stuff that was
> encoded with an understanding that turned out less solid in
> retrospect than we had thought at the time).
>
> Something the UTC tries very hard to avoid, but nobody is perfect.
> It's best therefore to try not to ascribe non-technical motives to
> any action or inaction of the UTC. What outsiders see is rarely what
> actually went down,

That is because the meeting minutes would benefit from being more
explicit.

> and the real reasons for things tend to be much less interesting
> from an interpersonal or intercultural perspective.

I don’t care about “interesting” reasons. I’d just appreciate knowing
the truth.

> So best avoid that kind of topic altogether and never use it as a
> basis for unfounded recriminations.

When you ask for knowing the foundations and that knowledge is
persistently refused, you end up believing that those foundations just
can’t be told.

Note, too, that I readily ceased blaming UTC, and shifted the blame
elsewhere, where it actually belongs. I’d kindly request not to be
considered a hypocrite who in reality keeps blaming the UTC.

> A./





Re: NNBSP

2019-01-19 Thread Marcel Schneider via Unicode

On 19/01/2019 01:21, Shawn Steele wrote:

>> If they are obsolete apps, they don’t use CLDR / ICU, as these are
>> designed for up-to-date and fully localized apps. So one hassle is
>> off the table.
>
> Windows uses CLDR/ICU.  Obsolete apps run on Windows.  That statement
> is a little narrow-minded.

>> I didn’t look into these data interchanges but I suspect they won’t
>> use any thousands separator at all to interchange data.
>
> Nope

>> The group separator is only for display and print
>
> Yup, and people do the wrong thing so often that I even blogged about
> it.
> https://blogs.msdn.microsoft.com/shawnste/2005/04/05/culture-data-shouldnt-be-considered-stable-except-for-invariant/

Thanks for sharing. As it happens, I like most the first reason you
provide:

 * “The most obvious reason is that there is a bug in the data and we
   had to make a change. (Believe it or not we make mistakes ;-))  In
   this case our users (and yours too) want culturally correct data,
   so we have to fix the bug even if it breaks existing applications.”

No comment :)

>> Sorry you did skip this one:
>
> Oops, I did mean to respond to that one and accidentally skipped it.

No problem.

>> What are all these expected to do while localized with scripts
>> outside Windows code pages?
>
> (We call those “unicode-only” locales FWIW)

Noted.

> The users that are not supported by legacy apps can’t use those apps
> (obviously).  And folks are strongly encouraged to write apps (and
> protocols) that Use Unicode (I’ve blogged about that too).

Like here:
https://blogs.msdn.microsoft.com/shawnste/2009/06/01/writing-fields-of-data-to-an-encoded-file/

You’re showcasing that despite “The moral here is ‘Use Unicode’ ” some
people are still not using it. The stuff gets even weirder as you state
that code pages and Unicode are not 1:1, contradicting the Unicode
design principle of round-trip compatibility.

The point in not using Unicode, and likewise in not using verbose
formats, is limited hardware resources. Often new implementations are
built on top of old machines and programs, for example in the energy
and shipping industries. This poses a security threat, ending up in
power outages and logistic breakdowns. That makes our democracies
vulnerable. Hence maintaining obsolete systems does not pay off. We’re
all better off recycling all the old hardware and investing in the
latest technologies, implementing Unicode along the way.

What you are advocating in this thread seems like a non-starter.

> However, the fact that an app may run very poorly in Cherokee or
> whatever doesn’t mean that there aren’t a bunch of French enterprises
> that depend on that app for their day-to-day business.

They’re ill-advised in doing so (see above).

> In order for the “unicode-only” locale users to use those apps, the
> app would need to be updated, or another app with the appropriate
> functionality would need to be selected.

To be “selected”, not developed and built. The job is already done.
What are people waiting for?

> However, that still doesn’t impact the current French users that are
> “ok” with their current non-Unicode app.  Yes, I would encourage them
> to move to Unicode, however they tend to not want to invest in
> migration when they don’t see an urgent need.

They may not see it because they’re lacking appropriate training in
cyber security. You seem to be backing that unresponsive behavior. I
can’t see that you may be doing any good by doing so, and I’d strongly
advise you to reach out to your customers, or raise the issue with
your managers. We’re in a time where companies are still making huge
profits, and it is unclear where all that money goes once it is paid
out to shareholders. The money is there; you only need to market the
security. That job would be a better use of your time than tampering
with legacy apps.

> Since Windows depends on CLDR and ICU data, updates to that data mean
> that those customers can experience pain when trying to upgrade to
> newer versions of Windows.  We get those support calls, they don’t
> tend to pester CLDR.

Am I pestering CLDR…

Keeping CLDR in synch is just the right way to go.

Since we’re on it: do you have any hints about why some powerful UTC
members seem to hate NNBSP in French? I’m mainly talking about French
punctuation spacing here.

> Which is why I suggested an “opt-in” alt form that apps wanting
> “civilized” behavior could opt into (at least for long enough that
> enough badly behaved apps would be updated to warrant moving that to
> the default.)

Asmus Freytag’s proposal seems better:

   “having information on "common fallbacks" would be useful. If
   formatting numbers, I may be free to pick the "best", but when
   parsing for numbers I may want to know what deviations from "best"
   practice I can expect.”

Because if you let your customers “opt in” instead of urging them to
update, some will never opt in, given they’re not even ready to care
about cyber security.
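
To make that concrete, a sketch (my own, not CLDR code) of formatting
with the preferred NNBSP while parsing leniently; the separator
inventory here is an assumption:

# Group separators seen in French data: NNBSP, NBSP, THIN SPACE, space.
GROUP_SEPARATORS = "\u202F\u00A0\u2009 "

def parse_fr_number(s):
    # Accept any of the common fallbacks when reading.
    for sep in GROUP_SEPARATORS:
        s = s.replace(sep, "")
    return float(s.strip().replace(",", "."))

def format_fr_number(n):
    # Emit the "best" form: NNBSP groups, comma decimal (sketch;
    # non-negative values only).
    whole, _, frac = f"{n:.2f}".partition(".")
    groups = []
    while whole:
        groups.insert(0, whole[-3:])
        whole = whole[:-3]
    return "\u202F".join(groups) + "," + frac

assert parse_fr_number(format_fr_number(1234567.5)) == 1234567.5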


The data for 

Re: NNBSP

2019-01-19 Thread Asmus Freytag via Unicode

  
  
On 1/18/2019 11:34 PM, Marcel Schneider via Unicode wrote:

> Current practice in electronic publishing was to use a non-breakable
> thin space, Philippe Verdy reports. Did that information come in
> somehow?

==> probably not in the early days. Y

> Perhaps it was ignored from the beginning on, like Philippe Verdy
> reports that UTC ignored later demands, getting users upset.

==> for reasons given in another post, I tend to not give much credit
to these suggestions.

For one, many worthwhile additions / changes to Unicode depend on
getting written up in proposal form and then championed by dedicated
people willing to see through the process. Usually, Unicode has so
many proposals to pick from that at each point there are more than can
be immediately accommodated. There's no automatic response to even
issues that are "known" to many people.

"Demands" don't mean a thing; formal proposals, presented and then
refined based on feedback from the committee, are what put issues on
the track of being resolved.

> That leaves us with the question why it did so, downstream your
> statement that it was not what I ended up suspecting.
>
> Does "Y" stand for the peace symbol?

==> No, my thumb sometimes touches the touchpad and flicks the cursor
while I type. I don't always see where some characters end up. Or, I
start a sentence and the phone rings. Or any of a number of scenarios.
Take your pick.

> ISO 31-0 was published in 1992, perhaps too late for Unicode. It is
> normally understood that the thousands separator should not have the
> width of a digit. The alleged reason is security. Though on a
> typewriter, as you state, there is scarcely any other option. By that
> time, all computerized text was fixed width, Philippe Verdy reports.
> On-screen, I figure, not in book print

==> much book printing was also done by photomechanically reproducing
typescript at that time. Not everybody wanted to pay typesetters, and
digital typesetting wasn't as advanced. I actually did use a digital
phototypesetter of the period a few years before I joined Unicode, so
I know. It was more powerful than a typewriter, but not as powerful as
TeX or later the Adobe products.

For one, you didn't typeset a page, only a column of text, and it
required manual paste-up etc.

> Did you also see typewriters with proportional advance width (and
> interchangeable type wheels)? That was the high end of the typewriter
> market. (I already mentioned these typewriters in a previous e‑mail.)
> Books typeset this way could use bold and (less easily) italic spans.

Yes, I definitely used an IBM Selectric for many years with
interchangeable type wheels, but I don't remember using proportional
spacing, although I've seen it in the kinds of "typescript" books I
mentioned. Some had that crude approximation of typesetting.

When Unicode came out, that was no longer the state of the art, as TeX
and laser printers weren't limited that way.

However, the character sets from which Unicode was assembled (or which
it had to match, effectively) were designed earlier - during those
times. And we inherited some things (that needed to be supported so
round-trip mapping of data was possible) but that weren't as well
documented in their particulars.

I'm sure we'll eventually deprecate some and clean up others, like the
Mongolian encoding (which also included some stuff that was encoded
with an understanding that turned out less solid in retrospect than we
had thought at the time).

Something the UTC tries very hard to avoid, but nobody is perfect.
It's best therefore to try not to ascribe non-technical motives to any
action or inaction of the UTC. What outsiders see is rarely what
actually went down, and the real reasons for things tend to be much
less interesting from an interpersonal or intercultural perspective.
So best avoid that kind of topic altogether and never use it as a
basis for unfounded recriminations.

A./