Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-20 Thread Philippe Verdy
2011/7/18 Asmus Freytag asm...@ix.netcom.com:
 On 7/17/2011 12:19 PM, Philippe Verdy wrote:

 Another alternative: instead of encoding separate symbols for each
 control, we could as well encode symbols for each character visible in
 those symbols.

 I'm baffled: what problem is this elaborate scheme trying to solve?

It's not elaborate. It is extremely simple in fact. I don't propose
to encode new symbols. I only propose to encode the decorations
themselves, separately. We currently have such encoded characters for
enclosing box decorations, but only capable of enclosing a single
character. They are encoded as gc=Me, i.e. as combining diacritics.
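
As a minimal sketch of the existing enclosing marks referred to here, the following uses Python's standard `unicodedata` module to show their general category. The particular code points listed are an illustrative selection, not an exhaustive inventory.

```python
import unicodedata

# A few enclosing marks already in the UCS (illustrative selection).
# Each one applies to the single base character that precedes it.
ENCLOSING_MARKS = ["\u20DD", "\u20DE", "\u20DF", "\u20E3"]

def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

rows = [describe(ch) for ch in ENCLOSING_MARKS]
for row in rows:
    print(*row)

# "A" followed by COMBINING ENCLOSING SQUARE: two code points, one boxed letter.
boxed_a = "A" + "\u20DE"
```

Because each mark attaches to one base character only, a three-letter abbreviation such as RLO cannot be enclosed as a unit this way.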

 The problem was never in *how* to encode such symbols, but in *whether* they
 should be considered *characters* (and therefore need to be supported on the
 character level of the architecture). That point, whether there's a
 reasonable use case for them as characters, has not been settled, so the
 case for thinking about encoding solutions has not been established.

 When people write about a line feed character, they use LF or linefeed
 or 000A (or U+000A or 0x0A etc.). They commonly don't use the LF symbol
 character, nor any other unencoded symbol.

Yes, but they also cite them using a symbol where needed.

I have NEVER said that they used an unencoded symbol. They use the
symbol without even knowing if it's encoded or not, and don't care
about that !

 I claim, the same is true for ZWJ, RLO, PDF and all the other good
 characters.

 Just because Unicode uses dashed box placeholders in the code charts hasn't
 made them the generally accepted, universally understood *symbols* for these
 characters.

 This is different from the pictures for control codes because at the time,
 these were widely supported in devices, and users of these devices
 (terminals) were familiar with the convention (staggered small letters) and
 many would recognize common control characters.

 So, let's keep a lid on devising ever more arcane and fragile encoding and
 pseudo-encoding options until there's consensus that this issue must be
 addressed on the character level.

I did not speak about pseudo encoding. I evoked it as a possible way
to represent a string that will visually, and logically, represent a
visual abbreviation like LF decorated with a dotted box, as an
alternative to encoding specific symbols, given the current desire of
not encoding those symbols directly.

I evoked the alternatives because it avoids the other issues
introduced in other proposals posted to the list : notably trying to
use the control character itself with some other control, in order to
escape it (I read things like using variation characters): this is
really the worst, and those other proposals are MUCH WORSE than what I
said, and are really pseudo-encoding.

It remains that there's already a demonstrated use of such decorating
boxes, not just for control characters of Unicode, but for a more
general use. You'll note that Microsoft Word already contains such
generic feature for inserting arbitrary characters in enclosing boxes
(or other graphic symbols).

Yes, of course (I have also stated that it was effectively text
decoration, and CSS or other rich text features can already do that),
encoding those symbols directly remains an open question (Michael
Everson admits that).

My proposals are completely in-line with the other possible practices
of citing the character by their name, or abbreviation or code. But
this does not preclude the need to represent it in a more compact way,
directly within runs of surrounding texts, in such a way that it will
be visually distinct from those surrounding texts.

These proposals were a reply to the other pseudo-encoding proposals
that clearly broke the encoding model (using variation selectors or
the like, with the control character encoded directly). And they are
clearly made in order to counter the intent of encoding of new symbols
for those controls, whatever their numbers: users that want to see
those symbols made of an abbreviation and a decoration box around,
will see that. Their intent is clearly to use the stated
abbreviations, but make them more visually emphasized.

But yes, I have already said that if you have a rich-text environment,
it is certainly best to use the rich-text features to specify these
decorations (including font size and margin adjustments). They will
then use, at the plain-text level, the abbreviation only, and not any
special symbols that are not really needed. So in those cases, my
proposals are not even needed, and NO other specific encoding of those
symbols is needed either in this context of rich-text documents.

My proposals are then only made for plain-text only, where you
currently have no other solution than citing the undecorated
abbreviations (which may be ambiguous in some cases), or surrounding
them with punctuations (supposed to make this clear enough).

Citing the name or code points of those controls encoded in the

RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-19 Thread Peter Constable
As always, we want to know that there's a real use case for encoding. Doing 
things on spec, especially in a case like this, is IMO not at all a good idea.

Peter

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Michael Everson
Sent: Friday, July 15, 2011 5:41 AM
To: unicode Unicode Discussion
Subject: Re: Quick survey of Apple symbol fonts (in context of the 
Wingding/Webding proposal)

On 15 Jul 2011, at 13:36, Martin J. Dürst wrote:

 If we take the needs of character encoding experts when they write *about* 
 characters to decide what to make a character, then we get many too many 
 characters encoded. 


I think that having encoded symbols for control characters (which we already 
have for some of them) is no bad thing, and the argument about too many 
characters is not compelling, as there are only some dozens of these 
characters encoded, not thousands and thousands or anything. 
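
The symbols "we already have for some of them" are the Control Pictures block. A minimal sketch of how those existing symbols line up with the ASCII controls follows; the helper name `control_picture` is hypothetical and introduced only for illustration.

```python
import unicodedata

def control_picture(ch):
    """Return the Control Pictures symbol for an ASCII control, SPACE or DELETE.

    Returns None for anything else, including Unicode format controls
    such as RLO, which have no encoded picture.
    """
    cp = ord(ch)
    if cp <= 0x20:            # C0 controls and SPACE map to U+2400..U+2420
        return chr(0x2400 + cp)
    if cp == 0x7F:            # DELETE maps to U+2421
        return "\u2421"
    return None

lf_symbol = control_picture("\n")
print(f"U+{ord(lf_symbol):04X}", unicodedata.name(lf_symbol))
print(control_picture("\u202E"))  # RLO: no encoded picture
```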

Michael Everson * http://www.evertype.com/








RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-19 Thread Peter Constable
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Karl Pentzlin
Sent: Friday, July 15, 2011 9:46 AM

AW I oppose encoding graphic clones of non-graphic characters ...

 I am just waiting for the killer argument against the encoding of chart 
 symbols.

For me the killer argument is that the large costs that would be incurred by 
committees and implementers of the standards are nowhere near made up for in 
real-use need. Just because _some_ characters that give visual presentation for 
otherwise invisible entities have been encoded does not at all imply that 
character should be encoded in every such case. These are not needed for common 
data interchange; the need to display such presentations is limited only to 
production of the standards and perhaps in certain special-use software user 
interfaces. IMO, those can be left as exceptions. In contrast, nobody -- not 
even in the context of this discussion list -- needs to be able (e.g.) to send 
an email that contains in plain text a character that depicts in a visible 
manner a character like NBSP or CGJ.



Peter




RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-19 Thread Peter Constable
So you want to be able to discuss NBSP (say) in plain text. You can already do 
that; in fact, you have multiple ways that everybody here will have no 
difficulty understanding:

NBSP
no-break space
U+00A0

Creating a different character for SYMBOL FOR NBSP doesn't make communication 
here any easier; in fact, it would lead to confusion as to whether you are, in 
fact, meaning to refer to NBSP or to SYMBOL FOR NBSP.
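
The three ways of referring to the character can be derived mechanically, sketched here with Python's standard `unicodedata` module. The abbreviation is simply hard-coded from the list above.

```python
import unicodedata

NBSP = "\u00A0"

# Three plain-text ways to refer to the character without using it.
references = {
    "abbreviation": "NBSP",                       # as written in the message
    "name": unicodedata.name(NBSP).lower(),       # character name, lowercased
    "code point": f"U+{ord(NBSP):04X}",           # U+ notation
}

for kind, text in references.items():
    print(f"{kind}: {text}")
```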



Peter

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Michael Everson
Sent: Friday, July 15, 2011 10:26 AM
To: unicode Unicode Discussion
Subject: Re: Quick survey of Apple symbol fonts (in context of the 
Wingding/Webding proposal)

What I see is a certain unreasonability reflecting a certain conservatism. Text 
about the Standard is important, and should be representable in an 
interchangeable way. Here { } is a Right to left override character. character. 
I want to talk about it in a way that is visible. Oops. I can't do it 
interchangeably. 

Michael Everson * http://www.evertype.com/








RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-19 Thread Peter Constable
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] On Behalf 
Of Asmus Freytag
Sent: Sunday, July 17, 2011 6:34 PM
...

 Another alternative: instead of encoding separate symbols for each 
 control, we could as well encode symbols for each character visible in 
 those symbols.
...

I'm baffled: what problem is this elaborate scheme trying to solve?

I completely agree: you're creating complex mechanisms to solve problems for 
which need has not been established.



Peter





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-19 Thread John W Kennedy

On Jul 19, 2011, at 9:20 PM, Peter Constable wrote:

 So you want to be able to discuss NBSP (say) in plain text. You can already 
 do that; in fact, you have multiple ways that everybody here will have no 
 difficulty understanding:
 
 NBSP
 no-break space
 U+00A0
 
 Creating a different character for SYMBOL FOR NBSP doesn't make communication 
 here any easier; in fact, it would lead to confusion as to whether you are, 
 in fact, meaning to refer to NBSP or to SYMBOL FOR NBSP.

But it's futile to argue that. People in the real world have been using such 
conventions going back at least to the early 1960s, and, the last I heard, 
Unicode is supposed to be used to encode the characters that people use.

-- 
John W Kennedy
Read the remains of Shakespeare's lost play, now annotated!
http://www.SKenSoftware.com/Double%20Falshood








Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-18 Thread Christopher Fynn
On 15/07/2011, Karl Pentzlin karl-pentz...@acssoft.de wrote:

 In WG2 N4085 Further proposed additions to ISO/IEC 10646 and comments to 
 other proposals (2011‐ 05‐25), the German NB had requested re WG2 N4022 
 Proposal to add Wingdings and Webdings Symbols besides other points:
   Also, in doing this work, other fonts widespread on the computers of 
 leading manufacturers (e.g.  Apple) shall be included, thus avoiding the 
 impression that Unicode or SC2/WG2 favor a single  manufacturer.

In regard to getting their standard symbol / dingbats fonts encoded,
isn't Apple way ahead of Microsoft? Didn't the original dingbats
symbols in Unicode get encoded mostly because the ITC Zapf Dingbats
font was built into the Apple Laserwriter?




RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-18 Thread Erkki I Kolehmainen
After a large number of messages in this thread, I'm yet to see any reason
to support this particular example.

Erkki

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org]
On Behalf Of Michael Everson
Sent: 15 July 2011 20:26
To: unicode Unicode Discussion
Subject: Re: Quick survey of Apple symbol fonts (in context of the
Wingding/Webding proposal)

What I see is a certain unreasonability reflecting a certain conservatism.
Text about the Standard is important, and should be representable in an
interchangeable way. Here { } is a Right to left override character.
character. I want to talk about it in a way that is visible. Oops. I can't
do it interchangeably. 

Michael Everson * http://www.evertype.com/







Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-17 Thread Julian Bradfield
On 2011-07-15, Michael Everson ever...@evertype.com wrote:
 On 15 Jul 2011, at 17:03, Doug Ewell wrote:

 1. Graphic symbols for control characters are needed so writers can write 
 about the control characters themselves using plain text.

 This does not seem so unreasonable. The RTL and LTR overrides
 *function* on the text when inserted into text. So you can't use
 those with glyphs in a font to represent for example the UCS
 dotted-boxes-with-letters, because they are control characters and
 will affect the text.  

Wouldn't it be more economical to encode a single UNICODE ESCAPE
CHARACTER which forces the following character to be interpreted as a
printable glyph rather than any control function?

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-17 Thread Michael Everson
On 17 Jul 2011, at 10:14, Julian Bradfield wrote:

 The RTL and LTR overrides *function* on the text when inserted into text. So 
 you can't use those with glyphs in a font to represent for example the UCS 
 dotted-boxes-with-letters, because they are control characters and will 
 affect the text.  
 
 Wouldn't it be more economical to encode a single UNICODE ESCAPE CHARACTER 
 which forces the following character to be interpreted as a printable glyph 
 rather than any control function?

I think that invisible and stateful control characters are more expensive than 
ordinary graphic symbols.

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-17 Thread Petr Tomasek
On Sun, Jul 17, 2011 at 10:14:55AM +0100, Julian Bradfield wrote:
 On 2011-07-15, Michael Everson ever...@evertype.com wrote:
  On 15 Jul 2011, at 17:03, Doug Ewell wrote:
 
  1. Graphic symbols for control characters are needed so writers can write 
  about the control characters themselves using plain text.
 
  This does not seem so unreasonable. The RTL and LTR overrides
  *function* on the text when inserted into text. So you can't use
  those with glyphs in a font to represent for example the UCS
  dotted-boxes-with-letters, because they are control characters and
  will affect the text.  
 
 Wouldn't it be more economical to encode a single UNICODE ESCAPE
 CHARACTER which forces the following character to be interpreted as a
 printable glyph rather than any control function?

I already thought about this but this would probably mean that
algorithms (like the Unicode BiDi Algorithm) would have to be changed.
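
To illustrate why the BiDi Algorithm is affected, here is a small sketch of the properties it keys on. It only reads existing property values through Python's standard `unicodedata` module; nothing here models the proposed escape character.

```python
import unicodedata

# The directional controls mentioned in this thread.
CONTROLS = {"LRO": "\u202D", "RLO": "\u202E", "PDF": "\u202C"}

# The bidirectional class is what the BiDi Algorithm looks at; any code
# point carrying one of these classes acts on the surrounding text.
bidi_classes = {abbr: unicodedata.bidirectional(ch) for abbr, ch in CONTROLS.items()}
categories = {abbr: unicodedata.category(ch) for abbr, ch in CONTROLS.items()}

for abbr in CONTROLS:
    print(abbr, f"U+{ord(CONTROLS[abbr]):04X}", bidi_classes[abbr], categories[abbr])
```

An escape mechanism would require the algorithm to look at the preceding character before trusting these property values.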

-- 
Petr Tomasek http://www.etf.cuni.cz/~tomasek
Jabber: but...@jabbim.cz


EA 355:001  DU DU DU DU
EA 355:002  TU TU TU TU
EA 355:003  NU NU NU NU NU NU NU
EA 355:004  NA NA NA NA NA






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-17 Thread Asmus Freytag

On 7/17/2011 2:47 AM, Petr Tomasek wrote:

On Sun, Jul 17, 2011 at 10:14:55AM +0100, Julian Bradfield wrote:


Wouldn't it be more economical to encode a single UNICODE ESCAPE
CHARACTER which forces the following character to be interpreted as a
printable glyph rather than any control function?

I already thought about this but this would probably mean that
algorithms (like the Unicode BiDi Algorithm) would have to be changed.



Change that to: it would mean that ALL algorithms that interpret any of 
the invisible characters would have to change.


The reason is, of course, because these codes would *reinterpret* 
existing characters. You could argue that Variation Selectors do the 
same, but they are carefully constructed so that they can be safely 
ignored. These suggested characters couldn't be safely ignored, because 
doing so would have control/formatting codes in the middle of text where 
none were intended.
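
A minimal sketch of the difference, assuming a process that simply drops characters it does not understand. The escape character does not exist; a private-use code point stands in for it here, and the names `HYPOTHETICAL_ESCAPE` and `strip_ignorable` are made up for illustration.

```python
HYPOTHETICAL_ESCAPE = "\uE000"  # private-use stand-in; no such character is encoded
RLO = "\u202E"                  # RIGHT-TO-LEFT OVERRIDE
VS1 = "\uFE00"                  # VARIATION SELECTOR-1

def strip_ignorable(text, ignorable):
    """Drop every character in `ignorable`, as an unaware process might."""
    return "".join(ch for ch in text if ch not in ignorable)

# Case 1: the escape was meant to turn RLO into a visible picture.
escaped = "abc" + HYPOTHETICAL_ESCAPE + RLO + "def"
after_escape_dropped = strip_ignorable(escaped, {HYPOTHETICAL_ESCAPE})
has_live_rlo = RLO in after_escape_dropped   # the control is now active in the text

# Case 2: a base character followed by a variation selector.
varied = "\u2268" + VS1
after_vs_dropped = strip_ignorable(varied, {VS1})  # base character survives intact

print(has_live_rlo, after_vs_dropped == "\u2268")
```

Dropping the variation selector loses only a glyph preference, while dropping the escape leaves a live directional override in the middle of the text.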


Michael has it right:

On 7/17/2011 2:35 AM, Michael Everson wrote:


... invisible and stateful control characters are more expensive than ordinary 
graphic symbols.


In this case, the expense is so much higher as to rule out such an idea 
from the start.


A./

PS: this doesn't mean that adding graphic symbols is the foregone thing 
to do, only that, if evidence points to the need to address this issue 
in character encoding, then, using graphic symbols is the better way to 
go about it.




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-17 Thread Asmus Freytag

On 7/17/2011 12:19 PM, Philippe Verdy wrote:

2011/7/17 Asmus Freytag asm...@ix.netcom.com:

On 7/17/2011 2:35 AM, Michael Everson wrote:

... invisible and stateful control characters are more expensive than
ordinary graphic symbols.

In this case, the expense is so much higher as to rule out such an idea from
the start.

A./

PS: this doesn't mean that adding graphic symbols is the foregone thing to
do, only that, if evidence points to the need to address this issue in
character encoding, then, using graphic symbols is the better way to go
about it.

Another alternative: instead of encoding separate symbols for each
control, we could as well encode symbols for each character visible in
those symbols.

E.g. to represent the glyph for the RLO control, we could encode three
characters, one for each of R, L, and O, as DOTTED SYMBOL FOR LATIN
CAPITAL LETTER R, DOTTED SYMBOL FOR LATIN CAPITAL LETTER L, DOTTED
SYMBOL FOR LATIN CAPITAL LETTER O. These three symbols would have a
representative glyph as the base letter from which they are derived,
within a dotted rectangle.

Then each of them would contextually adopt one of four glyph forms :
the full rectangle, or the rectangle with the left or right side
removed, or both sides removed. The selection would be performed
selectively.
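
One possible reading of the four contextual glyph forms can be sketched as follows. The function name and the form labels are hypothetical; the message does not say which side is removed in which position, so the assignment below (first letter keeps its left edge, last letter keeps its right edge) is an assumption.

```python
def glyph_forms(run_length):
    """Pick one of four glyph forms for each dotted symbol in a run.

    Assumed rule: a lone symbol gets the full rectangle, the first symbol
    of a run loses its right side, the last loses its left side, and
    symbols in between lose both sides.
    """
    if run_length < 1:
        raise ValueError("run_length must be at least 1")
    if run_length == 1:
        return ["full rectangle"]
    forms = []
    for index in range(run_length):
        if index == 0:
            forms.append("right side removed")
        elif index == run_length - 1:
            forms.append("left side removed")
        else:
            forms.append("both sides removed")
    return forms

rlo_forms = glyph_forms(len("RLO"))
print(rlo_forms)
```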


I'm baffled: what problem is this elaborate scheme trying to solve?

The problem was never in *how* to encode such symbols, but in *whether* 
they should be considered *characters* (and therefore need to be 
supported on the character level of the architecture). That point, 
whether there's a reasonable use case for them as characters, has not 
been settled, so the case for thinking about encoding solutions has not 
been established.


When people write about a line feed character, they use LF or 
linefeed or 000A (or U+000A or 0x0A etc.). They commonly don't use the 
LF symbol character, nor any other unencoded symbol.


I claim, the same is true for ZWJ, RLO, PDF and all the other good 
characters.


Just because Unicode uses dashed box placeholders in the code charts 
hasn't made them the generally accepted, universally understood 
*symbols* for these characters.


This is different from the pictures for control codes because at the 
time, these were widely supported in devices, and users of these devices 
(terminals) were familiar with the convention (staggered small letters) 
and many would recognize common control characters.


So, let's keep a lid on devising ever more arcane and fragile encoding 
and pseudo-encoding options until there's consensus that this issue must 
be addressed on the character level.


A./



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Julian Bradfield
The record mark (IBM GCGID SS95) consists of two horizontal lines 
crossed
by one vertical line.

The segment mark (IBM GCGID SS96) consists of three vertical lines 
crossed
by one horizontal line.

The group mark (IBM GCGID SS97) consists of three horizontal lines 
crossed
by one vertical line.

Ah, thanks - my fault, it didn't occur to me that I should look
between R and S to find a record mark, instead of at the beginning
with the other weird stuff! What a system...

The other two could be proposed as unitary symbols, if anybody really 
needs to
represent them. They are commensurate with a large number of similar symbols
consisting of various numbers of horizontal lines crossed by various numbers
of vertical lines. See, e.g., 29FA, 29FB, 2A68, 2A69, 2AF2, 2AF5.

They could, but wouldn't the same principle that bans new precomposed
accented characters apply? If not, why not?

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Michael Everson
On 16 Jul 2011, at 09:08, Julian Bradfield wrote:

 The other two could be proposed as unitary symbols, if anybody really needs 
 to represent them. They are commensurate with a large number of similar 
 symbols consisting of various numbers of horizontal lines crossed by various 
 numbers of vertical lines. See, e.g., 29FA, 29FB, 2A68, 2A69, 2AF2, 2AF5.
 
 They could, but wouldn't the same principle that bans new precomposed 
 accented characters apply? If not, why not?

I think the ban would apply only if it were suggested that there be a canonical 
decomposition for the characters encoded. 
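
As a rough check of this point, the sketch below uses Python's standard `unicodedata` module to compare a precomposed accented letter, which has a canonical decomposition, with the crossed-line symbols cited above. The variable names are illustrative only.

```python
import unicodedata

# The similar symbols cited in the thread.
CROSSED_LINE_SYMBOLS = [0x29FA, 0x29FB, 0x2A68, 0x2A69, 0x2AF2, 0x2AF5]

# (name, decomposition mapping) for each; an empty string means "none".
report = {
    f"U+{cp:04X}": (unicodedata.name(chr(cp)), unicodedata.decomposition(chr(cp)))
    for cp in CROSSED_LINE_SYMBOLS
}
for code, (name, decomposition) in report.items():
    print(code, name, repr(decomposition))

# A precomposed accented letter, for contrast.
e_acute = unicodedata.decomposition("\u00E9")
print("U+00E9", repr(e_acute))
```

The cited symbols are atomic characters with no decomposition mapping, which is the model a unitary segment mark or group mark would presumably follow.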

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Michael Everson
On 16 Jul 2011, at 04:37, Asmus Freytag wrote:

 It's not a matter of competing views. There's a well-defined process for 
 adding characters to the standard. It starts by documenting usage.

Yes, Asmus, and when one wants to do that, one writes a proposal. We aren't 
writing a proposal here. We're *talking* about things. 

 If you can document such usage, and if it is widespread, and settled enough 
 to warrant standardization to support it, then a proposal based on such 
 documentation is something that should be reviewed according to the 
 established process.
 
 I don't really need to tell you this, as you are quite familiar with how the 
 process works.

No, you really don't. :-/

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Karl Pentzlin
On Friday, 15 July 2011 at 19:48, Asmus Freytag wrote:

AF The document registry should be limited to documents that can and should
AF be reviewed in committee.

WG2 N4127 is, by its content and the reference in its introduction,
an appendix to the German NB requests expressed in WG2 N4085. It provides
detailed information which would have been included in WG2 N4085 itself if
anybody had had the time to prepare it before the Helsinki
meeting. As such, it is placed where WG2 N4085 is.

Also, it can be reviewed. However, as it is not a character encoding
proposal, the items to be reviewed are not the single characters.
The question addressed by the time of writing WG2 N4085 is:
Is it appropriate to encode characters based on the widespreadness of
 a specific set of fonts only - and if yes, which fonts are to be included?

In the meantime, the first part of this question was answered in
Helsinki with Yes.
This reduces the German NB request in WG2 N4085 to the point that other
manufacturers have to be treated equally with Microsoft. To see what
this request really means, it is convenient to see what such fonts of other
OS manufacturers really contain. To show this for one of them (Apple in
this case) is the *ONLY* intent of WG2 N4127.

Before any work on a character proposal, the main question that has to be
answered first (and this question is the only thing subject to a formal review
at this time) is:
Shall the Apple symbols be encoded on the same base as the Wingding/Webding
symbols, i.e. based on the widespreadness of the font only?
Of course, such a decision has to take into account any statement from Apple
themselves, and should not be made before such a statement is there.

Also, before such a decision is taken, it is not appropriate to start
any work on an encoding proposal. (If someone finds specimens in the
list for other proposals which go the usual way, taking their evidence
from plain text use elsewhere [like e.g. chart symbols], this is
completely independent.)

As John Jenkins correctly pointed out, it is too early for a character
encoding proposal.
Thus, I did *deliberately* not start any work which is appropriate for
a character proposal, starting with sorting out the already encoded
characters or the not encodable logos.

Also, it seems that I have to emphasize that I did not intend to start
a new encoding project. The subject of my mail deliberately states
(in context of the Wingding/Webding proposal).

- Karl




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Philippe Verdy
Why couldn't we have dotted square brackets encoded, allowing then
fonts to contain ligatures to generate the symbols, possibly with help
of ligature hinting (using joiner controls), to enclose the characters
in-between ?

E.g. [RLO], where the [, ] would be the encoded dotted square
brackets. The full sequence could be hinted to generate the ligature
as
ENCLOSING OPEN DOTTED SQUARE BRACKET, ZWJ, LATIN CAPITAL LETTER R,
ZWJ, LATIN CAPITAL LETTER L, ZWJ, LATIN CAPITAL LETTER O, ZWJ,
ENCLOSING CLOSE DOTTED SQUARE BRACKET
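
The sequence above can be assembled as in this sketch. The two dotted brackets are not encoded, so private-use code points stand in for them; the names `OPEN_DOTTED`, `CLOSE_DOTTED` and `enclose` are hypothetical.

```python
ZWJ = "\u200D"           # ZERO WIDTH JOINER (a real, encoded character)
OPEN_DOTTED = "\uE000"   # private-use stand-in for the proposed open bracket
CLOSE_DOTTED = "\uE001"  # private-use stand-in for the proposed close bracket

def enclose(abbreviation):
    """Join bracket, letters and bracket with ZWJ as ligature hints."""
    return ZWJ.join([OPEN_DOTTED, *abbreviation, CLOSE_DOTTED])

sequence = enclose("RLO")
print([f"U+{ord(ch):04X}" for ch in sequence])
```

For a three-letter abbreviation this yields nine code points, which is the cost of representing one control symbol under this scheme.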

No need then to encode all controls as pairs. Fonts can still be
produced that will recognize and display the ligatures to generate the
necessary pictures, from regular character.

There are other similar features, for simple enclosures : in
[rectangles], /triangles\ or \triangles/ or triangles] or [triangles
(circles/ovals), [|key|], {hexagons} or =hexagons=... (here I just
used ASCII art to approximate them); with simple or double or thick
enclosures... Example of use for creating cartouches.

Even if the fonts are not recognizing these ligatures themselves, a
text renderer may still automatically generate them using text
decoration of the bracketed text encoded within, or using the enclosed
character already encoded in the UCS.

And there are possible display fallbacks: a font for example can still
map these characters as symbols, overriding outside their left or
right side bearings, so that, when rendered with nothing between them,
their strokes will overstrike natively to make the enclosure (as these
characters would be joining).

No need to specify specific metrics (it should not matter here if they
display a true square area or a rectangular area). Ligatures may still
be produced by the renderer or in the font itself, to make the
enclosed text (e.g. up to 4 characters) fit within it automatically,
if trying to avoid too long rectangular areas or trying to fit a
square.

They should not be encoded as format characters, but as punctuation
marks, similar to brackets and parentheses, with mirroring supported
for RTL scripts, to allow these reasonable display fallbacks
explicitly.

-- Philippe.

2011/7/15 Martin J. Dürst due...@it.aoyama.ac.jp:


 On 2011/07/15 18:51, Michael Everson wrote:

 On 15 Jul 2011, at 09:47, Andrew West wrote:

 If you want a font to display a visible glyph for a format or space
 character then you should just map the glyph to its character in the font,
 as many fonts already do for certain format characters.

 Sometimes I might want to show a dotted box for NBSP and sometimes a real
 NBSP. Or many other characters. Or show a RTL and LTR override character
 without actually overriding the text. You'd need a picture for that, because
 just putting in a glyph for it would also override the text.

 I understand the need. But then what happens is that we need a picture in
 the standard for the character that depicts an RLO (but isn't actually
 one). And then you need another character to show that picture, and so on
 ad infinitum. This doesn't scale.

 If we take the needs of character encoding experts when they write *about*
 characters to decide what to make a character, then we get many too many
 characters encoded. That's similar to the need of typographers when they
 talk about different character shapes. If we had encoded a Roman 'a' and an
 Italic 'a' separately just because the distinction shows up explicitly in
 some texts on typography, that would have been a mistake (the separation is
 now available for IPA, but that's a separate issue).

 Regards,   Martin.






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Asmus Freytag

On 7/16/2011 1:53 AM, Michael Everson wrote:

On 16 Jul 2011, at 04:37, Asmus Freytag wrote:


It's not a matter of competing views. There's a well-defined process for 
adding characters to the standard. It starts by documenting usage.

Yes, Asmus, and when one wants to do that, one writes a proposal. We aren't 
writing a proposal here. We're *talking* about things.


I fully understand the difference between making a formal proposal (that 
can be acted upon) and informally chatting about the possible needs for 
some characters - and the chances that a successful proposal might be 
written.


However, if the only hard information are assertions of personal 
preference such as Sometimes I might want to show a dotted box for NBSP 
and sometimes a real NBSP, it is a bit much to then conclude What I 
see is a certain unreasonability reflecting a certain conservatism 
because there isn't an immediate, public enthusiasm for the idea.


A./

PS: My counter-assertion, that much of the technical literature uses the 
abbreviations in preference to dashed boxes, has been pointedly ignored 
by you. UAX#9, bidi, and UAX#14, linebreak, extensively discuss 
invisible characters - neither of these documents needs symbol 
characters, in fact, they would probably reduce clarity. This practice 
goes back over 15 years, so it can be seen as settled. (I further 
assert that I expect examples could be found outside the standard as well).


PPS: If anybody provides evidence (suitably documented for the level 
of discussion) of widespread use of symbolic depictions for certain 
invisible characters, I'd be quite open to review it and to base my 
future position on this new basis.





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-16 Thread Asmus Freytag

Karl,

I've published similar surveys in the past, where the object was to 
get feedback on the desirability of further action. I stick by my 
recommendation in favor of keeping raw data out of the document 
registry and of doing the committee a favor by adding value in form of 
a sifting or analysis of such data.


Previewing the data is not the same as making a character encoding 
proposal, and there aren't any procedural rules for non-proposals, so 
there's nothing that prevents doing that. I have always provided some 
level of analysis, and I have not always chosen to register all such 
documents - for the reasons I gave you earlier.


The original rationale for encoding certain symbols had been their 
widespread use. The word widespread is key here. At the time that 
Unicode was first created, symbol sets associated with printers defined 
widespread use. After these sets were backed into the 2600 and 2700 
blocks, the phenomenal rise of Windows made the W/W-Dings sets even more 
widespread.


As you and WG2 evaluate additional such widely disseminated fonts, you 
will need to come up with your own criteria of what constitutes 
widespread. Those criteria should be applied both to the fonts 
considered as potential source of symbols, as well as to each category 
of symbols within these fonts.


I'll be interested in looking at a list of Apple symbols, once it's 
categorized a bit better by symbol function and / or gives a better idea 
of which (and how many) symbols extend existing sets (e.g. by adding 
directional variants) and which (and how many) might possibly be only 
variants of existing symbols - and similar information like that. 
(Unlike a full character encoding proposal I would not expect definite 
answers to these, but some tentative / approximate information would be 
nice).


A./



Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Karl Pentzlin
In WG2 N4085 "Further proposed additions to ISO/IEC 10646 and comments to
other proposals" (2011‐05‐25), the German NB had requested re WG2 N4022
"Proposal to add Wingdings and Webdings Symbols", besides other points:
  Also, in doing this work, other fonts widespread on the computers of
  leading manufacturers (e.g. Apple) shall be included, thus avoiding the
  impression that Unicode or SC2/WG2 favor a single manufacturer.
In supporting this, there is now a quick survey of symbol fonts regularly
delivered with computers manufactured by Apple:
  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4127.pdf

- Karl




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Andrew West
On 15 July 2011 09:08, Karl Pentzlin karl-pentz...@acssoft.de wrote:

 In supporting this, there is now a quick survey of symbol fonts regularly 
 delivered with computers
 manufactured by Apple:
  http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4127.pdf

I am agnostic on all the symbols, but would say a definite No to
encoding graphic clones of all the format (gc=Cf), space (gc=Zs) and
separator (gc=Zl|Zp) characters shown on pages 3, 8 and 9 of that
document.  It is not necessary, and would set a bad precedent for
always encoding all format and space characters in duplicate, once as
a visible character and once as an invisible character.  If you want a
font to display a visible glyph for a format or space character then
you should just map the glyph to its character in the font, as many
fonts already do for certain format characters.
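
For readers who want to check the general categories named above (gc=Cf,
gc=Zs, gc=Zl, gc=Zp), here is a minimal sketch using Python's standard
unicodedata module. The four sample characters and the helper name
general_categories are chosen purely for illustration; they are not taken
from N4127.

```python
import unicodedata

# Illustrative sample of invisible characters (an assumption of this sketch,
# not a list taken from the surveyed document).
SAMPLES = {
    "ZERO WIDTH JOINER": "\u200d",
    "NO-BREAK SPACE": "\u00a0",
    "LINE SEPARATOR": "\u2028",
    "PARAGRAPH SEPARATOR": "\u2029",
}

def general_categories(samples):
    """Return {label: General_Category} for each sample character."""
    return {label: unicodedata.category(ch) for label, ch in samples.items()}

if __name__ == "__main__":
    for label, gc in general_categories(SAMPLES).items():
        print(f"{label}: gc={gc}")
```

The point of the sketch is only that these properties belong to the
characters themselves, whatever glyph a font maps to them.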

Andrew




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 1:08 AM, Karl Pentzlin wrote:

In WG2 N4085 "Further proposed additions to ISO/IEC 10646 and comments to
other proposals" (2011‐05‐25), the German NB had requested re WG2 N4022
"Proposal to add Wingdings and Webdings Symbols", besides other points:
   Also, in doing this work, other fonts widespread on the computers of
   leading manufacturers (e.g. Apple) shall be included, thus avoiding the
   impression that Unicode or SC2/WG2 favor a single manufacturer.
In supporting this, there is now a quick survey of symbol fonts regularly
delivered with computers manufactured by Apple:
   http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4127.pdf

- Karl




 Karl,

I believe that publishing this document in its current form is more of 
a disservice than a service to the committees or the larger community (a 
few individuals excepted).


There appear to be a large number of symbols for which a Unicode 
equivalent can be identified with great certainty - and beyond that 
there seem to be characters for which such an assignment is perhaps more 
tentative, because of minor glyph differences, but still plausible.


I believe that only when these two passes have been carried out, will 
the document be of any reasonable use to wider audiences - as it is, 
everybody has to sift through all the characters, even the ones that are 
uninteresting (because their mappings are not in question, despite lack 
of glyph names).


Using Unibook, you can use the syntactic conventions of canonical and 
compatibility decomposition listings to show mappings of which you are 
certain or which look OK, but need verification. Entirely questionable 
mappings could use the comment convention.


In the input file used by Unibook, a TAB=SPACE at the start of a 
line, followed by a code point can be used to show an identically 
equal sign with the mapping in the output. A TAB%SPACE would show 
the approximately equal sign, and a TAB*SPACE would yield a bullet 
(as for a comment).


Finally, you could use yellow (and/or blue) highlighting (or both) to 
highlight characters needing particular levels of review.


Once you have carried the analysis to that stage, the document would 
indeed be of interest for wider reviewers. It would still not be a 
proposal, but you would have done the necessary legwork in *analyzing* 
(or tentatively analyzing) the repertoire.


A./


Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Karl Pentzlin
Am Freitag, 15. Juli 2011 um 10:58 schrieb Asmus Freytag:

AF ... There appear to be a large number of symbols for which a
AF Unicode equivalent can be identified with great certainty -
AF and beyond that there seem to be characters for which such
AF an assignment is perhaps more tentative, because of minor
AF glyph differences, but still plausible. ...
AF ... Once you have carried the analysis to that stage ...

My intent was to present the data to people who want to continue the
work in this way, and to encourage the discussion of the Apple symbols
within the Wingding/Webding discussion in line with the German NB request
cited in my original mail.
Such analysis as Asmus requested, done with the appropriate scrutiny
and thus requiring a considerable amount of time, in fact is the next
logical step in this work. This, however, does not necessarily have to be
done by me.

- Karl




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 09:47, Andrew West wrote:

 I am agnostic on all the symbols, but would say a definite No to encoding 
 graphic clones of all the format (gc=Cf), space (gc=Zs) and separator 
 (gc=Zl|Zp) characters shown on pages 3, 8 and 9 of that
 document.  It is not necessary, and would set a bad precedent for always 
 encoding all format and space characters in duplicate, once as a visible 
 character and once as an invisible character.

I disagree. The standard already has 
http://www.unicode.org/charts/PDF/U2400.pdf which is little different.

 If you want a font to display a visible glyph for a format or space character 
 then you should just map the glyph to its character in the font, as many 
 fonts already do for certain format characters.

Sometimes I might want to show a dotted box for NBSP and sometimes a real NBSP. 
Or many other characters. Or show a RTL and LTR override character without 
actually overriding the text. You'd need a picture for that, because just 
putting in a glyph for it would also override the text. 

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Alan Wood
I have web pages with lists of Unicode equivalents for Wingdings and
Wingdings 2 characters, updated for Unicode 6.  These equivalents were
chosen by me, and they are not in any way official Unicode mappings.

http://www.alanwood.net/demos/wingdings.html
http://www.alanwood.net/demos/wingdings-2.html

I do not have pages for Wingdings 3 or Webdings because I could find only a few 
Unicode equivalents.

Alan Wood
http://www.alanwood.net (Unicode, special characters, pesticide names) 



- Original Message 
 From: Karl Pentzlin karl-pentz...@acssoft.de
 To: Asmus Freytag asm...@ix.netcom.com
 Cc: unicode@unicode.org
 Sent: Fri, 15 July, 2011 10:23:57
 Subject: Re: Quick survey of Apple symbol fonts (in context of the 
Wingding/Webding proposal)
 
 Am Freitag, 15. Juli 2011 um 10:58 schrieb Asmus Freytag:
 
 AF ... There appear to be a large number of symbols for which a
 AF Unicode equivalent can be identified with great certainty -
 AF and beyond that there seem to be characters for which such
 AF an assignment is perhaps more tentative, because of minor
 AF glyph differences, but still plausible. ...
 AF ... Once you have carried the analysis to that stage ...
 
 My intent was to present the data to people who want to continue the
 work in this way, and to encourage the discussion of the Apple symbols
 within the Wingding/Webding discussion in line with the German NB request
 cited in my original mail.
 Such analysis as Asmus requested, done with the appropriate scrutiny
 and thus requiring a considerable amount of time, in fact is the next
 logical step in this work. This, however, does not necessarily have to be
 done by me.
 
 - Karl





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Martin J. Dürst



On 2011/07/15 18:51, Michael Everson wrote:

On 15 Jul 2011, at 09:47, Andrew West wrote:



If you want a font to display a visible glyph for a format or space character 
then you should just map the glyph to its character in the font, as many fonts 
already do for certain format characters.


Sometimes I might want to show a dotted box for NBSP and sometimes a real NBSP. 
Or many other characters. Or show a RTL and LTR override character without 
actually overriding the text. You'd need a picture for that, because just 
putting in a glyph for it would also override the text.


I understand the need. But then what happens is that we need a picture 
in the standard for the character that depicts an RLO (but isn't 
actually one). And then you need another character to show that 
picture, and so on ad infinitum. This doesn't scale.


If we take the needs of character encoding experts when they write 
*about* characters to decide what to make a character, then we get far 
too many characters encoded. That's similar to the need of typographers 
when they talk about different character shapes. If we had encoded a 
Roman 'a' and an Italic 'a' separately just because the distinction 
shows up explicitly in some texts on typography, that would have been a 
mistake (the separation is now available for IPA, but that's a separate 
issue).


Regards,   Martin.



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 13:36, Martin J. Dürst wrote:

 If we take the needs of character encoding experts when they write *about* 
 characters to decide what to make a character, then we get far too many 
 characters encoded. 


I think that having encoded symbols for control characters (which we already 
have for some of them) is no bad thing, and the argument about too many 
characters is not compelling, as there are only some dozens of these 
characters encoded, not thousands and thousands or anything. 

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Andrew West
On 15 July 2011 13:40, Michael Everson ever...@evertype.com wrote:

 I think that having encoded symbols for control characters (which we
 already have for some of them) is no bad thing, and the argument
 about too many characters is not compelling, as there are only some
 dozens of these characters encoded, not thousands and thousands or
 anything.

I oppose encoding graphic clones of non-graphic characters on
principle, not because of how many there are.  Nevertheless, there are
potentially a large number of characters for which people may wish to
have visible clones encoded: the 97 tag characters are format
characters, and may not be displayed under some systems (e.g. Windows
7); and although the 256 variation selector characters are non-spacing
marks rather than format characters, some systems won't display them
even if the font has visible glyphs mapped to the characters, so there
is an argument to encode visible clones of tag and vs characters so
that people can discuss their use in plain text.  I am not convinced
by such arguments.
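
The counts quoted above can be checked with a short sketch. The code point
ranges below are written out from the Tags and Variation Selectors blocks
as I understand them; the names TAG_RANGES, VS_RANGES and expand are
invented for this illustration.

```python
import unicodedata

# Assumed ranges: U+E0001 LANGUAGE TAG plus U+E0020..U+E007F for the tag
# characters; U+FE00..U+FE0F plus U+E0100..U+E01EF for variation selectors.
TAG_RANGES = [(0xE0001, 0xE0001), (0xE0020, 0xE007F)]
VS_RANGES = [(0xFE00, 0xFE0F), (0xE0100, 0xE01EF)]

def expand(ranges):
    """Expand inclusive (low, high) code point ranges into characters."""
    return [chr(cp) for low, high in ranges for cp in range(low, high + 1)]

tags = expand(TAG_RANGES)
selectors = expand(VS_RANGES)

# Collect the distinct general categories found in each set.
tag_categories = {unicodedata.category(ch) for ch in tags}
vs_categories = {unicodedata.category(ch) for ch in selectors}

if __name__ == "__main__":
    print(len(tags), sorted(tag_categories))
    print(len(selectors), sorted(vs_categories))
```

Under those assumptions the tag characters come out as format characters
(Cf) and the variation selectors as non-spacing marks (Mn), which is the
distinction drawn in the paragraph above.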

Andrew



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread John H. Jenkins
I'll try to arrange for an official corporate response to this document for the 
next UTC, but informally, I note that the charts include a number of variants 
of the Apple corporate logo, which Apple wants *not* to be encoded in any form. 
 

Beyond this—and speaking purely for myself and not for Apple (and unfortunately 
aware that some people don't understand or will not respect the distinction)—I 
think that this whole discussion is starting up a little too quickly.  The mere 
fact that they're in fonts some corporation ships is not evidence that they are 
appropriate even for consideration, let alone encoding, particularly in the 
absence of clones or other widely-distributed fonts which contain these glyphs. 
 I think it's fair to say that if Apple felt that these glyphs were needed in 
general text interchange, Apple would have proposed them.  

In any event, I would personally prefer that the whole discussion be dropped 
until Apple has had a chance to at least look over the document and respond.  
To do otherwise strikes me as at the least discourteous and at best premature.  

=
井作恆
John H. Jenkins







RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Doug Ewell
Andrew West andrewcwest at gmail dot com replied to Michael Everson:

 I think that having encoded symbols for control characters (which we
 already have for some of them) is no bad thing, and the argument
 about too many characters is not compelling, as there are only some
 dozens of these characters encoded, not thousands and thousands or
 anything.
 
 I oppose encoding graphic clones of non-graphic characters on
 principle, not because of how many there are.

I agree with Michael about a lot of things, and this isn't going to be
one of them.  The main arguments I am seeing in favor of encoding are:

1. Graphic symbols for control characters are needed so writers can
write about the control characters themselves using plain text.

I don't think there's any end to where this can go.  As Martin said,
eventually you'd need a meta-meta-character to talk about the
meta-character, and then it's not just a size problem, but an
infinite-looping problem.

2. The precedent was established by the U+2400 block.

I thought those were compatibility characters, in the original sense:
encoded because they were part of some pre-existing standard.  That's
not necessarily a precedent in itself to encode more characters that are
similar in nature.

3. There aren't that many of them.

We regularly dismiss arguments of the form But there's lots of room for
these in Unicode when someone proposes to encode something that
shouldn't be there.  I don't see this as any different.

Michael is responsible for adding many thousands of characters to
Unicode, so it's awkward for me to be debating character-encoding
principles with him, but there we are.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Karl Pentzlin
Am Freitag, 15. Juli 2011 um 15:08 schrieb Andrew West:

AW I oppose encoding graphic clones of non-graphic characters ...

I am just waiting for the killer argument against the encoding of
chart symbols.

They are not clones, but characters in themselves, naming different entities
(invisible characters in this case). Thus, the chart symbol
characters are clearly distinct from the invisible characters they
name.
U+240A SYMBOL FOR LINE FEED is not a line feed character, and its line
breaking behavior is no different from that of any other symbol character.
The chart symbols are perfect characters in the way that they have concise
semantics, a well defined glyph spectrum, and appear in plain text
(e.g. discussions of the invisible characters they name).
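
To make the distinction concrete, here is a minimal sketch in Python,
assuming the layout of the existing Control Pictures block in which the
picture for a C0 control sits at U+2400 plus the control's code point, and
U+2421 is the picture for DELETE. The function names control_picture and
visualize are hypothetical.

```python
import unicodedata

def control_picture(ch):
    """Map a C0 control or DELETE to its control picture; pass others through."""
    cp = ord(ch)
    if cp < 0x20:            # C0 controls -> U+2400..U+241F
        return chr(0x2400 + cp)
    if cp == 0x7F:           # DELETE -> U+2421 SYMBOL FOR DELETE
        return "\u2421"
    return ch

def visualize(text):
    """Replace controls with visible symbols so they can be discussed inline."""
    return "".join(control_picture(ch) for ch in text)

if __name__ == "__main__":
    shown = visualize("a\nb")
    print(shown)
    print(unicodedata.category("\n"), unicodedata.category("\u240a"))
```

The line feed itself has general category Cc, while the symbol that names
it is an ordinary symbol (So), so the result of visualize contains no line
break at all.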

Am Freitag, 15. Juli 2011 um 14:36 schrieb Martin J. Dürst:

MJD I understand the need. But then what happens is that we need a picture
MJD in the standard for the character that depicts an RLO (but isn't 
MJD actually one). And then you need another character to show that 
MJD picture, and so on ad infinitum.

No, this is a sophism and not a real-world argument.
The chart symbols are visible characters like e.g. any Latin letters.
Nobody until now has proposed any character symbolizing a clearly visible
and identifiable character, such as Symbol for Latin Capital A.
If in fact somebody proposes such, this would be a completely different topic,
and the arguments to do such (if in fact any are found) will also be
completely different.

- Karl






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 9:03 AM, Doug Ewell wrote:

Andrew West andrewcwest at gmail dot com replied to Michael Everson:


I think that having encoded symbols for control characters (which we
already have for some of them) is no bad thing, and the argument
about too many characters is not compelling, as there are only some
dozens of these characters encoded, not thousands and thousands or
anything.

I oppose encoding graphic clones of non-graphic characters on
principle, not because of how many there are.

I agree with Michael about a lot of things, and this isn't going to be
one of them.  The main arguments I am seeing in favor of encoding are:

1. Graphic symbols for control characters are needed so writers can
write about the control characters themselves using plain text.


When users outside the character encoding community start reporting such 
a need in great numbers, it would indicate that there might (might!) be 
a real requirement. The character coding community has had decades to 
figure out ways to manage without this - and the current occasion 
(review of Apple's symbol fonts) is not a suitable context to suddenly 
drag in something that could have been addressed anytime for the last 20 
years, if it had been really urgent.


I don't think there's any end to where this can go.  As Martin said,
eventually you'd need a meta-meta-character to talk about the
meta-character, and then it's not just a size problem, but an
infinite-looping problem.


What real users need is to show hidden characters. That need can be 
served with different mechanisms. There seems to not be a consensus 
though, on what the preferred approach should be and implementations 
disagree. That kind of issue needs to be addressed differently, 
involving the cooperation of major implementers.




2. The precedent was established by the U+2400 block.

I thought those were compatibility characters, in the original sense:
encoded because they were part of some pre-existing standard.  That's
not necessarily a precedent in itself to encode more characters that are
similar in nature.


Doug is entirely correct. These are a precedent only if an extended set 
of other such symbols was found in use in some de-facto character set. 
In that special case, an argument for compatibility with *that* 
character set could be made. And for that to be successful, it would 
have to be shown that the character set is widely used and compatibility 
to it is of critical importance.


In addition, I claim, experience has shown that the control code 
image characters are not widely used. That means, any hope that the 
early encoders (and these go back to 1.0) may have had that those 
symbols are useful characters in their own right, simply has not been 
borne out.




3. There aren't that many of them.

We regularly dismiss arguments of the form But there's lots of room for
these in Unicode when someone proposes to encode something that
shouldn't be there.  I don't see this as any different.


Correct.

The only time this argument is useful is in deciding between encoding 
the same character directly or as character sequence. Using character 
sequences solely because of encoding space reasons, as opposed to the 
reason that the elements are characters in their own right, has become 
irrelevant due to the introduction of 16 more planes.


The same is true for excessive unification of certain symbols or 
punctuation characters: saving code space is not a valid argument here - 
so any decision needs to be based on other facts.


Michael is responsible for adding many thousands of characters to
Unicode, so it's awkward for me to be debating character-encoding
principles with him, but there we are.





Well, in this business, no-one's infallible.

A./



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 17:03, Doug Ewell wrote:

 1. Graphic symbols for control characters are needed so writers can write 
 about the control characters themselves using plain text.

This does not seem so unreasonable. The RTL and LTR overrides *function* on the 
text when inserted into text. So you can't use those with glyphs in a font to 
represent for example the UCS dotted-boxes-with-letters, because they are 
control characters and will affect the text. 
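
A small sketch of why the overrides cannot simply be given a glyph: their
bidirectional class travels with the character, whatever the font draws.
This uses Python's standard unicodedata module; the helper name bidi_class
is made up for the example.

```python
import unicodedata

def bidi_class(code_point):
    """Return the Bidi_Class of a code point as reported by unicodedata."""
    return unicodedata.bidirectional(chr(code_point))

if __name__ == "__main__":
    for cp in (0x202D, 0x202E, 0x2420):
        name = unicodedata.name(chr(cp))
        print(f"U+{cp:04X} {name}: {bidi_class(cp)}")
```

U+202D and U+202E report the classes LRO and RLO, so a bidi-aware renderer
will act on them, whereas an existing control picture such as U+2420 is an
ordinary neutral (ON) and leaves the surrounding text alone.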

 I don't think there's any end to where this can go.  As Martin said, 
 eventually you'd need a meta-meta-character to talk about the meta-character, 
 and then it's not just a size problem, but an infinite-looping problem.

I do not follow the logic of this assertion. SPACE and SYMBOL FOR SPACE exist. 
No infinite recursion is needed. 

Michael Everson * http://www.evertype.com/





RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote:

 1. Graphic symbols for control characters are needed so writers can write 
 about the control characters themselves using plain text.

 This does not seem so unreasonable. The RTL and LTR overrides *function* on 
 the text when inserted into text. So you can't use those with glyphs in a 
 font to represent for example the UCS dotted-boxes-with-letters, because they 
 are control characters and will affect the text.

Do people really need assigned characters (not just glyphs) to represent
these things, instead of just talking about them?  I see text all the
time that refers to characters using the name of the character, or its
U+ value, or some informal name or descriptive phrase like the RTL and
LTR overrides.  How common is the need to have a discrete character to
talk about another character?

 I don't think there's any end to where this can go.  As Martin said, 
 eventually you'd need a meta-meta-character to talk about the 
 meta-character, and then it's not just a size problem, but an 
 infinite-looping problem.

 I do not follow the logic of this assertion. SPACE and SYMBOL FOR SPACE 
 exist. No infinite recursion is needed. 

How do I talk about U+2420 SYMBOL FOR SPACE in plain text?  Other than
the way I just did, I mean.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell






RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Erkki I Kolehmainen
I'd assume that you could talk about it by referring to its name and/or code
point. A visible symbol for it would be new and would not be recognizable as
such.

Erkki 

-Original Message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org]
On Behalf Of Michael Everson
Sent: 15 July 2011 20:26
To: unicode Unicode Discussion
Subject: Re: Quick survey of Apple symbol fonts (in context of the
Wingding/Webding proposal)

What I see is a certain unreasonability reflecting a certain conservatism.
Text about the Standard is important, and should be representable in an
interchangeable way. Here { } is a Right to left override character.
character. I want to talk about it in a way that is visible. Oops. I can't
do it interchangeably. 

Michael Everson * http://www.evertype.com/







Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 18:37, Doug Ewell wrote:

 Do people really need assigned characters (not just glyphs) to represent 
 these things, instead of just talking about them?  I see text all the time 
 that refers to characters using the name of the character, or its U+ value, 
 or some informal name or descriptive phrase like the RTL and LTR overrides. 
  How common is the need to have a discrete character to talk about another 
 character?

I've been trying to represent a Duployan keyboard layout description and yes, I 
do need glyphs for some of these characters. 

 I do not follow the logic of this assertion. SPACE and SYMBOL FOR SPACE 
 exist. No infinite recursion is needed. 
 
 How do I talk about U+2420 SYMBOL FOR SPACE in plain text?  Other than the 
 way I just did, I mean.

How do I talk about U+0044 LATIN CAPITAL LETTER D in plain text? I use the 
graphic character D. It's not an invisible character. 

To talk about U+2420, you use the graphic symbol U+2420 ␠. That is however not 
an answer to my complaint that encoding a SYMBOL FOR something otherwise 
invisible implies an infinite recursion of other characters. 

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 2:23 AM, Karl Pentzlin wrote:

Am Freitag, 15. Juli 2011 um 10:58 schrieb Asmus Freytag:

AF  ... There appear to be a large number of symbols for which a
AF  Unicode equivalent can be identified with great certainty -
AF  and beyond that there seem to be characters for which such
AF  an assignment is perhaps more tentative, because of minor
AF  glyph differences, but still plausible. ...
AF  ... Once you have carried the analysis to that stage ...

My intent was to present the data to people who want to continue the
work in this way, and to encourage the discussion of the Apple symbols
within the Wingding/Webding discussion in line with the German NB request
cited in my original mail.


You would serve this goal much better if, instead of rushing to simply 
add raw data to the document pile, you had narrowed the issue down by 
limiting this further to characters that need real scrutiny.



Such analysis as Asmus requested, done with the appropriate scrutiny
and thus requiring a considerable amount of time, in fact is the next
logical step in this work. This, however, does not necessarily have to be
done by me.


So, essentially you are dumping it on everyone.

At this early stage (raw list) a better approach would have been to look 
for collaborators first and then collectively publish a document that 
provides useful analysis.


The document registry should be limited to documents that can and should 
be reviewed in committee. Raw data collections with little or no value 
added do not belong there, in my view.


A./

PS: I feel strongly enough about this that I will not review the 
document in its current stage.




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 18:50, Erkki I Kolehmainen wrote:

 I'd assume that you could talk about it by referring to its name and/or code 
 point. A visible symbol for it would be new and would not be recognizable as 
 such.

In the code charts it has a glyph. Without a SYMBOL FOR character for this 
control character, it's not possible to represent the glyph of that character 
in the code charts. 

You can't represent an invisible character visibly. Sure. But the glyph in the 
code chart is something that can be talked about. I might do it in a PUA font. 
But that can't be interchanged. So if I want a web page for instance to 
describe how invisible characters affect Devanagari character shaping, I can't 
do it with a graphic character. Even though the text of the standard may do so 
using that graphic character inline in text. 

Michael Everson * http://www.evertype.com/





RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Doug Ewell
 What I see is a certain unreasonability reflecting a certain conservatism. 
 Text about the Standard is important, and should be representable in an 
 interchangeable way. Here { } is a Right to left override character. 
 character. I want to talk about it in a way that is visible. Oops. I can't do 
 it interchangeably. 

[RTL] or {RTL} or Right-to-Left Override or U+202E might all
work.
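
As a sketch of that approach, the following Python function rewrites any
format character (gc=Cf) in a string as a bracketed U+ value plus name, so
the text can be interchanged without the character taking effect. The
function name spell_out and the bracket convention are assumptions made
for this example, not an established notation.

```python
import unicodedata

def spell_out(text):
    """Replace each format character (gc=Cf) with '[U+XXXX NAME]'."""
    pieces = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "UNNAMED")
            pieces.append(f"[U+{ord(ch):04X} {name}]")
        else:
            pieces.append(ch)
    return "".join(pieces)

if __name__ == "__main__":
    # The RLO is written as an escape so this source file stays unaffected.
    print(spell_out("abc\u202edef"))
```

The output is plain ASCII for the override itself, which is the sort of
notation text about the standard already tends to use.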

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 10:26 AM, Michael Everson wrote:

What I see is a certain unreasonability reflecting a certain conservatism. Text 
about the Standard is important, and should be representable in an 
interchangeable way. Here { } is a Right to left override character. character. 
I want to talk about it in a way that is visible. Oops. I can't do it 
interchangeably.



Michael,

let me give you an example:

The Unicode Bidi Algorithm has extensive need to discuss this character, 
because it provides specification for its use and support by 
implementations. If you look at that document (UAX#9), you find this 
character discussed widely (and you can save that document to plain text 
without losing the sense of that discussion).


This example illustrates that we need to distinguish between the 
requirement to *discuss* characters and their use, and the perceived 
need to use *symbolic images* (glyphs) to do so. As the example of UAX#9 
shows, one does not follow from the other.


If there had been a universal requirement to use glyphs for this 
purpose, this requirement would have surfaced and could have been 
addressed anytime during the last 20 years. Another indication that this 
is not a universal requirement can be deduced from the fact that these 
glyphs do not show up in more font collections.


Several symbols for space or blank were added however, because 
widespread use in documentation was attested. The same avenue should in 
principle be open for other such symbols (and here I disagree with 
Andrew and Martin): If widespread use of glyphic symbols (as opposed to 
abbreviations and names) can be documented for some characters, then 
those characters, and those characters only should have whatever symbol 
is used to represent them, added to the standard. Also, like the example 
for SPACE, if there are different symbols, any of them that is 
widespread should be added - to unify symbols of different design based 
on the underlying concept that they represent would constitute improper 
unification, in my view.


So, there, I'm not at all unreasonable - I just reasonably ask that the 
normal procedures for adding characters are to be followed.


In this particular case, the Apple glyphs include glyphs for format 
characters that Unicode considers deprecated. Providing characters to 
encode glyphs for them would just be a waste. Further, while the glyphs 
shown match those from the Unicode code charts, they are not necessarily 
the shapes that are displayed when systems want to show these invisible 
characters - so users and documentation writers may need an entirely 
different set of glyphs. Finally, other vendors seem to not have 
endorsed these glyphs by including them in their font collections - much 
unlike the emoji, where multiple vendors had a large overlap of symbols, 
and with large overlap in glyphic representation as well.


Therefore, I strongly urge the committees to separate out these meta 
characters from the ongoing *symbol collection* review.
They can be taken up based on evidence of actual use (and showing the 
actual glyphs in such use) at a later occasion.


A./


Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
On 15 Jul 2011, at 18:48, Asmus Freytag wrote:

 You would serve this goal much better if, instead of rushing to simply add 
 raw data to the document pile, you had narrowed the issue down by limiting 
 this further to characters that need real scrutiny.

Your point was taken the first time. No need to bash Karl. 

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Mark E. Shoulson

On 07/15/2011 01:37 PM, Doug Ewell wrote:

How do I talk about U+2420 SYMBOL FOR SPACE in plain text?  Other than
the way I just did, I mean.


This infinite recursion argument doesn't hold up.  One can see the 
need for a graphical representation (which does not mess with layout) of 
characters that are not graphically represented and/or which mess with 
layout.  If I need to talk about RLO I need to mention it and not use 
it; I need something I can see.  But if I need to talk about a LATIN 
LETTER A, I can simply use the character as-is, because it is graphical 
and doesn't mess up layout.


Karl Pentzlin said this already, and correctly: if you're worried about 
infinite regress here, then you should worry about it for EVERY 
character out there.  After all, if we need a special symbol for SYMBOL 
FOR RLO so we can talk about it, don't we also need a special symbol 
for LATIN CAPITAL LETTER A?  And then of course we'll also need a 
special symbol for SYMBOL FOR LATIN CAPITAL LETTER A and so ad infinitum.


Other arguments for or against there might be; infinite regress is a 
non-issue here.


~mark



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Michael Everson
Look at Figures 8-1 through 8-4 in the Unicode Standard 5.0.

We see graphic characters shown, one representing space and two representing 
joiners. This is plain text. This is something one might wish to put on a web 
page or in an e-mail. One of the three characters is encoded. 

Talking about the standard is *important*. Since the use of graphic characters 
in plain text is often cited as a criterion for encoding, and since some 
non-graphic characters in the standard have a SYMBOL FOR graphic 
representation, I do not, at all, think that it is unwise or capricious to 
suggest that other non-graphic characters in the standard also have a SYMBOL 
FOR graphic character which can be used to represent them. 

In fact, I think it would be advantageous to users of the standard and to 
promulgators of the standard for such symbols to be encoded. 

However, I agree with Asmus that in the context of the Wingdings-type symbols 
these characters should not be considered. They should be considered as a whole 
on their own. 

Michael Everson * http://www.evertype.com/





Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread John W Kennedy
On Jul 15, 2011, at 2:29 PM, Mark E. Shoulson wrote:
 On 07/15/2011 01:37 PM, Doug Ewell wrote:
 How do I talk about U+2420 SYMBOL FOR SPACE in plain text?  Other than
 the way I just did, I mean.
 
 This infinite recursion argument doesn't hold up.

Those of us old enough to recall IBM's old 6-bit BCDIC code (a retronym -- it 
was known as BCD in its own day) will remember the overstricken b/ character 
used to represent the Substitute Blank character, the overstricken =| character 
for Record Mark, and others. (Annoyingly enough, these and some other BCDIC 
graphics are not covered by Unicode, which must be a problem for historians.) I 
cannot bring to mind any infinite regress happening at the time.

(The Substitute Blank, for the curious, was used when recording character data 
on 7-track tape with even parity. The standard Blank character, all zeroes, 
could not be safely recorded, so it was translated to Substitute Blank, and 
Substitute Blank was translated back to Blank on reading. Binary 7-track tape 
was recorded with odd parity to avoid this problem, but was less reliable than 
even parity.)
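
The parity problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 7-bit frame layout (parity in the high bit), the helper names, and the code value chosen for SUBSTITUTE_BLANK are hypothetical, and the sketch assumes that a frame with no bits set cannot be told apart from tape on which nothing was recorded.

```python
BLANK = 0o00             # all zeroes, as described above
SUBSTITUTE_BLANK = 0o20  # assumed value, for illustration only


def parity_bit(code: int, even: bool = True) -> int:
    """Parity bit that makes the count of 1 bits in the frame even (or odd)."""
    ones = bin(code & 0o77).count("1")
    return ones % 2 if even else (ones + 1) % 2


def tape_frame(code: int, even: bool = True) -> int:
    """Build a 7-bit frame: parity in the high bit, six data bits below it."""
    return (parity_bit(code, even) << 6) | (code & 0o77)


def write_even(code: int) -> int:
    """Translate Blank to Substitute Blank before writing with even parity."""
    return tape_frame(SUBSTITUTE_BLANK if code == BLANK else code, even=True)


def read_even(frame: int) -> int:
    """Drop the parity bit and translate Substitute Blank back to Blank."""
    code = frame & 0o77
    return BLANK if code == SUBSTITUTE_BLANK else code


# Untranslated Blank with even parity is a frame with no bits set at all.
print(tape_frame(BLANK, even=True))
# Odd parity always sets at least one bit, so binary tape has no such problem.
print(tape_frame(BLANK, even=False) != 0)
# With translation, Blank survives a write/read round trip.
print(read_even(write_even(BLANK)) == BLANK)
```

Note that in this sketch a genuine SUBSTITUTE_BLANK in the data would also read back as BLANK; that follows from the translation as described, not from any detail of the real hardware.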

-- 
John W Kennedy
Though a Rothschild you may be
In your own capacity,
As a Company you've come to utter sorrow--
But the Liquidators say,
'Never mind--you needn't pay,'
So you start another company to-morrow!
  -- Sir William S. Gilbert.  Utopia Limited






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Leo Broukhis
On Fri, Jul 15, 2011 at 12:04 PM, John W Kennedy jwke...@attglobal.net wrote:
 Those of us old enough to recall IBM's old 6-bit BCDIC code (a retronym -- it 
 was known as BCD in its own day) will remember the overstricken b/ 
 character used to represent the Substitute Blank character, the overstricken 
 =| character for Record Mark, and others. (Annoyingly enough, these and some 
 other BCDIC graphics are not covered by Unicode, which must be a problem for 
 historians.)

There's U+2422 BLANK SYMBOL ␢ and U+241E SYMBOL FOR RECORD SEPARATOR ␞
Are they not enough?

Leo




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Petr Tomasek
On Fri, Jul 15, 2011 at 09:03:38AM -0700, Doug Ewell wrote:
 Andrew West andrewcwest at gmail dot com replied to Michael Everson:
 
  I think that having encoded symbols for control characters (which we
  already have for some of them) is no bad thing, and the argument
  about too many characters is not compelling, as there are only some
  dozens of these characters encoded, not thousands and thousands or
  anything.
  
  I oppose encoding graphic clones of non-graphic characters on
  principle, not because of how many there are.
 
 I agree with Michael about a lot of things, and this isn't going to be
 one of them.  The main arguments I am seeing in favor of encoding are:
 
 1. Graphic symbols for control characters are needed so writers can
 write about the control characters themselves using plain text.
 
 I don't think there's any end to where this can go.  As Martin said,
 eventually you'd need a meta-meta-character to talk about the
 meta-character, and then it's not just a size problem, but an
 infinite-looping problem.

Could you point to a case where such meta-meta-characters could be
used? I see that there is a lot of technical literature, documentation,
et cetera where one would use a visible representation of an invisible
character. I don't really see a reason why someone should need a visible
representation of an already visible glyph.

 3. There aren't that many of them.
 
 We regularly dismiss arguments of the form "But there's lots of room for
 these in Unicode" when someone proposes to encode something that
 shouldn't be there.  I don't see this as any different.

Well, what about adding just ONE more escaping character that
would make the following control code point be displayed?

P.T.

-- 
Petr Tomasek http://www.etf.cuni.cz/~tomasek
Jabber: but...@jabbim.cz


EA 355:001  DU DU DU DU
EA 355:002  TU TU TU TU
EA 355:003  NU NU NU NU NU NU NU
EA 355:004  NA NA NA NA NA






Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Julian Bradfield
On 2011-07-15, Leo Broukhis l...@mailcom.com wrote:
 On Fri, Jul 15, 2011 at 12:04 PM, John W Kennedy jwke...@attglobal.net 
 wrote:
 Those of us old enough to recall IBM's old 6-bit BCDIC code (a retronym -- 
 it was known as BCD in its own day) will remember the overstricken b/ 
 character used to represent the Substitute Blank character, the overstricken 
 =| character for Record Mark, and others. (Annoyingly enough, these and some 
 other BCDIC graphics are not covered by Unicode, which must be a problem for 
 historians.)

 There's U+2422 BLANK SYMBOL ␢ and U+241E SYMBOL FOR RECORD SEPARATOR ␞
 Are they not enough?

And of course there are other ways: if the Record Mark John is
referring to is the same as the Group Mark in the table I find, it's
actually ≡⃒ , not =⃒ (these two using U+20D2 COMBINING LONG VERTICAL
LINE OVERLAY); the latter could also be represented as ǂ, the palatal
click.
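
For readers who want to check what those sequences consist of, here is a minimal Python sketch using the standard unicodedata module. The helper name describe and the variable names are made up for illustration; the code points are the ones named above.

```python
import unicodedata

OVERLAY = "\u20D2"  # COMBINING LONG VERTICAL LINE OVERLAY


def describe(text: str) -> list[str]:
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


group_mark = "\u2261" + OVERLAY  # triple bar with a vertical overlay
record_mark = "=" + OVERLAY      # equals sign with a vertical overlay

for label, seq in (("group mark", group_mark), ("record mark", record_mark)):
    print(label, seq, describe(seq))

# The overlay is a nonspacing combining mark, so each sequence is a base
# character followed by one mark, not a single encoded character.
print(unicodedata.category(OVERLAY))

# The single-character alternative mentioned above; its formal Unicode
# name differs from the phonetic label used in the message.
print(describe("\u01C2"))
```

Whether a given font actually draws the overlay through the base character is a rendering question that this sketch cannot answer.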

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Ken Whistler


  
  
On 7/15/2011 11:36 AM, Michael Everson wrote:

  Look at Figures 8-1 through 8-4 in the Unicode Standard 5.0.

We see graphic characters shown, one representing space and two representing joiners. This is plain text. 

Bzzzt. Thanks for playing! But the correct answer is that it is not
plain text. And what you see are not graphic characters, but glyphs
arranged in a formatted figure.


  This is something one might wish to put on a web page or in an e-mail.


As well one might:



   One of the three characters is encoded.


Michael is referring to the little bridge symbol there, which is used
to represent the presence of a space, and which is encoded as U+2423
OPEN BOX. Note that that is different from U+2420 SYMBOL FOR SPACE,
which is the kind of generic visible symbol for invisible control codes
that are in question here.
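
The distinction between the two space symbols, and the way the already encoded control pictures line up with the control codes they depict, can be sketched in Python. The function names and the space parameter are hypothetical; the mapping relies only on the layout of the Control Pictures block (U+2400 through U+241F for the C0 controls, U+2420 for SPACE, U+2421 for DELETE).

```python
import unicodedata


def control_picture(ch: str, space: str = "\u2420") -> str:
    """Return a visible stand-in for a C0 control, SPACE, or DELETE."""
    cp = ord(ch)
    if cp <= 0x1F:
        return chr(0x2400 + cp)  # pictures sit in the same order as the controls
    if cp == 0x20:
        return space             # U+2420 by default; pass "\u2423" for OPEN BOX
    if cp == 0x7F:
        return "\u2421"          # SYMBOL FOR DELETE
    return ch


def make_visible(text: str, space: str = "\u2420") -> str:
    """Replace every control code and space in text with its picture."""
    return "".join(control_picture(ch, space) for ch in text)


print(make_visible("a b\n"))                  # uses SYMBOL FOR SPACE
print(make_visible("a b\n", space="\u2423"))  # uses OPEN BOX instead
print(unicodedata.name("\u2420"), "/", unicodedata.name("\u2423"))
```

Nothing comparable exists for ZWJ, ZWNJ, or the bidi controls, which is the gap the thread is arguing about.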

As for the others, those are chart glyphs for the ZWNJ and the ZWJ.
There is no need to encode *characters* for chart glyphs.


   

Talking about the standard is *important*. Since the use of graphic characters in plain text is often cited as a criterion for encoding, and since some non-graphic characters in the standard have a SYMBOL FOR graphic representation, I do not, at all, think that it is unwise or capricious to suggest that other non-graphic characters in the standard also have a SYMBOL FOR graphic character which can be used to represent them.


I don't think anybody is claiming capriciousness here, but having such
symbols encoded as characters is definitely *unnecessary* for the
standard. As Asmus has already pointed out, we have been successfully
talking about such characters in the standard for 20 years now. There
are half a dozen ways to do so, some using plain text, and others using
rich text and images.

  
In fact, I think it would be advantageous to users of the standard and to promulgators of the standard for such symbols to be encoded.


And I rather think not. Asmus' analysis was spot on.

--Ken



  



Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 11:05 AM, Doug Ewell wrote:

What I see is a certain unreasonability reflecting a certain conservatism. Text 
about the Standard is important, and should be representable in an 
interchangeable way. Here { } is a Right to left override character. 
I want to talk about it in a way that is visible. Oops. I can't do it 
interchangeably.

[RTL] or {RTL} or Right-to-Left Override or U+202E might all
work.



The conventional abbreviations are:

RLO (Right-to-left override)
RLE (Right-to-left embedding)
RLM (Right-to-left mark)


--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell










Re: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Asmus Freytag

On 7/15/2011 11:36 AM, Michael Everson wrote:

However, I agree with Asmus that in the context of the Wingdings-type symbols 
these characters should not be considered. They should be considered as a whole 
on their own.


Thank you Michael.

To reiterate and restate (so it can be read out of context):

   If widespread use of particular glyphic symbols for certain
   invisible characters (as opposed to abbreviations and names) can be
   documented, then those symbols, and those symbols only, are
   eligible to be added to the standard. As in the example for SPACE,
   if there are different such symbols denoting the same invisible
   character, any of them that is widespread could be added. Care
   should be taken not to unify symbols of different design merely
   based on the fact that they represent the same invisible character.


I simply ask that when and if these symbol characters are considered, 
the normal procedures for adding characters are to be followed. This 
includes adducing evidence of their use in documentation (other than the 
Unicode Standard itself) and similar publications. In particular, such 
documentation would need to be brought for each individual character 
(except perhaps for paired characters) as it is quite likely that some 
invisible characters are not documented extensively (for example the 
deprecated ones).


Finally, it would be valuable if research into the use of such glyphic 
symbols was thorough enough to encompass a more or less complete range 
of glyphs used for each invisible character, not simply the Unicode 
chart glyph.


A./



RE: Quick survey of Apple symbol fonts (in context of the Wingding/Webding proposal)

2011-07-15 Thread Erkki I Kolehmainen
FYI: In BCD the Record Mark (A82) and the Group Mark (BA8421) were separate 
control characters. 
As shown, there should be no problem in representing their symbols in Unicode 
plain text.

Erkki

-Original message-
From: unicode-bou...@unicode.org [mailto:unicode-bou...@unicode.org] 
On behalf of Julian Bradfield
Sent: 15 July 2011 23:32
To: unicode@unicode.org
Subject: Re: Quick survey of Apple symbol fonts (in context of the 
Wingding/Webding proposal)

On 2011-07-15, Leo Broukhis l...@mailcom.com wrote:
 On Fri, Jul 15, 2011 at 12:04 PM, John W Kennedy jwke...@attglobal.net 
 wrote:
 Those of us old enough to recall IBM's old 6-bit BCDIC code (a retronym -- 
 it was known as BCD in its own day) will remember the overstricken b/ 
 character used to represent the Substitute Blank character, the overstricken 
 =| character for Record Mark, and others. (Annoyingly enough, these and some 
 other BCDIC graphics are not covered by Unicode, which must be a problem for 
 historians.)

 There's U+2422 BLANK SYMBOL ␢ and U+241E SYMBOL FOR RECORD SEPARATOR ␞
 Are they not enough?

And of course there are other ways: if the Record Mark John is
referring to is the same as the Group Mark in the table I find, it's
actually ≡⃒ , not =⃒ (these two using U+20D2 COMBINING LONG VERTICAL
LINE OVERLAY); the latter could also be represented as ǂ, the palatal
click.
