Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread Philippe Verdy
2015-05-30 10:47 GMT+02:00 William_J_G Overington wjgo_10...@btinternet.com:

 Responding to Doug Ewell:

  I think this cuts to the heart of what people have been trying to say
 all along.

  Historically, Unicode was not meant to be the means by which brand new
 ideas are run up the proverbial flagpole to see if they will gain traction.

 History is interesting and can be a good guide, yet many things that are
 an accepted part of Unicode today started as new ideas that gained traction
 and became implemented. So history should not be allowed to be a reason to
 restrict progress.

 For example, there was the extension from 1 plane to 17 planes.


Actually this was a restriction of the UCS to *only* 17 planes. Before that
the UCS contained 31-bit code points, i.e. 32768 planes!

If you're speaking about the old Unicode 1.0, it was not yet the UCS and it
was incompatible with the UCS in many important parts. The initial target of
Unicode was only to have an industry standard immediately usable between a
few software providers (Unicode 1.0 was not then an international standard,
forget it!).
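
The plane counts mentioned above can be checked with a little arithmetic. This is only a sketch: the constant and function names are illustrative, and it assumes 65,536 code points per plane and today's upper limit of U+10FFFF.

```python
# Each plane holds 2**16 = 65,536 code points.
CODE_POINTS_PER_PLANE = 1 << 16

# A 31-bit code space, as in the original UCS design.
planes_31_bit = (1 << 31) // CODE_POINTS_PER_PLANE

# The current code space runs from U+0000 to U+10FFFF.
planes_today = (0x10FFFF + 1) // CODE_POINTS_PER_PLANE


def plane_of(code_point: int) -> int:
    """Return the number of the plane that contains a code point."""
    return code_point >> 16


print(planes_31_bit)      # 32768
print(planes_today)       # 17
print(plane_of(0x1F407))  # 1
```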


RE: Bunny hill symbol, used in America for signaling ski pistes for novices

2015-05-30 Thread Shawn Steele
I’m really curious to see one of these signs.  Is it a regional thing?

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Leonardo Boiko
Sent: Thursday, May 28, 2015 1:02 PM
To: Philippe Verdy
Cc: unicode Unicode Discussion
Subject: Re: Bunny hill symbol, used in America for signaling ski pistes for 
novices

You could use U+1F407 RABBIT combined with U+20E4 COMBINING ENCLOSING UPWARD 
POINTING TRIANGLE, and pretend the triangle is a hill: 🐇⃤
If only we had a combining rabbit, we could add rabbits to U+1F3D4 SNOW CAPPED 
MOUNTAIN.  Or anything else.
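
For anyone who wants to try the suggestion, the sequence can be built and inspected as below. This is a minimal sketch using Python's standard `unicodedata` module; the variable names are illustrative, and whether a font actually draws the triangle around the rabbit is a separate question.

```python
import unicodedata

RABBIT = "\U0001F407"
ENCLOSING_TRIANGLE = "\u20E4"
SNOW_CAPPED_MOUNTAIN = "\U0001F3D4"

# Base character followed by the combining enclosing mark.
bunny_hill = RABBIT + ENCLOSING_TRIANGLE

# List code point, general category and name for each character.
for ch in bunny_hill + SNOW_CAPPED_MOUNTAIN:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```

The enclosing triangle has general category Me (enclosing mark), which is why it attaches to the preceding base character. There is no corresponding combining rabbit.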

2015-05-28 16:46 GMT-03:00 Philippe Verdy verd...@wanadoo.fr:
Is there a symbol that can represent the Bunny hill symbol used in North 
America and some other American territories with mountains, to designate the 
ski pistes open to novice skiers (those pistes are signaled with green signs in 
Europe)?

I'm looking for the symbol itself, not the color, or the form of the sign.

For example blue pistes in Europe are designated by a green circle in America, 
but we have a symbol for the circle; red pistes in Europe are signaled by a 
blue square in America, but we have a symbol for the square; black pistes in 
Europe are signaled by a black diamond in America, but we also have such 
black diamond in Unicode.

But I can't find an equivalent to the American Bunny hill signal, equivalent 
to green pistes in Europe (this is a problem for webpages related to skiing: do 
we have to embed an image?).




Re: Re: Bunny hill symbol, used in America for signaling ski pistes for novices

2015-05-30 Thread Philippe Verdy
But observations show that the vertical stacking is not universal.
Horizontal stacking is also used in direction signs. My opinion is that
they are just two separate diamonds and not a single symbol.

Quite equivalent to the situation with the classification of hotels with
stars (generally aligned horizontally but not always, we can see them also
arranged vertically, or on two rows 1+1, 1+2 or 2+1 or 2+3 or 3+2...)

I don't think the exact layout of individual symbols (diamond, star, ...)
is semantically significant, only their number is important (and the fact
they are grouped together on the same medium with the same
foreground/background colors or texturing and the same sizes).

2015-05-29 9:32 GMT+02:00 Jörg Knappen jknap...@web.de:

 From the description of the symbol it looks like a geometric shape. I
 think it is worth encoding as a geometric shape (TWO BLACK DIAMONDS
 VERTICALLY STACKED or something like this) with a note * bunny hill. It may
 have (or find in future) other uses.

 --Jörg Knappen

 *Sent:* Thursday, 28 May 2015 at 23:20
 *From:* Shervin Afshar shervinafs...@gmail.com
 *To:* Shawn Steele shawn.ste...@microsoft.com
 *Cc:* verd...@wanadoo.fr verd...@wanadoo.fr, unicode Unicode
 Discussion unicode@unicode.org, Jim Melton jim.mel...@oracle.com
 *Subject:* Re: Bunny hill symbol, used in America for signaling ski
 pistes for novices
  Since the double-diamond has map and map legend usage, it might be a
 good idea to have it encoded separately. I know that I'm stating the
 obvious here, but the important point is doing the research and showing
 that it has widespread usage.

  ↪ Shervin

 On Thu, May 28, 2015 at 2:15 PM, Shawn Steele shawn.ste...@microsoft.com
 wrote:

  I’m used to them being next to each other.  So the entire discussion
 seems to be about how to encode a concept vs how to get the shape you want
 with existing code points.   If you just want the perfect shape, then maybe
 an svg is a better choice.  If we’re talking about describing ski-run
 difficulty levels in plain-text, then the hodge-podge of glyphs being
 offered in this thread seems kinda hacky to me.



 -Shawn



 *From:* ver...@gmail.com [mailto:ver...@gmail.com] *On Behalf Of *Philippe
 Verdy
 *Sent:* Thursday, May 28, 2015 2:12 PM
 *To:* Jim Melton
 *Cc:* Shawn Steele; unicode Unicode Discussion
 *Subject:* Re: Bunny hill symbol, used in America for signaling ski
 pistes for novices



 Some documentation also suggests that the two diamonds are not stacked
 one above the other, but horizontally. It's a good point for using only one
 symbol, encoding it twice in plain-text if needed.



 2015-05-28 22:15 GMT+02:00 Jim Melton jim.mel...@oracle.com:

  I no longer ski, but I did so for many years, mostly (but not
 exclusively) in the western United States.  I never encountered, at any USA
 ski hill/mountain/resort, a special symbol for bunny hills, which are
 typically represented by the green circle meaning beginner.  That's
 anecdotal evidence at best, but my observations cover numerous skiing
 sites.  I have encountered such a symbol in Europe and in New Zealand, but
 not in the USA.  (I have not had the pleasure of skiing in Canada and am
 thus unable to speak about ski areas in that country.)

 The double black diamond would appear to be a unique symbol worthy of
 encoding, simply because the only valid typographical representation (in
 the USA) is two single black diamonds stacked one above the other and
 touching at the points.

 Hope this helps,
Jim


 On 5/28/2015 2:04 PM, Shawn Steele wrote:

  So is double black diamond a separate symbol?  Or just two of the black
 diamond?



 And Blue-Black?



 I’m drawing a blank on a specific bunny sign, in my experience those are
 usually just green.



 Aren’t there a lot of cartography symbols for various systems that aren’t
 present in Unicode?



 *From:* Unicode [mailto:unicode-boun...@unicode.org] *On Behalf Of *Philippe Verdy
 *Sent:* Thursday, May 28, 2015 12:47 PM
 *To:* unicode Unicode Discussion
 *Subject:* Bunny hill symbol, used in America for signaling ski pistes
 for novices



 Is there a symbol that can represent the Bunny hill symbol used in
 North America and some other American territories with mountains, to
 designate the ski pistes open to novice skiers (those pistes are signaled
 with green signs in Europe)?



 I'm looking for the symbol itself, not the color, or the form of the sign.



 For example blue pistes in Europe are designated by a green circle in
 America, but we have a symbol for the circle; red pistes in Europe are
 signaled by a blue square in America, but we have a symbol for the square;
 black pistes in Europe are signaled by a black diamond in America, but we
 also have such black diamond in Unicode.



 But I can't find an equivalent to the American Bunny hill signal,
 equivalent to green pistes in Europe (this is a problem for webpages
 related to skiing: do we have 

RE: Re: Bunny hill symbol, used in America for signaling ski pistes for novices

2015-05-30 Thread Shawn Steele
I guess it depends on what you’re representing.  If it is the concept of 
“double black”, then maybe a separate symbol and the “font” or other selectors 
determine if it’s vertically or horizontally rendered.

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Philippe Verdy
Sent: Saturday, May 30, 2015 2:56 PM
To: Jörg Knappen
Cc: Shervin Afshar; unicode Unicode Discussion
Subject: Re: Re: Bunny hill symbol, used in America for signaling ski pistes 
for novices

But observations show that the vertical stacking is not universal. Horizontal 
stacking is also used in direction signs. My opinion is that they are just two 
separate diamonds and not a single symbol.

Quite equivalent to the situation with the classification of hotels with stars 
(generally aligned horizontally but not always, we can see them also arranged 
vertically, or on two rows 1+1, 1+2 or 2+1 or 2+3 or 3+2...)

I don't think the exact layout of individual symbols (diamond, star, ...) is 
semantically significant, only their number is important  (and the fact they 
are grouped together on the same medium with the same foreground/background 
colors or texturing and the same sizes).

2015-05-29 9:32 GMT+02:00 Jörg Knappen jknap...@web.de:
From the description of the symbol it looks like a geometric shape. I think it 
is worth encoding as a geometric shape (TWO BLACK DIAMONDS VERTICALLY 
STACKED or something like this) with a note * bunny hill. It may have (or find 
in future) other uses.

--Jörg Knappen

Sent: Thursday, 28 May 2015 at 23:20
From: Shervin Afshar shervinafs...@gmail.com
To: Shawn Steele shawn.ste...@microsoft.com
Cc: verd...@wanadoo.fr verd...@wanadoo.fr, unicode Unicode Discussion 
unicode@unicode.org, Jim Melton jim.mel...@oracle.com
Subject: Re: Bunny hill symbol, used in America for signaling ski pistes for 
novices
Since the double-diamond has map and map legend usage, it might be a good idea 
to have it encoded separately. I know that I'm stating the obvious here, but 
the important point is doing the research and showing that it has widespread 
usage.

↪ Shervin

On Thu, May 28, 2015 at 2:15 PM, Shawn Steele 
shawn.ste...@microsoft.com wrote:
I’m used to them being next to each other.  So the entire discussion seems to 
be about how to encode a concept vs how to get the shape you want with existing 
code points.   If you just want the perfect shape, then maybe an svg is a 
better choice.  If we’re talking about describing ski-run difficulty levels in 
plain-text, then the hodge-podge of glyphs being offered in this thread seems 
kinda hacky to me.

-Shawn

From: ver...@gmail.com [mailto:ver...@gmail.com] On Behalf Of Philippe Verdy
Sent: Thursday, May 28, 2015 2:12 PM
To: Jim Melton
Cc: Shawn Steele; unicode Unicode Discussion
Subject: Re: Bunny hill symbol, used in America for signaling ski pistes for 
novices

Some documentation also suggests that the two diamonds are not stacked one 
above the other, but horizontally. It's a good point for using only one symbol, 
encoding it twice in plain-text if needed.

2015-05-28 22:15 GMT+02:00 Jim Melton jim.mel...@oracle.com:
I no longer ski, but I did so for many years, mostly (but not exclusively) in 
the western United States.  I never encountered, at any USA ski 
hill/mountain/resort, a special symbol for bunny hills, which are typically 
represented by the green circle meaning beginner.  That's anecdotal evidence 
at best, but my observations cover numerous skiing sites.  I have encountered 
such a symbol in Europe and in New Zealand, but not in the USA.  (I have not 
had the pleasure of skiing in Canada and am thus unable to speak about ski 
areas in that country.)

The double black diamond would appear to be a unique symbol worthy of encoding, 
simply because the only valid typographical representation (in the USA) is two 
single black diamonds stacked one above the other and touching at the points.

Hope this helps,
   Jim

On 5/28/2015 2:04 PM, Shawn Steele wrote:
So is double black diamond a separate symbol?  Or just two of the black diamond?

And Blue-Black?

I’m drawing a blank on a specific bunny sign, in my experience those are 
usually just green.

Aren’t there a lot of cartography symbols for various systems that aren’t 
present in Unicode?

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Philippe Verdy
Sent: Thursday, May 28, 2015 12:47 PM
To: unicode Unicode Discussion
Subject: Bunny hill symbol, used in America for signaling ski pistes for 
novices

Is there a symbol that can represent the Bunny hill symbol used in North 
America and some other American territories with mountains, to 

Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread Doug Ewell
Note: Everything below is my personal opinion and does not represent any
official Unicode Consortium or UTC position.

William_J_G Overington wjgo underscore 10009 at btinternet dot com
wrote:

 Historically, Unicode was not meant to be the means by which brand
 new ideas are run up the proverbial flagpole to see if they will gain
 traction.

 History is interesting and can be a good guide, yet many things that
 are an accepted part of Unicode today started as new ideas that gained
 traction and became implemented. So history should not be allowed to
 be a reason to restrict progress.

I used "historically" to distinguish between the pre- and post-Emoji
Revolution eras. There have clearly been changes recently, but there is
still at least a minimal expectation that proposed characters will
fulfill a demonstrated need.

I'm not seeing any truly novel, untested ideas in the list below that
Unicode implemented purely on speculation.

 For example, there was the extension from 1 plane to 17 planes.

That was an architectural extension, brought about by the realization
that 64K code points wasn't enough for even the original scope. There's
no comparison.

 There was the introduction of emoji support.

Emoji proponents would argue that emoji support began in 1.0 with the
inclusion of various dingbats. But even emoji are arguably characters
in some sense. They aren't a mini-language used to define images pixel
by pixel.

 There was the introduction of the policy of colour sometimes being a
 recorded property rather than having just the original monochrome
 recording policy.

There isn't any such policy. There is a variation selector to suggest
that the rendering engine show certain characters in emoji style
instead of text style, and there are characters with colors in their
names, but there is no policy that specific colors are recorded as
part of the encoding. YELLOW HEART could conformantly appear in any
color.
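
A short sketch may make the distinction concrete. It uses Python's standard `unicodedata` module; the choice of U+2764 HEAVY BLACK HEART as the character that takes a variation selector is my own illustration, not something from the discussion above.

```python
import unicodedata

YELLOW_HEART = "\U0001F49B"
HEAVY_BLACK_HEART = "\u2764"
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

# The colour word appears only in the character name; the code point
# itself carries no colour value.
print(unicodedata.name(YELLOW_HEART))  # YELLOW HEART

# The same base character with each presentation request appended.
text_style = HEAVY_BLACK_HEART + VS15
emoji_style = HEAVY_BLACK_HEART + VS16

for label, seq in (("text", text_style), ("emoji", emoji_style)):
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in seq))
```

Both sequences share the same base code point. The selector is a hint to the renderer about presentation style and says nothing about which colours to use.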

 There has been the change of encoding policy that facilitated the
 introduction of the Indian Rupee character into Unicode and ISO/IEC
 10646 far more quickly than had been thought possible, so that the
 encoding was ready for use when needed.

That's not a change to what types of things get encoded. It's a
procedural change, one which I would agree has been applied with
increasing creativity.

 There has been the recent encoding policy change regarding encoding of
 pure electronic use items taking place without (extensive prior use
 using a Private Use Area encoding), such as the encoding of the
 UNICORN FACE.

This is probably your best analogy. People like Asmus have addressed it,
saying it's not reasonable to expect users to adopt PUA solutions and
wait for them to catch on.

 There is the recent change to the deprecation status of most of the
 tag characters and the acceptance of the base character followed by
 tag characters technique so as to allow the specifying of a larger
 collection of particular flags.

There must have been a great wailing and gnashing of teeth over that
decision. So many statements were made over the years about the basic
evilness of tag characters.

But the concept of representing flags was already agreed upon as a
compatibility measure, and the Regional Indicator Symbols solution was
a compromise that allowed expansion beyond the 10 flags that Japanese
telcos chose to include. RIS were an architectural decision. The tag
solution (to be fully outlined in a future PRI) was another
architectural decision. Neither (I believe) is analogous to a scope
decision to start encoding different types of non-character things as if
they were characters, and as I have said before, assigning a glyph to a
thing that isn't a character doesn't make it one.
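
To illustrate why both mechanisms are architectural rather than new kinds of content, here is a hedged sketch. The Regional Indicator mapping follows the encoded block starting at U+1F1E6. The tag-sequence half is an assumption on my part: the message above says the tag solution had yet to be outlined, so the shape used here (base character, tag characters, CANCEL TAG) is modelled on the emoji tag sequences later described in UTS #51. The function names are illustrative.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
TAG_OFFSET = 0xE0000            # tag characters mirror ASCII at U+E0020..U+E007E
CANCEL_TAG = "\U000E007F"
WAVING_BLACK_FLAG = "\U0001F3F4"


def flag_from_region(code: str) -> str:
    """Map a two-letter region code such as 'JP' to two Regional Indicator Symbols."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected two ASCII letters")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


def tag_sequence(base: str, tag_text: str) -> str:
    """Append tag characters spelling tag_text to a base, then CANCEL TAG (assumed form)."""
    if not all(0x20 <= ord(c) <= 0x7E for c in tag_text):
        raise ValueError("tag text must be printable ASCII")
    return base + "".join(chr(TAG_OFFSET + ord(c)) for c in tag_text) + CANCEL_TAG


japan = flag_from_region("JP")
scotland = tag_sequence(WAVING_BLACK_FLAG, "gbsct")

print(" ".join(f"U+{ord(ch):04X}" for ch in japan))
print(" ".join(f"U+{ord(ch):04X}" for ch in scotland))
```

In both cases the plain text carries only an identifier for a flag, and the image comes from the font. Neither embeds a picture definition in the character stream.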

--
Doug Ewell | http://ewellic.org | Thornton, CO 




Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread David Starner
I would say that a system would conform with Unicode in having yellow heart
red (in a non-monochrome font) as well as if it made it a cross. Either way
it's violating character identity. I'd say that being monochromatic is now
like being monospaced; it's suboptimal for a Unicode implementation, but
hardly something Unicode can condemn as nonconformant.

On 4:25pm, Sat, May 30, 2015 Doug Ewell d...@ewellic.org wrote:

 Note: Everything below is my personal opinion and does not represent any
 official Unicode Consortium or UTC position.

 William_J_G Overington wjgo underscore 10009 at btinternet dot com
 wrote:

  Historically, Unicode was not meant to be the means by which brand
  new ideas are run up the proverbial flagpole to see if they will gain
  traction.
 
  History is interesting and can be a good guide, yet many things that
  are an accepted part of Unicode today started as new ideas that gained
  traction and became implemented. So history should not be allowed to
  be a reason to restrict progress.

 I used "historically" to distinguish between the pre- and post-Emoji
 Revolution eras. There have clearly been changes recently, but there is
 still at least a minimal expectation that proposed characters will
 fulfill a demonstrated need.

 I'm not seeing any truly novel, untested ideas in the list below that
 Unicode implemented purely on speculation.

  For example, there was the extension from 1 plane to 17 planes.

 That was an architectural extension, brought about by the realization
 that 64K code points wasn't enough for even the original scope. There's
 no comparison.

  There was the introduction of emoji support.

 Emoji proponents would argue that emoji support began in 1.0 with the
 inclusion of various dingbats. But even emoji are arguably characters
 in some sense. They aren't a mini-language used to define images pixel
 by pixel.

  There was the introduction of the policy of colour sometimes being a
  recorded property rather than having just the original monochrome
  recording policy.

 There isn't any such policy. There is a variation selector to suggest
 that the rendering engine show certain characters in emoji style
 instead of text style, and there are characters with colors in their
 names, but there is no policy that specific colors are recorded as
 part of the encoding. YELLOW HEART could conformantly appear in any
 color.

  There has been the change of encoding policy that facilitated the
  introduction of the Indian Rupee character into Unicode and ISO/IEC
  10646 far more quickly than had been thought possible, so that the
  encoding was ready for use when needed.

 That's not a change to what types of things get encoded. It's a
 procedural change, one which I would agree has been applied with
 increasing creativity.

  There has been the recent encoding policy change regarding encoding of
  pure electronic use items taking place without (extensive prior use
  using a Private Use Area encoding), such as the encoding of the
  UNICORN FACE.

 This is probably your best analogy. People like Asmus have addressed it,
 saying it's not reasonable to expect users to adopt PUA solutions and
 wait for them to catch on.

  There is the recent change to the deprecation status of most of the
  tag characters and the acceptance of the base character followed by
  tag characters technique so as to allow the specifying of a larger
  collection of particular flags.

 There must have been a great wailing and gnashing of teeth over that
 decision. So many statements were made over the years about the basic
 evilness of tag characters.

 But the concept of representing flags was already agreed upon as a
 compatibility measure, and the Regional Indicator Symbols solution was
 a compromise that allowed expansion beyond the 10 flags that Japanese
 telcos chose to include. RIS were an architectural decision. The tag
 solution (to be fully outlined in a future PRI) was another
 architectural decision. Neither (I believe) is analogous to a scope
 decision to start encoding different types of non-character things as if
 they were characters, and as I have said before, assigning a glyph to a
 thing that isn't a character doesn't make it one.

 --
 Doug Ewell | http://ewellic.org | Thornton, CO 





Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread William_J_G Overington
Responding to Leo Broukhis:

 A more common occurrence is the need to include a non-standard character in a 
 text message, be it a ski piste symbol or an obscure CJK ideogram. Have you 
 thought of embedding TrueType in Unicode? 

Not congruently so, yet, in effect, yes, as I have considered including 
individual OpenType-compatible glyphs in a base character followed by tag 
characters format. OpenType is a development from TrueType that can achieve 
more than can TrueType on its own.

There is a little about this in the last two paragraphs of the following post.

http://www.unicode.org/mail-arch/unicode-ml/y2015-m05/0218.html

There would need to be a few additions to make it work effectively: for 
example, a value for each of advance width, ascent maximum, descent maximum and 
font units per em.
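
As a rough sketch of what those four values would be used for, the snippet below scales font units to pixels with the usual formula pixels = font units x pixels per em / units per em. The class name, field names and sample numbers are all hypothetical; nothing here is a proposed encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlyphMetrics:
    """Hypothetical container for the four values listed above, in font units."""
    advance_width: int
    ascent_max: int
    descent_max: int
    units_per_em: int

    def to_pixels(self, font_units: int, pixels_per_em: float) -> float:
        """Scale a measurement in font units to pixels at the given size."""
        return font_units * pixels_per_em / self.units_per_em


# Sample numbers chosen only for illustration.
metrics = GlyphMetrics(advance_width=1024, ascent_max=1638,
                       descent_max=410, units_per_em=2048)

print(metrics.to_pixels(metrics.advance_width, 16))  # 8.0
```

Without the units-per-em value, a receiver could not tell how large the outline coordinates are relative to the surrounding text.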

William Overington

30 May 2015








Re: Some questions about Unicode's CJK Unified Ideograph

2015-05-30 Thread Andrew West
On 30 May 2015 at 02:50, Ken Whistler kenwhist...@att.net wrote:

 1. I have seen a chinese character ⿰言亜 from a Vietnamese dictionary NHAT
 DUNG THUONG DAM DICTIONARY

 Extension F is harder to track down, because it has not yet been
 approved by the UTC, and comes in two pieces, with different
 progression so far in the ISO committee. Perhaps somebody on this list
 who has better access to the relevant documents can let you
 know whether ⿰言亜 can be found in those sets.

It's not in my lists of F1 and F2 characters.

 2. Are combining characters like U+20DD intended to work with all different
 types of characters, or is this some problem related to implementation? When
 I write ゆ⃝ (Japanese Hiragana Letter Yu + Combining Enclosing Circle), the
 two appear separate in most fonts I use, but if I change the Hiragana Yu
 into a conventional = sign or some Latin character, most fonts are at least
 somehow able to put them together. Or is there any better/alternative
 representation in Unicode that can show Japanese hiragana yu in a circle?

 Combining enclosing marks in principle could work with most characters,
 but in practice most arbitrary combinations do not work very well,
 because they would require very complicated font support.

It's not that complicated, but I think most fonts don't support arbitrary
combinations with combining enclosing circle because there is little or no
demand for them.  BabelStone Han displays Japanese Hiragana Letter Yu +
Combining Enclosing Circle quite well, but on the other hand it does not
work so well with CJK ideographs, and fails with Latin letters and
punctuation.
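
A quick way to see that the combination is valid at the encoding level, so that the rest is up to the font, is to build the sequence and normalize it. This is a minimal sketch with Python's standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

HIRAGANA_YU = "\u3086"
ENCLOSING_CIRCLE = "\u20DD"

# Base letter followed by the combining enclosing mark.
circled_yu = HIRAGANA_YU + ENCLOSING_CIRCLE

# NFC has no precomposed form to substitute, so the sequence stays as two
# code points and the enclosing has to be done by the font.
nfc = unicodedata.normalize("NFC", circled_yu)
print(len(nfc))  # 2

for ch in nfc:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```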



 4. In CJK Symbols and Punctuation, the proper name mark and book name mark
 are not included. While there are characters like U+2584, U+FE33, U+FE4F, and
 U+FE34 in Unicode that are more or less a representation of the two symbols,
 they do not appear below or on the left of typed characters when text flow
 is horizontal/vertical; instead, they occupy their own space, which makes
 them of little use in daily life. While the proper name mark and book name
 mark can be represented by text editing software and CSS, those
 representations are not ideal, and they do match the Criteria for Encoding
 Symbols. Is it possible to make a new Unicode symbol, or change some
 current symbol into one that could appear in a suitable place relative to
 other characters when typed? A property of the symbol is that when used in
 a case like 美國紐約, where 美國 and 紐約 are two different proper names (place
 names), an underline should go below them without any separation between the
 characters 美 and 國 or 紐 and 約 (when text is written horizontally), but at
 the same time the underline should not be linked between 國 and 紐, as 國 is
 the end of the first place name while 紐 is the start of the other.


 What you are talking about is, indeed, best handled by text styling
 attributes, rather than by individual character encoding.

I agree.  However, if you really do want to represent underlining of proper
names at the character encoding level, then you would have to do something
like put U+0332 Combining Low Line after each character to be underlined,
and select a font that supports Combining Low Line with CJK ideographs.
BabelStone Han supports this low-level method of underlining CJK
ideographs, but if you want a space in the underlining between 美國 and 紐約
you would have to insert a very thin space (U+200A Hair Space in this
example) between the characters.
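
The low-level method described above can be sketched as follows. The helper names are illustrative, and the result only displays well with a font that supports Combining Low Line on CJK ideographs.

```python
COMBINING_LOW_LINE = "\u0332"
HAIR_SPACE = "\u200A"


def underline(name: str) -> str:
    """Put U+0332 after every character of one proper name."""
    return "".join(ch + COMBINING_LOW_LINE for ch in name)


def underline_names(names: list[str]) -> str:
    """Underline each name and separate adjacent names with a hair space."""
    return HAIR_SPACE.join(underline(name) for name in names)


marked = underline_names(["美國", "紐約"])

# Show the resulting code point sequence.
print(" ".join(f"U+{ord(ch):04X}" for ch in marked))
```

The hair space is what breaks the line between the two names. Without it the four underlines would run together.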



Andrew


Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread William_J_G Overington
Responding to Doug Ewell:

 I think this cuts to the heart of what people have been trying to say all 
 along.

 Historically, Unicode was not meant to be the means by which brand new ideas 
 are run up the proverbial flagpole to see if they will gain traction.

History is interesting and can be a good guide, yet many things that are an 
accepted part of Unicode today started as new ideas that gained traction and 
became implemented. So history should not be allowed to be a reason to restrict 
progress.

For example, there was the extension from 1 plane to 17 planes.

There was the introduction of emoji support.

There was the introduction of the policy of colour sometimes being a recorded 
property rather than having just the original monochrome recording policy.

There has been the change of encoding policy that facilitated the introduction 
of the Indian Rupee character into Unicode and ISO/IEC 10646 far more quickly 
than had been thought possible, so that the encoding was ready for use when 
needed.

There has been the recent encoding policy change regarding encoding of pure 
electronic use items taking place without (extensive prior use using a Private 
Use Area encoding), such as the encoding of the UNICORN FACE.

There is the recent change to the deprecation status of most of the tag 
characters and the acceptance of the base character followed by tag characters 
technique so as to allow the specifying of a larger collection of particular 
flags.



The two questions that I asked in my response to a post by Mark E. Shoulson are 
relevant here.

Suppose that a plain text file is to include just one non-standard emoji 
graphic. How would that be done otherwise than by the format that I am 
suggesting?

What if there were three such non-standard emoji graphics needed in the plain 
text file, the second graphic being used twice. How would that be done 
otherwise than by the format that I am suggesting?

William Overington

30 May 2015





Re: Tag characters and in-line graphics (from Tag characters)

2015-05-30 Thread John
Hmm, these "once" entities of which you speak, do they require javascript? 
Because I'm not sure what we are looking for here is static documents requiring 
a full programming language.




But let's say for a moment that HTML5 can, or could do the job here. Then to 
make the dream come true that you could just cut and paste text that happened 
to contain a custom character to somewhere else, and nothing untoward would 
happen, would mean that everything in the computing universe should allow full 
blown HTML. So every Java Swing component, every Apple GUI component, every 
.NET component, every Windows component, every browser, every Android and iOS 
component would allow text entry of HTML entities. OK, so let's say everyone 
agrees with this course of action, now the universal text format is HTML.




But in this new world where anywhere that previously you could input text, you 
can now input full blown html, does that actually make sense? Does it make 
sense that you can for example, put full blown HTML inside a H1 tag in html 
itself? That's a lot of recursion going on there. Or in a MS-Excel cell? Or 
interspersed in some otherwise fairly regular text in a Word document?




I suppose someone could define a strict limited subset of HTML to be that 
subset that makes sense in ALL textual situations. That subset would be 
something like just defining things that act like characters, and not like a 
full blown rendering engine. But who would define that subset? Not the HTML 
groups, because their mandate is to define full blown rendering engines. It 
would be more likely to be something like the unicode group.




And also, in this brave new world where HTML5 is the new standard text format, 
what would the binary format of it be? I mean, if I have the string of Unicode 
characters <IMG>, would that be an HTML5 image definition that should be 
rendered as such? Or would it be text that happens to contain a less-than 
sign, I, M, G and a greater-than sign? It would have to be the former I guess, 
and thereby there would no longer be a Unicode character for the mathematical 
less-than sign. Rather there would be a Unicode character for opening an HTML 
tag, and the text code for less than would be &lt;. Never again would a 
computer store < to mean less than. Do we 
want HTML to be so pervasive? Not sure it deserves that.
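
The ambiguity can be shown with a small sketch using Python's standard `html` and `html.parser` modules. The `Collector` class and `parse` helper are illustrative names of my own; the point is only that the same characters are markup in one reading and text in the other.

```python
import html
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Record start tags and text separately while parsing."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def parse(source: str):
    """Return (list of start tags, concatenated text) for an HTML fragment."""
    collector = Collector()
    collector.feed(source)
    collector.close()  # flush any buffered text
    return collector.tags, "".join(collector.text)


raw = "<IMG>"
escaped = html.escape(raw)

print(escaped)         # &lt;IMG&gt;
print(parse(raw))      # (['img'], '')
print(parse(escaped))  # ([], '<IMG>')
```

If every text field were HTML, every literal angle bracket would have to travel in its escaped form.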




And from a programmer's point of view, he wants to be able to iterate over an 
array of characters and treat each one the same way, regardless if it is a 
custom character or not. Without that kind of programmatic abstraction, the 
whole thing can never gain traction. I don't think fully blown HTML embedded in 
your text can fulfill that. A very strictly defined subset, possibly could. 
Sure HTML5 can RENDER stuff adequately, if the only aim of the game is to provide a 
correct rendering. But to be able to actually treat particular images embedded 
as characters, and have some programming library see that abstraction 
consistently, I'm not sure I'm convinced that is possible. Not without nailing 
down exactly what html elements in what particular circumstances constitute a 
character.
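
A small sketch of the abstraction I mean, under stated assumptions: U+E000 stands in for a custom character as a hypothetical private-use assignment, and `bunny.svg` is a made-up file name. It is not a proposal for how custom characters should be encoded.

```python
import unicodedata

CUSTOM = "\uE000"  # hypothetical private-use code point standing in for a custom character

plain = "ski " + CUSTOM + " run"
markup = 'ski <img src="bunny.svg"> run'


def categories(text: str) -> list[str]:
    """Return the general category of every character, treating each one alike."""
    return [unicodedata.category(ch) for ch in text]


# The custom character is one element like any other.
print(len(plain))            # 9
print(categories(plain)[4])  # Co

# The markup version is many ordinary characters; recognising that they
# stand for a single symbol takes an HTML parser, not a loop over characters.
print(len(markup) > len(plain))  # True
```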




I guess in summary, yes we have the technology already to render anything. But 
I don't think the whole standards framework does anything to allow the 
computing universe to actually exchange custom characters as if they were just 
any other text. Someone would actually have to work on a standard to do that, 
not just point to html5.








On Saturday, 30 May 2015 at 5:08 am, Philippe Verdy verd...@wanadoo.fr, wrote:


2015-05-29 4:37 GMT+02:00 John idou...@gmail.com:

“Today the world goes very well with HTML(5) which is now the best markup 
language for document (including for inserting embedded images that don’t 
require any external request)”

If I had a large document that reused a particular character thousands of 
times, would this HTML markup require embedding that character thousands of 
times, or could I define the character once at the beginning of the sequence, 
and then refer back to it in a space efficient way?





HTML(5) allows defining *once* entities for images that can then be reused 
thousands of times without repeating their definition. You can do this as well 
with CSS styles, just define a class for a small element. This element may 
still be an image, but the semantic is carried by the class you assign to it. 
You are not required to provide an external source URL for that image if the 
CSS style provides the content.
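
The CSS route described above can be sketched by generating such a page. This is only an illustration under assumptions: the class name `piste`, the tiny SVG and the use of a `::before` rule with a `data:` URI are my own choices, and it does not attempt to illustrate the "entities" route.

```python
import base64

# A tiny stand-in image; any SVG would do.
SVG = ('<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16">'
       '<circle cx="8" cy="8" r="7" fill="green"/></svg>')

# Embed the image in the stylesheet so no external request is needed.
DATA_URI = ("data:image/svg+xml;base64,"
            + base64.b64encode(SVG.encode("utf-8")).decode("ascii"))


def build_page(uses: int) -> str:
    """Define the image once in a CSS class, then reference the class `uses` times."""
    style = '.piste::before { content: url("' + DATA_URI + '"); }'
    body = '<span class="piste"></span>' * uses
    return ('<!DOCTYPE html><html><head><meta charset="utf-8">'
            '<title>demo</title><style>' + style + '</style></head>'
            '<body>' + body + '</body></html>')


page = build_page(1000)
print(page.count(DATA_URI))         # 1
print(page.count('class="piste"'))  # 1000
```

The image data appears once however many times the class is used, though each use still costs a short span element rather than a single character.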




You may also use PUAs for the same purpose (however I have not seen how CSS 
allows to style individual characters in text elements as these characters are 
not elements, and there's no defined selector for pseudo-elements matching a 
single character). PUAs are perfectly usable in the situation where you have 
embedded a custom font in your document for assigning glyphs to characters (you 
can still do that, but I would avoid TrueType/OpenType for this purpose, but 
would use the SVG