Re: Why so much emoji nonsense?

2018-02-14 Thread James Kass via Unicode
Martin J. Dürst wrote:

> The original Japanese cell phone carrier emoji were defined in the
> unassigned area of Shift_JIS, not Unicode.

Thank you (and another list member) for reminding me that it was
originally hacked SJIS rather than proper PUA Unicode.



Re: Why so much emoji nonsense?

2018-02-14 Thread Martin J. Dürst via Unicode

On 2018/02/15 10:49, James Kass via Unicode wrote:


> Yes, except that Unicode "supported" all manner of things being
> interchanged by setting aside a range of code points for private use.
> Which enabled certain cell phone companies to save some bandwidth by
> assigning various popular in-line graphics to PUA code points.


The original Japanese cell phone carrier emoji were defined in the
unassigned area of Shift_JIS, not Unicode. Shift_JIS doesn't have an
official private area, but companies (IBM, NEC, Microsoft) had already
used the empty area for Kanji. Also, there was initially some
transcoding software that mapped some of the emoji to areas in
Unicode besides the PUA, based on very simplistic conversion.



> The "problem" was that these phone companies failed to get together on
> those PUA code point assignments, so they could not exchange their
> icons in a standard fashion between competing phone systems.  [Image
> of the world's smallest violin playing.]


Emoji were originally a competitive device. As an example, NTT Docomo 
allowed the ticket service PIA to have an emoji for their service, most 
probably in order to entice them to sign up to participate in the 
original I-mode (first case of Web on mobile phones) service. Of course, 
that specific emoji (or was it several) wasn't encoded in Unicode 
because of trademark issues.


Regards,    Martin.


Re: Why so much emoji nonsense?

2018-02-14 Thread James Kass via Unicode
On Wed, Feb 14, 2018 at 5:14 PM, David Starner  wrote:

> They were units of things being interchanged in formats of MIME types
> starting with text/ . From the beginning, Unicode has supported all the
> cruft that's being interchanged in formats of MIME types starting with
> text/.

Yes, except that Unicode "supported" all manner of things being
interchanged by setting aside a range of code points for private use.
Which enabled certain cell phone companies to save some bandwidth by
assigning various popular in-line graphics to PUA code points.  The
"problem" was that these phone companies failed to get together on
those PUA code point assignments, so they could not exchange their
icons in a standard fashion between competing phone systems.  [Image
of the world's smallest violin playing.]

I've personally exchanged text data with others using the PUA for both
Klingon and Ewellic.  [winks]
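
For readers who have not looked at the private use ranges themselves, here
is a minimal Python sketch of what "a range of code points for private use"
covers. The three ranges are the ones the Unicode Standard reserves; the
function names are made up for illustration only and do not come from any
library.

```python
# Private-use ranges reserved by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch):
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def private_use_chars(text):
    """List the private-use code points found in text as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]
```

Two parties exchanging such text still have to agree privately on what each
of those code points means, which is the agreement the phone companies are
said to have lacked.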


Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 2:35 PM James Kass via Unicode 
wrote:

> David Starner wrote,
>
> > They were characters being interchanged as text
> > in current use.
>
> They were in-line graphics being interchanged as though they were
> text.  And they still are.  And we still disagree.
>

They were units of things being interchanged in formats of MIME types
starting with text/ . From the beginning, Unicode has supported all the
cruft that's being interchanged in formats of MIME types starting with
text/.


Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Asmus Freytag via Unicode

  
  
On 2/14/2018 10:37 AM, Shriramana Sharma via Unicode wrote:

> On 14-Feb-2018 22:45, "Alastair Houghton" wrote:
>
>> I’d hope that Mark Davis has “UNICODE” on his car.  However, I’m not
>> sure how relevant it really is to this mailing list.
>
> You're right. My apologies. It *is* somewhat OT to the actual purpose
> of this list. But I figured if anyone knew the answer to my question
> they'd be here.

There are some who would claim that the 'unicode' list can't be said to
have a "purpose" and should best be avoided altogether :).

However, while there are posts and discussions that really don't belong
(or are bothersome, or otherwise a misuse of the list), this thread, if
it doesn't continue for another 500 posts, would not seem to qualify...

A./

  



Re: Why so much emoji nonsense?

2018-02-14 Thread Ken Whistler via Unicode



On 2/14/2018 12:49 PM, Philippe Verdy via Unicode wrote:



> RCLLTHTWHNLPHBTSWRFRSTNVNTDPPLWRTTXTLKTHS !

[ ... lots to say about the history of writing ... ]

> And the use (or abuse) of emojis is returning us to prehistory, when
> people drew animals on the walls of caves: this was a very slow form of
> communication, not giving rich semantics, full of ambiguities about
> what is really meant, and in fact a severe loss of knowledge where
> people will not communicate easily and rapidly.


=-O Perhaps Philippe was missing my point about how and why emoji are 
actually used.


--Ken



Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Asmus Freytag via Unicode

  
  
On 2/14/2018 8:14 AM, Shriramana Sharma via Unicode wrote:

> Given that in the US vanity vehicle registrations with arbitrary
> alphanumeric sequences upto 7 characters are permitted (I am correct I
> hope?), I wonder who (here?) owns the UNICODE registration?

Please note that the rules for this are set on the state level, not to
speak of territories. So, assuming the local regulations do not forbid
the word "UNICODE" for some reason, there could be some 50 plus cars
registered with that license plate.

The actual number is anybody's guess, although many people think they
know who owned the first one of these.

I've always thought it might be a neat idea to get matching plates
covering the entire series of UTFs, with each car chosen to reflect some
aspect of the encoding, e.g. "UTF-8" smaller than "UTF-32", and "UTF-7"
some oddball model...

A./
  
  



Re: Why so much emoji nonsense?

2018-02-14 Thread James Kass via Unicode
David Starner wrote,

> They were characters being interchanged as text
> in current use.

They were in-line graphics being interchanged as though they were
text.  And they still are.  And we still disagree.


Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 11:16 AM James Kass via Unicode 
wrote:

> That's one way of looking at it.  Another way would be that the emoji
> were definitely outside the scope of the Unicode project as encoding
> them violated Unicode's initial encoding principles.
>

They were characters being interchanged as text in current use. They are
more inside the scope than many of the line-drawing characters for 8-bit
computers that have been there since day one, and analogous to many of the
dingbats that have also been there since day one.


Re: Why so much emoji nonsense?

2018-02-14 Thread Philippe Verdy via Unicode
2018-02-14 20:50 GMT+01:00 Ken Whistler via Unicode :

>
> On 2/14/2018 12:53 AM, Erik Pedersen via Unicode wrote:
>
>> Unlike text composed of the world’s traditional alphabetic, syllabic,
>> abugida or CJK characters, emoji convey no utilitarian and unambiguous
>> information content.
>>
>
> I think this represents a misunderstanding of the function of emoji in
> written communication, as well as a rather narrow concept of how writing
> systems work and why they have evolved.
>
> RECALLTHATWHENALPHABETSWEREFIRSTINVENTEDPEOPLEWROTETEXTLIKETHIS
>

RCLLTHTWHNLPHBTSWRFRSTNVNTDPPLWRTTXTLKTHS !

The concept of vowels as distinct letters came later; even the letter A
initially represented a glottal stop consonant, sometimes mute, written
only to indicate a word whose first syllable did not start with a
consonant. This survives today in abjads and abugidas, where vowels became
optional diacritics, though they evolved into plain diacritics in Indic
abugidas.

The situation is even more complex because clusters of consonants were
also represented in early vowel-less alphabets to represent full syllables
(this formed the basis of today's syllabaries, where only some glyph
variants of the base consonant were introduced to distinguish their
vocalization; Indic abugidas, with their complex clusters where vowel
diacritics create contextual variant forms of the base consonant, are also
a remnant of this old age): the separation of phonetic consonants came only
later. Today's alphabets have a long history of evolution and adaptation to
new needs for more precise communication and easier distinctions in
languages that have also evolved; some new letters or diacritics were
progressively abandoned, but as the historic alphabets persisted, there
came the concept of digrams to represent a single sound by multiple
letters, instead of inventing a new letter or diacritic, because the
language in which these digrams were used almost never needed the phonetic
letter pairs or their phonology (or such a letter pair was needed so rarely
that the use of digrams did not make the text undecipherable given the
context of use). Over time the alphabets became less and less
representative of the phonology (which evolved more rapidly than
orthographies for texts that languages wanted to preserve, or because
various local phonetic variants of the languages could still remain unified
by keeping mute letters or letters representing sounds realized differently
across regions).

The invention of bicameral scripts later allowed easier distinction or
reading when contextual forms could be used to emphasize the structure
without necessarily using punctuation signs (the lowercase letters came
from handwriting, because the initial engraved letters were too difficult
to trace with a quill or pencil: letters were joined). Punctuation signs
came later, which could have deprecated the use of bicameral orthography,
but languages have continued to borrow terms from other languages, and the
bicameral distinction became important to preserve. The invention of
printing also produced artefacts in the orthography by the adoption of many
abbreviation signs (because paper and parchment were expensive), and
forced some simplifications of the handwritten style with a quill or pencil.

Our recent age of computers (or, even before it, mechanical typewriters)
has also dramatically simplified the alphabets because the character set
was severely reduced by limitations of the initial technologies (this could
have potentially killed all the abjads, abugidas, syllabaries or
ideo-phonographic scripts during the 20th century, had there not been
popular resistance to preserve the culture of the initial texts written by
humans, and notably the precious religious books): it is still difficult
today to preserve many of the non-alphabetic scripts, and there are also
difficulties in preserving the meaningful diacritics in abjads and abugidas
and even in alphabets, as well as bicameral distinctions. Finally, the
preservation of letters inherited from etymology to allow readers to infer
semantics from words is difficult: this is the well-known problem of
orthographic reforms that tend to remove mute letters, remove some phonetic
distinctions in letters and infer more and more of the semantics from the
context: we are in fact slowly returning to the old age of:

RCLLTHTWHNLPHBTSWRFRSTNVNTDPPLWRTTXTLKTHS !

And the use (or abuse) of emojis is returning us to prehistory, when
people drew animals on the walls of caves: this was a very slow form of
communication, not giving rich semantics, full of ambiguities about what
is really meant, and in fact a severe loss of knowledge where people will
not communicate easily and rapidly. Emoji are a threat to the
inherited culture, knowledge and science in general: we won't understand
what was meant, and will lose our language to a point where it will be
very unproductive and will generate more 

Re: Why so much emoji nonsense?

2018-02-14 Thread Ken Whistler via Unicode


On 2/14/2018 12:53 AM, Erik Pedersen via Unicode wrote:

> Unlike text composed of the world’s traditional alphabetic, syllabic,
> abugida or CJK characters, emoji convey no utilitarian and unambiguous
> information content.


I think this represents a misunderstanding of the function of emoji in 
written communication, as well as a rather narrow concept of how writing 
systems work and why they have evolved.


RECALLTHATWHENALPHABETSWEREFIRSTINVENTEDPEOPLEWROTETEXTLIKETHIS

The invention and development of word spacing, punctuation, and casing, 
among other elements of typography, represent the addition of meta-level 
information to written communication that assists in legibility, helps 
identify lexical and syntactic units, conveys prosody, and other 
information that is not well conveyed by simply setting down letters of 
an alphabet one right after the other.


Emoticons were invented, in large part, to fill another major hole in 
written communication -- the need to convey emotional state and 
affective attitudes towards the text. This is the kind of information 
that face-to-face communication has a huge and evolutionarily deep 
bandwidth for, but which written communication typically fails miserably 
at. Just adding a little happy face :-) or sad face :-( to a short email 
manages to convey some affect much more easily and effectively than 
adding on entire paragraphs trying to explain how one feels about what 
was just said. Novelists have the skill to do that in text without using 
little pictographic icons, but most of us are not professional writers! 
Note that emoticons were invented almost as soon as people started 
communicating in digital mediums like email -- so long predate anything 
Unicode came up with.


Other kinds of emoji that we've been adding recently may have a somewhat 
more uncertain trajectory, but the ones that seem to be most successful 
are precisely those which manage to connect emotionally with people, and 
which assist them in conveying how they *feel* about what they are writing.


So I would suggest that people not just dismiss (or diss) this ongoing 
phenomenon. Emoji are widely used for many good reasons. And of course, 
like any other aspect of writing, get mis-used in various ways, as well. 
But you can be sure that their impact on the evolution of world writing 
is here to stay and will be the topic of serious scholastic papers by 
scholars of writing for decades to come. ;-)


--Ken




Re: Why so much emoji nonsense?

2018-02-14 Thread James Kass via Unicode
Alastair Houghton wrote,

> ...but they were definitely within the scope of the
> Unicode project as encoding them provides interoperability.

That's one way of looking at it.  Another way would be that the emoji
were definitely outside the scope of the Unicode project as encoding
them violated Unicode's initial encoding principles.

The opposition was strong, but resistance was futile.  Anyone
interested in the arguments made at the time should check the Unicode
public list archives in late 2008 and early 2009.  Here's the link for
January 2009:
http://www.unicode.org/mail-arch/unicode-ml/y2009-m01/index.html

Surprisingly, though, I have found at least one roundabout use for the
emoji.  When reading message boards and comment pages I've found that
it's quite simple to skip any messages which are peppered with emoji
without missing anything of substance.

As far as interoperability goes, there's scads of emoji in the wild
which aren't currently in Unicode.  Every kind of hobby or interest
seems to generate emoji specific to that area of interest.


Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Shriramana Sharma via Unicode
On 14-Feb-2018 22:45, "Alastair Houghton" 
wrote:


I’d hope that Mark Davis has “UNICODE” on his car.  However, I’m not sure
how relevant it really is to this mailing list.


You're right. My apologies. It *is* somewhat OT to the actual purpose of
this list. But I figured if anyone knew the answer to my question they'd be
here.


Re: Why so much emoji nonsense?

2018-02-14 Thread Alastair Houghton via Unicode
On 14 Feb 2018, at 13:25, Shriramana Sharma via Unicode  
wrote:
> 
> From a mail which I had sent to two other Unicode contributors just a
> few days ago:
> 
> Frankly I agree that this whole emoji thing is a Pandora's box. It
> should have been restricted to emoticons to express facial or physical
> gestures which are insufficiently representable by words. When it
> starts representing objects like  then it becomes a problem as to
> where to draw the line.

A lot of the emoji were encoded because they were in use on Japanese mobile 
phones.  A fair proportion of those may very well not meet the selection 
factors (see ) required for new 
emoji, but they were definitely within the scope of the Unicode project as 
encoding them provides interoperability.

As for newer emoji, whether they are encoded or not is up to the UTC, and as I 
say, they apply (or are supposed to apply) the criteria on the “Submitting 
Emoji Proposals” page.  There is certainly an argument that the encoding of new 
emoji should be discouraged in favour of functionality at higher layers (e.g. 
 tags in HTML), but, honestly, I think that ship has probably sailed.  
Similarly there are, I think, good reasons to object to the skin tone and 
gender modifiers, but we’ve already opened that can of worms and so will now 
have to put up with demands for red hair (or quite probably, freckles, 
monobrows, different hats, hair, beard and moustache styles and so on).

Kind regards,

Alastair.

--
http://alastairs-place.net




Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Alastair Houghton via Unicode
On 14 Feb 2018, at 16:29, Shriramana Sharma via Unicode  
wrote:
> 
> Sorry but "UNICODE" does fit within those rules doesn't it?

Yes.  Stephane has misunderstood.  (Shriramana meant the literal text 
“UNICODE”, which is indeed composed of letters A-Z and meets the definition 
quoted.)

I’d hope that Mark Davis has “UNICODE” on his car.  However, I’m not sure how 
relevant it really is to this mailing list.

Kind regards,

Alastair.

--
http://alastairs-place.net




Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Stephane Bortzmeyer via Unicode
On Wed, Feb 14, 2018 at 09:59:53PM +0530,
 Shriramana Sharma  wrote 
 a message of 54 lines which said:

> Sorry but "UNICODE" does fit within those rules doesn't it?

I doubt that the Department of Motor Vehicles will accept "but it is
in category Ll" as a good reason :-)
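
To make the contrast concrete, here is a small Python sketch that sets the
character rule quoted from the New York DMV page against Unicode general
categories. The function names are invented for illustration, and the
default maximum length of 7 is an assumption taken from the question asked
in this thread, not from the DMV page.

```python
import string
import unicodedata

# Characters allowed by the rule quoted from the New York DMV page:
# "A character is a letter (A-Z), number (0-9) or space."
ALLOWED = set(string.ascii_uppercase + string.digits + " ")


def fits_quoted_rule(plate, max_len=7):
    """Check a plate string against the quoted character rule.

    max_len=7 is an assumption from the thread, not from the DMV page.
    Each space counts as one character, so len() is the right measure.
    """
    return 0 < len(plate) <= max_len and all(ch in ALLOWED for ch in plate)


def categories(text):
    """Return the Unicode general category of each character in text."""
    return [unicodedata.category(ch) for ch in text]
```

Under these assumptions the literal text "UNICODE" passes the check, while
a lowercase or accented letter fails it even though it is a perfectly good
letter (category Ll or Lu) as far as Unicode is concerned.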


Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Andrew West via Unicode
You can use ♥⭐➕ in California. Someone has U+1F913 🤓 (
https://www.instagram.com/p/BVYtIHensDu/)

Andrew


On 14 February 2018 at 16:24, Stephane Bortzmeyer via Unicode <
unicode@unicode.org> wrote:

> On Wed, Feb 14, 2018 at 09:44:06PM +0530,
>  Shriramana Sharma via Unicode  wrote
>  a message of 6 lines which said:
>
> > Given that in the US vanity vehicle registrations with arbitrary
> > alphanumeric sequences upto 7 characters are permitted (I am correct
> > I hope?), I wonder who (here?) owns the UNICODE registration?
>
> Won't work in New York, unfortunately
>
> https://dmv.ny.gov/learn-about-personalized-plates
>
> "A character is a letter (A-Z), number (0-9) or space. Each space
> counts as one character."
>
>


Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Shriramana Sharma via Unicode
Sorry but "UNICODE" does fit within those rules doesn't it?

On 14-Feb-2018 21:54, "Stephane Bortzmeyer"  wrote:

On Wed, Feb 14, 2018 at 09:44:06PM +0530,
 Shriramana Sharma via Unicode  wrote
 a message of 6 lines which said:

> Given that in the US vanity vehicle registrations with arbitrary
> alphanumeric sequences upto 7 characters are permitted (I am correct
> I hope?), I wonder who (here?) owns the UNICODE registration?

Won't work in New York, unfortunately

https://dmv.ny.gov/learn-about-personalized-plates

"A character is a letter (A-Z), number (0-9) or space. Each space
counts as one character."


Re: UNICODE vehicle vanity registration?

2018-02-14 Thread Stephane Bortzmeyer via Unicode
On Wed, Feb 14, 2018 at 09:44:06PM +0530,
 Shriramana Sharma via Unicode  wrote 
 a message of 6 lines which said:

> Given that in the US vanity vehicle registrations with arbitrary
> alphanumeric sequences upto 7 characters are permitted (I am correct
> I hope?), I wonder who (here?) owns the UNICODE registration?

Won't work in New York, unfortunately

https://dmv.ny.gov/learn-about-personalized-plates

"A character is a letter (A-Z), number (0-9) or space. Each space
counts as one character."



UNICODE vehicle vanity registration?

2018-02-14 Thread Shriramana Sharma via Unicode
Given that in the US vanity vehicle registrations with arbitrary
alphanumeric sequences upto 7 characters are permitted (I am correct I
hope?), I wonder who (here?) owns the UNICODE registration?

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: Why so much emoji nonsense?

2018-02-14 Thread Shriramana Sharma via Unicode
From a mail which I had sent to two other Unicode contributors just a
few days ago:

Frankly I agree that this whole emoji thing is a Pandora's box. It
should have been restricted to emoticons to express facial or physical
gestures which are insufficiently representable by words. When it
starts representing objects like  then it becomes a problem as to
where to draw the line.

I mean I can see the argument for  representing gratitude, but which
fruits are valid and which not... And which food items are valid and
which not, else you would get proposals for idli and dosa emojis as
well! (Those who don't know what those are see
https://en.wikipedia.org/wiki/Idli and
https://en.wikipedia.org/wiki/Dosa)

It seems to me that graphical items previously rejected as such are
now being encoded. I mean, if other things like bat, ball etc. are
encoded, then "why not this one" cannot be refused, but the question is
whether encoding bat and ball in the first place was in keeping with the
original intention or spirit of Unicode.

Anyhow, what is done is done and the Pandora's box is now open and I
don't envy the ESC their job. I don't know, maybe sometimes they may
just feel like hitting "ESC" too!

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा ူ၆ိျိါအူိ၆ါး



Re: Why so much emoji nonsense?

2018-02-14 Thread Konstantin Ritt via Unicode
2018-02-14 12:18 GMT+03:00 David Starner via Unicode :

> Even if mistakes were made, they were carved into stone, and going back is
> not an option.
>

Sure. However that doesn't mean Unicode should keep adding more and more
emoji nonsense.

A billion cat faces, pile of poo, * skin tone
Santa/vampire/superwoman/levitating man, keycaps and clocks - are they
really that important for the Standard to be encoded separately?! Well,
that was a rhetorical question...


Regards,
Konstantin


Re: Why so much emoji nonsense?

2018-02-14 Thread David Starner via Unicode
On Wed, Feb 14, 2018 at 12:55 AM Erik Pedersen via Unicode <
unicode@unicode.org> wrote:

> Dear Unicode Digest list members,
>
> Emoji, in my opinion, are almost entirely outside the scope of the Unicode
> project. Unlike text composed of the world’s traditional alphabetic,
> syllabic, abugida or CJK characters, emoji convey no utilitarian and
> unambiguous information content. Let us, therefore, abandon Emoji support
> in Unicode as a project that failed. If corporations want to maintain
> support for Emoji, let’s require them to use only the Private Use Area and,
> henceforth, confine Unicode expansion to attested characters from so far
> unsupported scripts.
>

Because ' has so much unambiguous information content. Or even just c.
(What's the phonetic value of that letter? Okay, I'll be "easy" on you;
what's the phonetic value of that letter in English? What about e?)

Also, who are the full members of Unicode?
http://www.unicode.org/consortium/members.html says Google, Apple, Huawei,
Facebook, Microsoft, etc. By show of hands, who wants a substantial part of
the user's data to become incompatible? I think they just voted this down.

Even ignoring that, this road has been crossed. Unicode will not tear out
anything, but if they could, people could probably survive Cuneiform or
Linear A going by the wayside. A not insubstantial part of the Unicode data
in the world includes emoji, and removing it would break everything. Like
many standards before that were radical changes, a new Unicode standard
without emoji would be dead in the water, and someone else would create a
competing back-compatible character standard and everyone would forget
about Unicode® and start using The One CCS®. It's like demanding that C use
bounds checking on its arrays, or that "island" go back to being spelled
"iland" now that we recognize it's not related to "isle". Even if mistakes
were made, they were carved into stone, and going back is not an option.


Why so much emoji nonsense?

2018-02-14 Thread Erik Pedersen via Unicode
Dear Unicode Digest list members,

Emoji, in my opinion, are almost entirely outside the scope of the Unicode 
project. Unlike text composed of the world’s traditional alphabetic, syllabic, 
abugida or CJK characters, emoji convey no utilitarian and unambiguous 
information content. Let us, therefore, abandon Emoji support in Unicode as a 
project that failed. If corporations want to maintain support for Emoji, let’s 
require them to use only the Private Use Area and, henceforth, confine Unicode 
expansion to attested characters from so far unsupported scripts.

Kind regards,
Erik Bjørn Pedersen — Victoria, B.C., Canada