Re: Emoji anomaly

2017-10-29 Thread Peter Edberg via Unicode
Hi André,

> U+1F321 ➜ U+1F32C do not have Emoji_Presentation property set. Time for me to 
> do some reading to determine why.


From https://www.unicode.org/emoji/charts-5.0/emoji-versions-sources.html 

you can see that these characters came into Unicode as a result of their being 
in the Webdings/Wingdings set, where they had a prior history of being 
non-emoji text characters. That is why they have Emoji_Presentation=No by 
default.
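
To check this yourself, here is a minimal sketch that reads lines in the emoji-data.txt format and reports which code points carry Emoji_Presentation. The SAMPLE text is a hand-written excerpt in that format, not a copy of the real file, and the helper name code_points_with is made up for illustration; for a real check, feed it the contents of the emoji-data.txt file instead.

```python
import re

# Hand-written sample in the emoji-data.txt format (an assumption for
# illustration, NOT copied from the real file).
SAMPLE = """\
1F300..1F320  ; Emoji_Presentation   # cyclone..shooting star
1F321         ; Emoji                # thermometer
1F324..1F32C  ; Emoji                # sun behind small cloud..wind face
1F32D..1F32F  ; Emoji_Presentation   # hot dog..burrito
"""

# "XXXX ; Property" or "XXXX..YYYY ; Property"; comment-only lines do not match.
LINE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def code_points_with(prop, data):
    """Return the set of code points that `data` lists under property `prop`."""
    found = set()
    for line in data.splitlines():
        m = LINE.match(line)
        if m and m.group(3) == prop:
            first = int(m.group(1), 16)
            last = int(m.group(2) or m.group(1), 16)
            found.update(range(first, last + 1))
    return found


emoji_presentation = code_points_with("Emoji_Presentation", SAMPLE)
for cp in (0x1F320, 0x1F321, 0x1F32C, 0x1F32D):
    flag = "Yes" if cp in emoji_presentation else "No"
    print(f"U+{cp:04X} Emoji_Presentation={flag}")
```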

- Peter E


> On Oct 29, 2017, at 6:47 AM, Andre Schappo via Unicode wrote:
> 
> Peter
> 
> Thank you very much for your informative response. I see that U+1F321 ➜ 
> U+1F32C do not have Emoji_Presentation property set. Time for me to do some 
> reading to determine why.
> 
> André
> 
>> On 29 Oct 2017, at 00:20, Peter Edberg wrote:
>> 
>> This is about characters U+1F327 and U+1F326.
>> 
>> The variation selector FE0F is *not* unnecessary with these. Looking at
>> https://www.unicode.org/Public/emoji/5.0/emoji-data.txt
>> those characters do *not* have the Emoji_Presentation property set, and they 
>> do have variation sequences defined.
>> 
>> From https://www.unicode.org/reports/tr51/#Emoji_Variation_Selector_Notes,
>> such singleton emoji characters
>> “should have emoji presentation selectors on base characters with 
>> Emoji_Presentation=No whenever an emoji presentation is desired”
>> 
>> - Peter E
>> 
>>> On Oct 28, 2017, at 4:11 AM, Andre Schappo via Unicode wrote:
>>> 
>>> 
>>> I am working on a Blog Article
>>> ( https://schappo.blogspot.co.uk/2017/10/computer-science-internationalization.html )
>>> and do not currently have access to OSX High Sierra; I am using OSX 
>>> Sierra. I would appreciate some help from someone using OSX High Sierra.
>>> 
>>> Using Sierra's Chinese Simplified Input Method the Emoji ️ and ️ have an 
>>> unnecessary U+FE0F variation selector appended. The other Emoji I have 
>>> tested with Sierra's Chinese Simplified Input Method do not have the 
>>> variation selector appended. Could someone please check if the same happens 
>>> with High Sierra?
>>> 
>>> Thank you
>>> 
>>> André
>>>   
>>> André Schappo
>>> https://schappo.blogspot.co.uk 
>>> https://twitter.com/andreschappo 
>>> https://weibo.com/andreschappo 
>>> https://groups.google.com/forum/#!forum/computer-science-curriculum-internationalization



Re: Emoji anomaly

2017-10-28 Thread Peter Edberg via Unicode
This is about characters U+1F327 and U+1F326.

The variation selector FE0F is *not* unnecessary with these. Looking at
https://www.unicode.org/Public/emoji/5.0/emoji-data.txt
those characters do *not* have the Emoji_Presentation property set, and they do 
have variation sequences defined.

From https://www.unicode.org/reports/tr51/#Emoji_Variation_Selector_Notes,
such singleton emoji characters
“should have emoji presentation selectors on base characters with 
Emoji_Presentation=No whenever an emoji presentation is desired”
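
As a rough sketch of what that recommendation means in practice, the following appends U+FE0F after each base character that defaults to text presentation, unless a variation selector is already there. The function name request_emoji_presentation and the TEXT_DEFAULT set are hypothetical; the set holds only the two characters discussed here and would normally be derived from emoji-data.txt.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

# Assumption for illustration: only the two characters from this thread.
# Both have Emoji_Presentation=No.
TEXT_DEFAULT = {"\U0001F326", "\U0001F327"}


def request_emoji_presentation(text, text_default):
    """Append U+FE0F after each character found in `text_default`,
    unless that character is already followed by a variation selector."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in text_default and nxt not in (VS15, VS16):
            out.append(VS16)
    return "".join(out)


# First character has no selector, second already has one.
result = request_emoji_presentation("\U0001F327 \U0001F326\uFE0F", TEXT_DEFAULT)
print(" ".join(f"U+{ord(c):04X}" for c in result))
```

An explicit U+FE0E is left alone on purpose, on the assumption that whoever wrote it wanted text presentation.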

- Peter E

> On Oct 28, 2017, at 4:11 AM, Andre Schappo via Unicode wrote:
> 
> 
> I am working on a Blog Article
> ( https://schappo.blogspot.co.uk/2017/10/computer-science-internationalization.html )
> and do not currently have access to OSX High Sierra; I am using OSX 
> Sierra. I would appreciate some help from someone using OSX High Sierra.
> 
> Using Sierra's Chinese Simplified Input Method the Emoji ️ and ️ have an 
> unnecessary U+FE0F variation selector appended. The other Emoji I have tested 
> with Sierra's Chinese Simplified Input Method do not have the variation 
> selector appended. Could someone please check if the same happens with High 
> Sierra?
> 
> Thank you
> 
> André
>   
> André Schappo
> https://schappo.blogspot.co.uk 
> https://twitter.com/andreschappo 
> https://weibo.com/andreschappo 
> https://groups.google.com/forum/#!forum/computer-science-curriculum-internationalization



Re: Should U+3248 ... U+324F be wide characters?

2017-08-18 Thread Peter Edberg via Unicode
Per UTS #51 (see http://www.unicode.org/reports/tr51/#Design_Guidelines):

"Current practice is for emoji to have a square aspect ratio, deriving from 
their origin in Japanese. For interoperability, it is recommended that this 
practice be continued with current and future emoji. They will typically have 
about the same vertical placement and advance width as CJK ideographs."
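
For the East_Asian_Width side of this thread, here is a small sketch using Python's standard unicodedata module to see what is currently assigned to U+3248..U+324F and to an emoji such as U+1F600. The helper name widths is made up for illustration, and the values reported depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata


def widths(first, last):
    """Map each code point in [first, last] to its East_Asian_Width value."""
    return {
        f"U+{cp:04X}": unicodedata.east_asian_width(chr(cp))
        for cp in range(first, last + 1)
    }


# Circled numbers on black square, the range asked about in this thread.
for label, width in widths(0x3248, 0x324F).items():
    print(label, width)

# GRINNING FACE, from the 1F600..1F64F range quoted below.
print("U+1F600", unicodedata.east_asian_width("\U0001F600"))
print("Unicode data version:", unicodedata.unidata_version)
```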

- Peter E

> On Aug 18, 2017, at 1:48 PM, Philippe Verdy via Unicode wrote:
> 
> I don't think that emojis are necessarily "square"; they could be wider 
> (e.g. a train or a snake or a horizontal railway, or a group of several 
> people, or a cloud) or narrower (e.g. a candle).
> 
> Rendering them as square will make sense only in contexts where this makes 
> sense **if possible**: monospaced fonts. But there are cases where a single 
> character cell would not be enough and multiple cells would be needed 
> (notably in text terminals, but as well in sinographic contexts using 
> multiple em-squares in a row).
> 
> The classification of widths in CJK is there to help determine how many cells 
> will be needed in two cases: narrow rectangular cells used in text terminals, 
> or square cells in classic sinographic typesetting (which is still not 
> mandatory because variable-width rendering is also possible, even if it is 
> less common, using more specific fonts for such artistic use or to correctly 
> render handwritten calligraphy). This classification of widths makes no sense 
> in Latin, where variable width is still preferred and more common.
> 
> So there will be both variants for variable-width and "monospaced" 
> (cell-based) rendering of emojis, like they both exist for CJK and Latin: 
> Latin letters have a "narrow" width in sinographic square contexts only to 
> allow two letters side-by-side per square instead of centering them with wide 
> gaps or rendering them in widened variants. Most Asian emojis from CJK 
> charsets will render in a single square cell, but others may still need two 
> square cells for better rendering (without having to use variable width that 
> would break the grid layout).
> 
> When rendering Latin words in CJK contexts, the alignment to the grid may 
> also be made only on spans of Latin letters (one or more words), by 
> recentering the span in a row of as many cells as would fit: it would be even 
> more useful for Arabic sequences. This technique however would not fit very 
> well in classic "text terminals" where half-width Latin, Hebrew and Arabic 
> will still be preferable (or full-width for some Arabic letters with some 
> extenders, or some long Arabic ligatures).
> 
> 
> 
> 2017-08-18 14:21 GMT+02:00 Andre Schappo:
> 
>> On 18 Aug 2017, at 00:50, Philippe Verdy via Unicode wrote:
>> 
>> 
>> 2017-08-17 18:46 GMT+02:00 Asmus Freytag (c) via Unicode:
>> On 8/17/2017 7:47 AM, Philippe Verdy wrote:
>>> 2017-08-17 16:24 GMT+02:00 Mike FABIAN via Unicode:
>>> Asmus Freytag via Unicode wrote:
>>> Most emoji now have "W", for example:
>>> 
>>> 1F600..1F64F;W   # So[80] GRINNING FACE..PERSON WITH FOLDED HANDS
>>> 
>>> That seems correct because emoji behave more like Ideographs.
>>> 
>>> Isn’t this the same for “CIRCLED NUMBER TEN ON BLACK SQUARE”?
>>> This seems to me also more like an Ideograph.
>>>  
>>> Not really. They have existed for an extremely long time without being 
>>> bound to ideographs or sinographic requirements on metrics. Notably their 
>>> baseline and vertical extension do not follow the sinographic em-square 
>>> layout convention (except when they are rendered with CJK fonts, or were 
>>> encoded in documents with legacy CJK encodings, also rendered with suitable 
>>> CJK fonts being then preferred to Latin fonts which won't use the large 
>>> sinographic metrics).
>>> 
>>> If they were like emojis, they would actually be larger: I think it is a 
>>> case for defining an emoji variant for them (where they could also be 
>>> colored or have some 3D-like look)
>> 
>> There's an emoji variant for the standard digits.
>> 
>> Are you speaking about circled numbers? I don't think so.
>> 
>> I (and Mike as well, to whom I was replying) was speaking about a good case 
>> for defining an emoji variant of these circled (or squared) numbers (Mike 
>> spoke about circled number 10, which is not encoded as an emoji and not even 
>> as an ideograph, and which he proposed to give a wide width property like 
>> ideographs).
>> 
>> 
> 
> Are not CJK ideographs both (W)ide and (S)quare? Does (W)ide imply or define 
> that the ideograph should also be (S)quare?
> 
> It seems to me that there are many characters that are both (W)ide and 
> 

Re: CLDR 'B'

2017-06-05 Thread Peter Edberg via Unicode

> On Jun 5, 2017, at 1:20 AM, Neil Shadrach via Unicode wrote:
> 
> 
> http://cldr.unicode.org/translation/date-time-patterns 
> 
> 
> How are 'B' values added for languages that do not have them?
> I cannot see an option for this in the survey tool which just refers to the 
> existing list.

If you want to override the inherited pattern for one of the 5 existing 'B' 
skeletons (Bh, Bhm, Bhms, EBhm, EBhms), you should be able to do that with no 
problem; please let us know if that does not work for you.

If in a particular locale you want to add another skeleton to the existing 5, 
please file a ticket:
http://unicode.org/cldr/trac/newticket 
(we should be able to get to that within a few days)

- Peter E