The OpenType spec doesn’t in any way suggest that the bits be used that 
way. It’s impossible to assert that there are no applications out there that do 
that, but I wouldn’t expect there to be many widely-used apps that do that 
today.

On the other hand, one thing the bits might affect is behaviour like 
font selection / font binding. For example, if you paste plain text into a 
rich-text app, it must select a default font for that text, since it’s a 
rich-text app. Now, an obvious choice would be to use the font applied to the 
characters on either side of the insertion point. But if it turned out that 
that font didn’t support the text being pasted, that would create a rendering 
problem; so the app probably wants to avoid that. An app just might use these 
bits as a heuristic to decide whether the current font can support the text or 
not.
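
To make that heuristic concrete, here is a minimal sketch in Python of how an 
app might test one of these bits before reusing the font at the insertion 
point. The function and constant names are made up for illustration, and the 
four ulUnicodeRange values are assumed to have been read from the font's OS/2 
table already. The bit numbers 75 (Ethiopic) and 83 (Yi Syllables) are the 
OS/2 assignments. I'm not claiming any particular app does exactly this.

```python
# Hypothetical sketch: names are illustrative, not taken from any real app.
ETHIOPIC_BIT = 75       # OS/2 ulUnicodeRange bit for Ethiopic
YI_SYLLABLES_BIT = 83   # OS/2 ulUnicodeRange bit for Yi Syllables / Yi Radicals


def has_range_bit(ul_unicode_ranges, bit):
    """Return True if `bit` (0..127) is set.

    `ul_unicode_ranges` is (ulUnicodeRange1, ..., ulUnicodeRange4), each a
    32-bit unsigned value; bits 0-31 live in the first field, 32-63 in the
    second, and so on.
    """
    if not 0 <= bit <= 127:
        raise ValueError("ulUnicodeRange only defines bits 0..127")
    field, offset = divmod(bit, 32)
    return bool((ul_unicode_ranges[field] >> offset) & 1)


def choose_font_for_paste(current_font, current_ranges, needed_bit, fallback_font):
    """Keep the font at the insertion point only if it claims the needed range."""
    if has_range_bit(current_ranges, needed_bit):
        return current_font
    return fallback_font


# A font whose only claimed range is Ethiopic (bit 75 = bit 11 of the third field).
ethiopic_only = (0, 0, 1 << (ETHIOPIC_BIT - 64), 0)
chosen = choose_font_for_paste("MyEthiopic", ethiopic_only, YI_SYLLABLES_BIT, "Fallback")
```

Since a set bit is only a claim made by the font, an app that cares about 
correctness would presumably still check the cmap for the actual characters.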

I say that Unicode-range bits probably wouldn’t affect rendering in current 
apps, though that wasn’t necessarily the case in the past. Word 97 was one of 
the very first mainstream apps to support Unicode, but it was limited in the 
scripts that were actually supported. Word 2000 was still early in terms of 
mainstream Unicode support, and still had limitations. I recall working on font 
projects for Ethiopic and Yi scripts (with SIL at the time) and needing to set 
Unicode range or codepage bits in order to get text working in Word using our 
fonts. One particular issue was a font-binding issue: Word would lump the Yi 
characters in with CJK (they’re not Western, and they’re not the few complex 
scripts that are supported, so assume they’re CJK), but wouldn’t allow the font 
to be applied until I set bits to make Word think the font supports CJK. But 
then with the Ethiopic font, there was a different effect — a rendering issue — 
that became apparent: Ethiopic characters have many different widths, but Word 
ignored the actual glyph metrics and displayed every glyph with the same width 
(the apparent assumption being that the characters are all CJK and all have the 
same width). Again, bits had to be set to make it observe the actual glyph 
metrics. IIRC, in one case I needed to set the Shift-JIS code page bit, and in 
the other case, to set a bit for one of the kana blocks.

But that was many years ago now. I can’t recall seeing Unicode-range bits 
affect rendering in a long time.


Peter


From: Unicode [mailto:[email protected]] On Behalf Of Neil Patel via 
Unicode
Sent: Tuesday, February 27, 2018 8:46 AM
To: [email protected]; [email protected]
Subject: Re: Unicode Digest, Vol 50, Issue 20

Do the ulUnicodeRange bits get used to dictate rendering behavior or script 
recognition?

I am just wondering about whether the lack of bits to indicate an Adlam charset 
can cause other issues in applications.


-Neil


On Sat, Feb 24, 2018 at 1:00 PM, via Unicode 
<[email protected]<mailto:[email protected]>> wrote:
Send Unicode mailing list submissions to
        [email protected]<mailto:[email protected]>

To subscribe or unsubscribe via the World Wide Web, visit
        
http://unicode.org/mailman/listinfo/unicode
or, via email, send a message with subject or body 'help' to
        [email protected]<mailto:[email protected]>

You can reach the person managing the list at
        [email protected]<mailto:[email protected]>

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Unicode digest..."

Today's Topics:

   1. Re: metric for block coverage (Norbert Lindenberg via Unicode)


---------- Forwarded message ----------
From: Norbert Lindenberg via Unicode 
<[email protected]<mailto:[email protected]>>
To: Khaled Hosny <[email protected]<mailto:[email protected]>>
Cc: James Kass <[email protected]<mailto:[email protected]>>, Adam 
Borowski <[email protected]<mailto:[email protected]>>, Unicode Public 
<[email protected]<mailto:[email protected]>>, Norbert Lindenberg 
<[email protected]<mailto:[email protected]>>
Bcc:
Date: Fri, 23 Feb 2018 10:15:32 -0800
Subject: Re: metric for block coverage

> On Feb 18, 2018, at 3:26 , Khaled Hosny via Unicode 
> <[email protected]<mailto:[email protected]>> wrote:
>
> On Sun, Feb 18, 2018 at 02:14:46AM -0800, James Kass via Unicode wrote:
>> Adam Borowski wrote,
>>
>>> I'm looking for a way to determine a font's coverage of available scripts.
>>> It's probably reasonable to do this per Unicode block.  Also, it's a safe
>>> assumption that a font which doesn't know a codepoint can do no complex
>>> shaping of such a glyph, thus looking at just codepoints should be adequate
>>> for our purposes.
>>
>> You probably already know that basic script coverage information is
>> stored internally in OpenType fonts in the OS/2 table.
>>
>> https://docs.microsoft.com/en-us/typography/opentype/spec/os2
>>
>> Parsing the bits in the "ulUnicodeRange..." entries may be the
>> simplest way to get basic script coverage info.
>
> Though this might not be very reliable since OpenType does not have a
> definition of what it means for a Unicode block to be supported; some
> font authoring tools use a percentage, others use the presence of any
> characters in the range, and fonts might even provide incorrect data for
> any reason.
>
> However, I don’t think script or block coverage is that useful, what
> users are usually interested in is the language coverage.
>
> Regards,
> Khaled


All true. In addition, ulUnicodeRange ran out of bits around Unicode 5.1, so 
scripts/blocks added to Unicode after that, such as Javanese, Tangut, or Adlam, 
cannot be represented.
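
For blocks without a bit, coverage has to be computed from the characters the 
font actually maps. A rough sketch in Python, assuming the set of mapped 
codepoints has already been extracted from the font's cmap (the function name 
and the sample data are made up for illustration; the block ranges are from 
the Unicode block definitions):

```python
# Block ranges from Unicode's Blocks.txt; neither block has a ulUnicodeRange bit.
BLOCKS = {
    "Javanese": (0xA980, 0xA9DF),
    "Adlam": (0x1E900, 0x1E95F),
}


def block_coverage(codepoints, first, last):
    """Fraction of the block [first, last] present in `codepoints`.

    The denominator is the size of the whole block, including unassigned
    codepoints, so even a complete font will score below 1.0 for blocks
    that are not fully assigned.
    """
    covered = sum(1 for cp in set(codepoints) if first <= cp <= last)
    return covered / (last - first + 1)


# Made-up sample: a font mapping the first 48 codepoints of the Adlam block.
sample_cmap = set(range(0x1E900, 0x1E930))
coverage = {name: block_coverage(sample_cmap, lo, hi) for name, (lo, hi) in BLOCKS.items()}
```

Dividing by the number of assigned characters instead would need the Unicode 
Character Database, and as Khaled notes there is no agreed threshold for 
calling a block "supported" either way.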

Norbert



