On 9/22/2023 6:16 AM, Hamid,Idris wrote:


------ Original Message ------
From "Hans Hagen" <j.ha...@xs4all.nl>
To "Hamid,Idris" <idris.ha...@colostate.edu>; "mailing list for ConTeXt users" <ntg-context@ntg.nl>
Date 9/21/2023 3:29:22 PM
Subject Re: [NTG-context] Re: Toggling the symbol for the zero-width joiner and 
related Unicode control characters

Many thanks, Hans. The method appears to work only for nbsp, not for zwj etc. 
Here is the updated MWE:

=======
\startTEXpage[offset=1em]
\disabletrackers[typesetters.directions]
\disabletrackers[typesetters.zwj]
\disabletrackers[typesetters.zwnj]
\disabletrackers[typesetters.nbsp]
\definedfont[almfixed at 14pt]
ZWJ: ‌
ZWNJ: ‍
NBSP:
\stopTEXpage
=======

See attached, please advise.

joiners are part of replacement etc. and can come and go ... they are
characters (we could visualize them but one never knows for sure if one
sees them)

nbsp are spaces and become glue that we can trace reliably in the node list

Many thanks. Ok, here is another MWE featuring a workaround using fallbacks:

==============
\definefontfallback[nosymbols] [file:lmmono10-regular] [200C,200D] [force=yes]
\starttypescript [serif] [alm] [name]
     \definefontsynonym [Serif] [ArabicLatinSerif]
\stoptypescript
\starttypescript [mono] [alm] [name]
     \definefontsynonym [Mono]  [ArabicLatinMono]
\stoptypescript
\starttypescript [serif] [alm]
     \definefontsynonym [ArabicLatinSerif] [file:almfixed] % [fallbacks=nosymbols]
\stoptypescript
\starttypescript [mono] [alm]
     \definefontsynonym [ArabicLatinMono] [file:almfixed] [fallbacks=nosymbols]
\stoptypescript
\starttypescript [almfixed-nosymbols]
\definetypeface [\typescriptone] [rm] [serif] [alm] [default]
\definetypeface [\typescriptone] [tt] [mono] [alm] [default]
\stoptypescript
\usetypescript[almfixed-nosymbols]
\setupbodyfont[almfixed-nosymbols,12pt]
\startTEXpage[offset=1em]
\rm
ZWJ: ‌
ZWNJ: ‍
NBSP:
\tt
ZWJ: ‌
ZWNJ: ‍
NBSP:
\stopTEXpage
==============

Under \rm we get the symbols, and under \tt they are suppressed. Of course it 
doesn't matter what fallback font one uses, as long as it has no 
control-character symbols.

1. Can this approach be generalized to get what we want, viz., a way to toggle 
the symbols?

given the inconsistency in what is or is not in a font, the only way out is to have our own visualization (consistent across fonts), and even then it would add some mess because we're talking of a mix of characters that can be gone (as part of rendering) or are not characters at all but spacing

so, in that case only 'verbatim' is a candidate for visualization, not so much typeset text
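
For the narrower, font-dependent case in the MWE above, the \rm/\tt switch could be replaced by a mode. This is only a sketch under assumptions: "showcontrols", "ControlDemo" and "ControlFont" are hypothetical names, almfixed is assumed to carry the control-character glyphs, and lmmono10-regular is assumed not to. It does not give the font-independent visualization described above.

```tex
% Fallback without glyphs for U+200C/U+200D, as in the MWE above.
\definefontfallback [nosymbols] [file:lmmono10-regular] [200C,200D] [force=yes]

% "showcontrols" is a hypothetical mode name, not something ConTeXt predefines.
\startmode[showcontrols]
  % Mode on: use the font as-is, so its own control symbols are kept.
  \definefontsynonym [ControlDemo] [file:almfixed]
\stopmode

\startnotmode[showcontrols]
  % Mode off: route the two joiners to the fallback that lacks them.
  \definefontsynonym [ControlDemo] [file:almfixed] [fallbacks=nosymbols]
\stopnotmode

\definefont [ControlFont] [ControlDemo at 14pt]

\startTEXpage[offset=1em]
\ControlFont
ZWNJ: a\utfchar{"200C}b
ZWJ: a\utfchar{"200D}b
\stopTEXpage
```

With a file saved as, say, test.tex, running "context --mode=showcontrols test.tex" should keep the symbols, and a plain "context test.tex" should suppress them. Whether this holds for other fonts depends entirely on what each font contains.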

2. \enabletrackers[typesetters.nbsp] gives a colored box, which is at least 
something. But how can we get the NBSP symbol that's already in the font?

it's gone by that time ... the line break mechanism uses glue, not characters
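
The nbsp tracing mentioned above can be compared side by side. A minimal sketch, assuming the tracker state is consulted when each paragraph is built (hence the explicit \par before switching it off); \utfchar{"00A0} stands in for a literal no-break space typed in the source:

```tex
\starttext
\definedfont[almfixed at 14pt]

\enabletrackers[typesetters.nbsp]
traced: a\utfchar{"00A0}b
\par % finish this paragraph while the tracker is still on

\disabletrackers[typesetters.nbsp]
untraced: a\utfchar{"00A0}b
\par
\stoptext
```

Either way the symbol from the font does not come back, since what gets traced is the glue and not the original character.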

3. Ideally:
a. we want all Unicode control symbols to show up in verbatim or in \typebuffer 
(as in a text editor);

only there (with some non-interfering rendering i guess) and even then it's probably an additional pass over the node list

b. we want all Unicode control symbols to be suppressed in final pdf output 
(for, e.g., printing).

they basically are, unless some font feature keeps them around, which is out of our control

But some fonts meant for printing have symbols for Unicode control chars -- 
that poses a challenge.

so an inconsistent mess not worth wasting time on (as this is hobbyism, only fun can be a motivational factor)

And some fonts meant for verbatim/editing do not have symbols for the control 
chars -- that also poses a challenge.  AlmFixed, of course, has them.

Most minimally decent Arabic fonts have symbols for the Unicode control chars 
as default, including Scheherazade, Amiri, Uthmanic, and Noto Naskh Arabic -- 
all free fonts.

Industry workhorses like Linotype Lotus (Arabic) also have them.

i'm not interested in those .. can't afford them for playing around
purposes .. we only look into commercial fonts if we get a dozen unrestricted copies for context developers

Uniscribe applications like Notepad/Word allow for toggling in a WYSIWYG 
context -- can't speak for HarfBuzz -- so there is no harm in having explicit 
symbols in the font.

sure, as long as there is no rendering ... they show the input

The upshot is that, for non-Latin scripts, some toggling capability in ConTeXt 
is important to have -- even inescapable for Arabic-script publishing.

a bit of subjective arguing :-)

Perhaps others who use Arabic-script or Indic, etc., can chime in. Am hopeful 
that we can figure something out!

sure, but not with 'instant priority' (unless it is some project)

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / https://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : https://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : https://contextgarden.net
___________________________________________________________________________________
