> On 26 Feb 2021, at 23:37, Jonathan Kew <[email protected]> wrote:
>
> On 26/02/2021 22:00, Yannis Haralambous wrote:
>> dear TeX-hyphen members,
>> I'm new to this list (although not necessarily new to TeX hyphenation :-)
>> Here is the problem: we are preparing hyphenation patterns for Uyghur,
>> written in Arabic script.
>> As letters must be in initial/medial form before the hyphen and medial/final
>> form on the next line begin,
>> I was wondering if we could change TeX internals so that instead of one,
>> three hyphenchars are used:
>> ^^^^200d and `-' on the upper line and ^^^^200d on the lower line, in order
>> to obtain the equivalent
>> of \discretionary{^^^^200d-}{^^^^200d}{}
Hi Jonathan,

> The problem with this is that it wouldn't be the appropriate \discretionary
> in the case where the letter before the hyphenation position is a right-
> (rather than dual-) joining character.

Sorry I don't understand what you mean. You mean when it is a biform character like the waw or the ra? In that case the ZWJ will do no harm. It is an invisible character that does not affect glyphs of biform characters.

> So it's not sufficient to just have an extended form of \hyphenchar; we would
> also need hyphenation patterns to record two different types of break
> position: one for a break between joined letters, and one for a break between
> non-joined letters.

Not at all. In Arabic-script Uyghur you have only one rule: the glyphs of quadriform characters have to take initial/medial form before the hyphen and medial/final on the next line. When they are biform they remain as they are, and the ZWJ doesn't change them at all.

> Or else the engine needs to know (perhaps from the Unicode properties of the
> adjacent characters) which form to use -- but if we accept that the engine
> can use knowledge of specific Unicode properties here, then it can take
> responsibility for inserting the ZWJs internally, without needing to change
> \hyphenchar.

It is precisely to avoid these complications that I'm proposing to use ZWJ: the normal behavior of ZWJ is to change preceding quadriform isolated into initial and final into medial, and following isolated into final and initial into medial. If we can introduce it into the character string then the rendering engine will do the right thing. Am I wrong to think so?

Yannis

>
> (On re-reading, perhaps that's more like what you meant anyway?)
>
> JK
>
>> Arthur said he would have a different solution.
>> I would personally play with the DVI (resp.
>> XDV) file, even though the
>> widths of initial/medial forms
>> are quite different from those of final/isolated forms, which would require
>> a global redistribution of
>> space in the line.
>> Cheers,
>> Yannis

<http://www.imt-atlantique.fr/>
Yannis HARALAMBOUS
Professor
Computer Science Department
UMR CNRS 6285 Lab-STICC
<http://perso.telecom-bretagne.eu/yannisharalambous/>
<https://twitter.com/y_haralambous>
<https://www.linkedin.com/in/yannis-haralambous-5529073?trk=hp-identity-name>
Technopôle Brest-Iroise CS 83818
29238 Brest Cedex 3, France
A school of IMT <http://www.imt.fr/>

Tact in audacity is knowing how far one can go too far.
(Jean Cocteau)
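
P.S. As a rough illustration of the ZWJ behaviour discussed in this thread, here is a minimal Python sketch of the joining logic only. It is not TeX code and does not model any engine or font shaper. The joining-type table is a small hand-picked subset of the Unicode ArabicShaping data, the helper names `form` and `split_with_zwj` are hypothetical, and transparent characters (marks) are ignored for simplicity.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Hand-picked subset of Unicode joining types; illustration only.
# D = dual-joining (quadriform), R = right-joining (biform),
# C = join-causing. Anything not listed is treated as U (non-joining).
JOINING_TYPE = {
    "\u0628": "D",  # BEH
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    "\u0631": "R",  # REH
    "\u0648": "R",  # WAW
    ZWJ: "C",
}


def joining_type(ch):
    """Joining type of ch, defaulting to non-joining."""
    return JOINING_TYPE.get(ch, "U")


def form(text, i):
    """Contextual form of the letter text[i] under this simplified model."""
    cur = joining_type(text[i])
    prev = joining_type(text[i - 1]) if i > 0 else "U"
    nxt = joining_type(text[i + 1]) if i + 1 < len(text) else "U"
    # A letter connects backwards if it is D or R and the previous
    # character offers a connection (D or C).
    joins_prev = cur in ("D", "R") and prev in ("D", "C")
    # Only dual-joining letters connect forwards.
    joins_next = cur == "D" and nxt in ("D", "R", "C")
    if joins_prev and joins_next:
        return "medial"
    if joins_next:
        return "initial"
    if joins_prev:
        return "final"
    return "isolated"


def split_with_zwj(word, pos):
    """String-level analogue of \\discretionary{^^^^200d-}{^^^^200d}{}."""
    return word[:pos] + ZWJ + "-", ZWJ + word[pos:]


# Quadriform letter (LAM) before the break: ZWJ keeps it medial.
before, after = split_with_zwj("\u0628\u0644\u0645", 2)
print(form(before, 1), form("\u0628\u0644-", 1))

# Biform letter (WAW) before the break: ZWJ leaves it unchanged.
before_waw, _ = split_with_zwj("\u0628\u0648\u0645", 2)
print(form(before_waw, 1), form("\u0628\u0648-", 1))
```

The same `form` function can be applied to the post-break fragment to examine the right-joining case Jonathan raises.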
