Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-26 Thread Philip Taylor
Yannis Haralambous wrote: In fact, Arabic is not hyphenated. That is presumably because of the existence of the /kashida/, Yanni. What is interesting is that the W3C notes that the Arabic /script/ (as opposed to the /language/) /may/ be hyphenated, and offers Uyghur as an example: When

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
In fact, Arabic is not hyphenated. > On 25 March 2021 at 21:14, Philip Taylor wrote: > > Not being an Arabist, I have no idea whether the output from the following is > correct or not, but I thought that it would be fun to run Bruno's code on > some 'real' Arabic rather than just on

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Not being an Arabist, I have no idea whether the output from the following is correct or not, but I thought that it would be fun to run Bruno's code on some 'real' Arabic rather than just on sequences — % !TeX Program = Ini-XeTeX \let \dump = \relax \input xelatex.ini \begingroup      

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
What is the reason for the \makeatletter at line~29, Bruno? It appears to behave identically without it.

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
For those who want a complete working example based on Yannis's code, here is one, to be compiled with "xetex -ini -etex test.tex" and in which the Arabic "word" is hyphenated at every letter. Thank you Yannis and others. Bruno On 3/25/21 6:50 PM, Yannis Haralambous wrote: > Silly of me, when

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Silly of me: when I add the \lccode information to the TeX file as well… it works. And not only once in a word, but for all hyphenation points. I was convinced I had done so, but apparently I hadn't. Anyway, now everything suddenly seems to work. Totally
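
The fix described in this message can be sketched roughly as follows. This is only an illustration assembled from the fragments quoted elsewhere in the thread, not the poster's actual files. The `^^^^200d` notation for U+200D is an assumption (the archive shows the character only as "200d"), and the split into a format-building part and a document part is likewise assumed.

```tex
% --- format-building time (IniTeX only; \patterns is rejected in a normal run) ---
\catcode`\{=1 \catcode`\}=2    % IniTeX starts without grouping characters
\catcode`\^=7                  % needed before the ^^^^ notation can be used
\lccode"200D="200D             % a pattern character must have a nonzero \lccode
\patterns{^^^^200d1^^^^200d}   % permit a break between two consecutive ZWJs

% --- in the document itself ---
\lccode"200D="200D             % repeated here, in case the value was reset meanwhile
\lefthyphenmin=1 \righthyphenmin=1
```

The second \lccode assignment is the step the message says had been missing: TeX consults \lccode both when patterns are read and when it looks for words to hyphenate.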

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I posted a question on tex.stackexchange.com, let us see if somebody can give us more input: https://tex.stackexchange.com/questions/588952/zwj-not-working-in-hyphenation-patterns-in-xelatex > On 25 March 2021 at 17:30, Bruno Le Floch wrote: > > Check \the\catcode"200D perhaps, it does not

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Even when I reset \catcode to 11, the behavior is the same. Only Arabic words containing 200d are not hyphenated; others are hyphenated as expected. On 25 March 2021 at 17:30, Bruno Le Floch wrote: Check \the\catcode"200D

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I checked it, \catcode"200D is 12 > On 25 March 2021 at 17:30, Bruno Le Floch wrote: > > Check \the\catcode"200D perhaps, it does not seem to be set in your example > document. Note that the LaTeX format might reset that catcode. > > The other mystery is why Arabic words seem to only be

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Check \the\catcode"200D perhaps, it does not seem to be set in your example document. Note that the LaTeX format might reset that catcode. The other mystery is why Arabic words seem to only be hyphenated once. On 3/25/21 5:17 PM, Yannis Haralambous wrote: > Well it is neither polyglossia nor
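
A minimal sketch of the check suggested here, assuming it is placed in the document after all packages have been loaded; it uses only primitives, so it should behave the same in plain XeTeX and XeLaTeX. Printing the \lccode as well is an addition for convenience, not something the message asks for.

```tex
% Report the current codes for U+200D (ZERO WIDTH JOINER) on the terminal and in the log.
\message{catcode of U+200D = \the\catcode"200D}
\message{lccode of U+200D = \the\lccode"200D}
```

A \lccode of 0 would mean that TeX does not treat the character as a letter when it looks for words to hyphenate.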

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Well, it is neither polyglossia nor fontspec, because I have run the following

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
And now, thanks to Jonathan, only one page — % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\# = 6 \catcode `\^ = 7 \catcode `\^^I = 10 \catcode `\% = 14 \let \bgroup = { \let \egroup = } \let \endgraf = \par \tracingonline = 1 \tracinglostchars = 2

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Jonathan Kew wrote: I don't see what \dimen 10 is doing in any of this. It never seems to get set, so I assume it remains zero. Ah, I took this from plain.tex's definition of \maxdimen, but it is possible (nay, probable, if not absolutely certain) that in the absence of plain.tex being

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
I don't see what \dimen 10 is doing in any of this. It never seems to get set, so I assume it remains zero. If so, you have \vsize zero, so minimal pages must be expected. On 25/03/2021 12:28, Philip Taylor wrote: Jonathan Kew wrote: See

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I checked the \catcode of 200d; it is 12. As for \XeTeXcharclass and \XeTeXinterchartoks, they are used in bidi.sty but not for 200d. I have found other pattern files where 200d is used, for example in Assamese: \patterns{ % GENERAL RULE % Do not break either side of ZERO-WIDTH JOINER

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Jonathan Kew wrote: See https://tug.org/pipermail/xetex/2014-January/025129.html Thank you, Jonathan.  Now formatted and included in the next iteration (code below, with "showhyphens.tex" attached).  I /believe/ that "showhyphens.tex" is functionally equivalent to David's code, but as he

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Zdenek Wagner
On Thu, 25 Mar 2021 at 12:55, Jonathan Kew wrote: > > On 25/03/2021 11:37, Yannis Haralambous wrote: > > OK, this may be a bug, but it doesn't explain why in > > polyglossia+fontspec I get no hyphenation at all. I should get at least > > one hyphenation in each word, no? > > My guess is that

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
On 25/03/2021 11:37, Yannis Haralambous wrote: OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I get no hyphenation at all. I should get at least one hyphenation in each word, no? My guess is that polyglossia (or something else in latex?) thinks it knows best how to

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I get no hyphenation at all. I should get at least one hyphenation in each word, no? > On 25 March 2021 at 12:24, Jonathan Kew wrote: > > The attached sample (to be run with "xetex -ini -etex") seems to show that >

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
On 25/03/2021 11:24, Yannis Haralambous wrote: Well if the text width is 0pt it will only break once, no? Normally, I'd expect a break at every opportunity if the width is 0pt. In plain (xe)tex, \hsize 0pt \noindent \hskip 0pt supercalifragilisticexpialidocious \par gives me essentially

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
The attached sample (to be run with "xetex -ini -etex") seems to show that it's possible to get this working, at least with the (old-ish) xetex version I have on hand. It does appear to only want to use the first available hyphenation position in the Arabic "words"; offhand I'm not sure why

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Well, if the text width is 0pt it will only break once, no? I begin to believe that in plain TeX 200d is treated just like any other character, so the problem lies not in XeTeX but in polyglossia+fontspec: somewhere there is some special treatment of 200d, so that my LaTeX code doesn't

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
>> So what am I missing in my code, Jonathan? I have added a \hyphenchar, >> removed the pointless (and dysfunctional) \showhyphens, but still get no >> hyphenation: >> >> % !TeX Program=Ini-XeTeX >> >> \catcode `\ = 10 >> \catcode `\ = 10 >> \catcode `\{ = 1 >> \catcode `\} = 2 >>

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
On 25/03/2021 10:52, Philip Taylor wrote: Jonathan Kew wrote: Indeed, that \showhyphens will not work with native opentype fonts. This is a known difference between using TFM-based vs OT fonts; it's been discussed (and an alternative shown) on the list somewhere in the distant past. There

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Maybe the fact that it is the first word and hence is not preceded by a glue? > On 25 March 2021 at 12:10, Philip Taylor wrote: > > Jonathan Kew wrote: > >> The output you're getting shows that the hyphenation is in fact happening as >> expected. (To get a hyphen rather than a box, you

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Jonathan Kew wrote: The output you're getting shows that the hyphenation is in fact happening as expected. (To get a hyphen rather than a box, you need to set the font's \hyphenchar appropriately.) So what am I missing in my code, Jonathan? I have added a \hyphenchar, removed the
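
Setting the font's \hyphenchar, as advised in the quoted remark, might look like the sketch below. The font file name is taken from other messages in the thread; the use of U+002D as the hyphen is an assumption. Under IniTeX, \defaulthyphenchar starts at 0, so a font loaded without further setup gets \hyphenchar 0, which may be what produces the box and the "Missing character" reports seen elsewhere in the thread.

```tex
% Load the font, then tell TeX which character to insert at hyphenation breaks.
\font\tenrm="[./Amiri-Regular.ttf]"
\hyphenchar\tenrm=`\-   % U+002D HYPHEN-MINUS; plain.tex arranges this via \defaulthyphenchar
\tenrm
```

The assignment is per font, so it has to be repeated for every font in which hyphenated text is set.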

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Jonathan Kew wrote: Indeed, that \showhyphens will not work with native opentype fonts. This is a known difference between using TFM-based vs OT fonts; it's been discussed (and an alternative shown) on the list somewhere in the distant past. There is code, by Enrico Gregorio, for a

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Here is my code: \documentclass{article} \usepackage{polyglossia,fontspec} \setdefaultlanguage{arabic} \newfontfamily{\arabicfont}[Script=Arabic,Extension=.ttf,Scale=1.2]{Amiri-Regular} \textwidth1cm \begin{document} \large \lefthyphenmin1 \righthyphenmin1

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
On 25/03/2021 10:22, Yannis Haralambous wrote: When I run the same file with Amiri: % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\         = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\^ = 7 \font\tenrm="Amiri-Regular.ttf" \tenrm \lccode "200D = "200D \patterns {200d1200d}

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Exactly the same as I do. But where does the U+ character come from? And why isn't blabla hyphenated? > On 25 March 2021 at 11:38, Philip Taylor wrote: > >> \hbox(11.24+6.33998)x0.0, glue set - 1.0 [] >> >> Missing character: There is no >> (U+) in font [./Amiri-Regular! >> >>

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Yannis Haralambous wrote: No, I have the font in the same directory Still doesn't work for me after copying the four ttfs there — instead, if I don't want to use system font syntax, I have to use — \font \tenrm = [./Amiri-Regular.ttf] Then I get the following reports — This is XeTeX,

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Yannis Haralambous wrote: No, I have the font in the same directory OK, I installed it as a system font.  I will try copying the .ttfs to the source directory.

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
No, I have the font in the same directory > On 25 March 2021 at 11:30, Philip Taylor wrote: > > Yannis Haralambous wrote: >> When I run the same file with Amiri: >> >> % !TeX Program=Ini-XeTeX >> >> \catcode `\ = 10 >> \catcode `\ = 10 >> \catcode `\{ = 1 >> \catcode `\} = 2 >>

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Yannis Haralambous wrote: When I run the same file with Amiri: % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\         = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\^ = 7 \font\tenrm="Amiri-Regular.ttf" Does your hacked version of my code not generate this error for you, Yanni ?

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I get a missing character error for although I never request that char Overfull \hbox (11.96289pt too wide) in paragraph at lines 34--34 [] \tenrm #1 \hbox(11.23047+6.34764)x0.0, glue set - 1.0 [] Missing character: There is no (U+) in font [Amiri-Regular.ttf] ! Overfull \hbox

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Could you add the following two lines to the start of your file and check for "Missing character" messages? \tracingonline 1 \tracinglostchars 2 On 3/25/21 11:22 AM, Yannis Haralambous wrote: > When I run the same file with Amiri: > > % !TeX Program=Ini-XeTeX > > \catcode `\ = 10 > \catcode

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
When I run the same file with Amiri: % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\ = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\^ = 7 \font\tenrm="Amiri-Regular.ttf" \tenrm \lccode "200D = "200D \patterns {200d1200d} \def \showhyphens {

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Philip Taylor wrote: And I see no significant font using Amiri — "no significant /*difference */...".

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
And I see no significant font using Amiri — % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\ = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\^ = 7 \font \tenrm = "Amiri" \tenrm \lccode "200D = "200D \patterns {200d1200d} \def \showhyphens     {    

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Not entirely clear why this generates two pages of output, but it does seem to demonstrate the problem — % !TeX Program=Ini-XeTeX \catcode `\ = 10 \catcode `\ = 10 \catcode `\{ = 1 \catcode `\} = 2 \catcode `\^ = 7 \font \tenrm = cmr10 \tenrm \lccode "200D = "200D \patterns

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Arthur Rosendahl
On Thu, Mar 25, 2021 at 11:02:37AM +0100, Yannis Haralambous wrote: > (I would have prepared a minimal plain TeX example with Amiri but I > know neither how to load an OpenType font \font\myfont="Font Name" or \font\myfont="[./path/to/font.ttf]" Note the syntax: double
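
The two loading forms given in this reply could be tried in a small plain XeTeX file such as the sketch below. It assumes that a font named "Amiri" is installed as a system font and that Amiri-Regular.ttf sits in the working directory; the control sequence names \byname and \byfile and the 12pt size are purely illustrative.

```tex
% Plain XeTeX (run with: xetex file.tex)
\font\byname="Amiri" at 12pt                   % by font name, via the system font lookup
\font\byfile="[./Amiri-Regular.ttf]" at 12pt   % by file name, in square brackets
\byfile Hello
\bye
```

If only one of the two assumptions holds, the other \font line can simply be removed.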

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Hi Bruno, when I test it with the Amiri font I get the same result, and Amiri contains a glyph for the 200d character. (I would have prepared a minimal plain TeX example with Amiri but I know neither how to load an OpenType font in plain TeX nor how to change writing direction) > Le

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Hello Yannis, On 3/25/21 10:41 AM, Yannis Haralambous wrote: > This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021) (preloaded > format=plain) >  restricted \write18 enabled. > (./test.tex > Underfull \hbox (badness 1) in paragraph at lines 6--6 > *[] \tenrm blabla* > [1] ) > (see

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Thanks for your reply Jonathan, whether with catcode 12 or 11, it still doesn't work. Here is a minimal example: \lccode"200D="200D \patterns{ 200d1200d } and then a file \lccode"200D="200D \catcode"200D=11 bla200d200dbla \showhyphens{bla200d200dbla} \end The log file

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
On 25/03/2021 09:27, Yannis Haralambous wrote: I noticed that the problem does not come from mapping. When I write 0643200d200d0643 in the text (with \lefthyphenmin=1) and \lccode"200D="200D and when I load a pattern 200d1200d I get no hyphenation at all, as if the