Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-26 Thread Philip Taylor

Yannis Haralambous wrote:


In fact, Arabic is not hyphenated.


That is presumably because of the existence of the /kashida/, Yanni.  
What is interesting is that the W3C notes that the Arabic /script/ (as 
opposed to the /language/) may be hyphenated, and offers Uyghur as an 
example:


When shaping scripts such as Arabic are allowed to break within words 
due to hyphenation, the characters must still be shaped as if the word 
were not broken (see § 5.6 Shaping Across Intra-word Breaks).


For example, if the Uyghur word “داميدى” were hyphenated, it would 
appear as [isolated DAL + isolated ALEF + initial MEEM + medial YEH + 
hyphen + line-break + final DAL + isolated ALEF MAKSURA] not as 
[isolated DAL + isolated ALEF + initial MEEM + final YEH + hyphen + 
line-break + isolated DAL + isolated ALEF MAKSURA]




--
/** Phil./



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
In fact, Arabic is not hyphenated.

> On 25 March 2021, at 21:14, Philip Taylor wrote:
> 
> Not being an Arabist, I have no idea whether the output from the following is 
> correct or not, but I thought that it would be fun to run Bruno's code on 
> some 'real' Arabic rather than just on  sequences —
> 
> % !TeX Program = Ini-XeTeX
> 
> \let \dump = \relax
> \input xelatex.ini
> 
> \begingroup
>  \language = 8 % how to do this better?
>  \catcode "200D = 11
>  \lccode "200D = "200D
>  \patterns
>  {
>   064310643
>   200d1200d
>  }
> \endgroup
> 
> \documentclass {article}
> \usepackage {polyglossia, fontspec}
> \setdefaultlanguage {arabic}
> \newfontfamily {\arabicfont} [Script=Arabic, Scale=1.2] {Amiri-Regular}
> \textwidth = 1sp
> 
> \begin {document}
> \catcode "200D = 11 %JOINER
> \lccode "200D = "200D
> \lefthyphenmin = 1
> \righthyphenmin = 1
> 
> \language = 8
> \arabicfont
> 
>  ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
> ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
> ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
> ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
> ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
> ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت
> \end {document}

 Yannis HARALAMBOUS
Professor
Computer Science Department
UMR CNRS 6285 Lab-STICC
  
 
Technopôle
 Brest-Iroise CS 83818
29238 Brest Cedex 3, France
Une école de l'IMT 
Cette idée était si dramatique que, comme tous les grands drames,
elle a été réalisée depuis par le sort (Jean Giraudoux)





Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Not being an Arabist, I have no idea whether the output from the 
following is correct or not, but I thought that it would be fun to run 
Bruno's code on some 'real' Arabic rather than just on  sequences —


% !TeX Program = Ini-XeTeX

\let \dump = \relax
\input xelatex.ini

\begingroup
     \language = 8 % how to do this better?
     \catcode "200D = 11
     \lccode "200D = "200D
     \patterns
     {
          064310643
          200d1200d
     }
\endgroup

\documentclass {article}
\usepackage {polyglossia, fontspec}
\setdefaultlanguage {arabic}
\newfontfamily {\arabicfont} [Script=Arabic, Scale=1.2] {Amiri-Regular}
\textwidth = 1sp

\begin {document}
\catcode "200D = 11 %JOINER
\lccode "200D = "200D
\lefthyphenmin = 1
\righthyphenmin = 1

\language = 8
\arabicfont

 ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d 
ساڵیﻧﻮێتﭘﻴرۆزبێت200d200d ساڵیﻧﻮێتﭘﻴرۆزبێت

\end {document}


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
What is the reason for the \makeatletter at line~29, Bruno ?  It appears 
to behave identically without it.

--
/** Phil./


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
For those who want a complete working example based on Yannis' code, here is
one, to be compiled with "xetex -ini -etex test.tex", in which the Arabic
"word" is hyphenated at every letter.

Thank you Yannis and others.

Bruno


On 3/25/21 6:50 PM, Yannis Haralambous wrote:
> Silly of me, when adding the \lccode information also in the TeX file… it
> works. And not only once in a word, but for all hyphenation points. I was
> convinced I had done so, but apparently I hadn't. Anyway, now everything
> suddenly seems to work. Totally weird…
> 
\let\dump\relax
\input xelatex.ini

\begingroup
  \language=8 % how to do this better?
  \catcode"200D=11
  \lccode"200D="200D
  \patterns{
  064310643
  200d1200d
  }
\endgroup

\documentclass{article}
\usepackage{polyglossia,fontspec}
\setdefaultlanguage{arabic}
\newfontfamily{\arabicfont}[Script=Arabic,Extension=.ttf,Scale=1.2]{Amiri-Regular}
\textwidth1mm
\begin{document}

\catcode"200D=11 %JOINER
\lccode"200D="200D
\lefthyphenmin1
\righthyphenmin1

\makeatletter\language8
\arabicfont

0643200d200d0643200d200d0643200d200d0643200d200d0643
0643200d200d0643200d200d0643200d200d0643200d200d0643
06430643064306430643
06430643064306430643

blabla
blabla
bla200d200dbla
bla200d200dbla

\end{document}


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Silly of me, when adding the \lccode information also in the TeX file… it works.
And not only once in a word, but for all hyphenation points. I was persuaded
having done so, but apparently I didn't. Anyway, now everything suddenly seems
to work. Totally weird…

\documentclass{article}
\usepackage{polyglossia,fontspec}
\setdefaultlanguage{arabic}
\newfontfamily{\arabicfont}[Script=Arabic,Extension=.ttf,Scale=1.2]{Amiri-Regular}
\textwidth1mm
\begin{document}

\catcode"200D=11 %JOINER
\lccode"200D="200D
\lefthyphenmin1
\righthyphenmin1

\makeatletter\language8
\arabicfont

0643200d200d0643200d200d0643200d200d0643200d200d0643
0643200d200d0643200d200d0643200d200d0643200d200d0643
06430643064306430643
06430643064306430643

blabla
blabla
bla200d200dbla
bla200d200dbla

\the\catcode"200D

\end{document}

with patterns

\patterns{
064310643
200d1200d
}

gives the attached PDF

test-idea.pdf
Description: Adobe PDF document



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I posted a question on tex.stackexchange.com, let us see if somebody can give 
us more input:

https://tex.stackexchange.com/questions/588952/zwj-not-working-in-hyphenation-patterns-in-xelatex

> On 25 March 2021, at 17:30, Bruno Le Floch wrote:
> 
> Check \the\catcode"200D perhaps, it does not seem to be set in your example
> document.  Note that the LaTeX format might reset that catcode.
> 
> The other mystery is why Arabic words seem to only be hyphenated once.
> 
> On 3/25/21 5:17 PM, Yannis Haralambous wrote:
>> Well it is neither polyglossia nor fontspec, because I have run the 
>> following file:
>> 
>> \documentclass{article}
>> \textwidth1mm
>> \begin{document}
>> \font\arabicfont="[./Amiri-Regular.ttf]"
>> 
>> \lefthyphenmin1
>> \righthyphenmin1
>> 
>> \makeatletter\language\l@arabic
>> \arabicfont
>> 
>> 0643200d200d0643200d200d0643200d200d0643200d200d0643
>> 0643200d200d0643200d200d0643200d200d0643200d200d0643
>> 06430643064306430643
>> 06430643064306430643
>> 
>> \end{document}
>> 
>> and still get that odd behavior: between 0643 I get flawless hyphenation 
>> but
>> not between 200d
>> 
>> My patterns are \patterns{0643200d1200d0643
>> 064310643
>> 200d1200d
>> }
>> 
>> I'm attaching my xelatex.log file, do you see any file that can have affected
>> the behavior of 200d?
>> 
>> 
>> 
>>> On 25 March 2021, at 12:55, Jonathan Kew wrote:
>>> 
>>> On 25/03/2021 11:37, Yannis Haralambous wrote:
>>>> OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I
>>>> get no hyphenation at all. I should get at least one hyphenation in each
>>>> word, no?
>>> 
>>> My guess is that polyglossia (or something else in latex?) thinks it knows
>>> best how to handle U+200D and is getting in your way.
>> 





Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Even when I reset \catcode to 11, the behavior is the same.
Only Arabic words containing 200d are not hyphenated, others are hyphenated as expected.

test-idea.pdf
Description: Adobe PDF document


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I checked it, \catcode"200D is 12

> On 25 March 2021, at 17:30, Bruno Le Floch wrote:
> 
> Check \the\catcode"200D perhaps, it does not seem to be set in your example
> document.  Note that the LaTeX format might reset that catcode.
> 
> The other mystery is why Arabic words seem to only be hyphenated once.
> 
> On 3/25/21 5:17 PM, Yannis Haralambous wrote:
>> Well it is neither polyglossia nor fontspec, because I have run the 
>> following file:
>> 
>> \documentclass{article}
>> \textwidth1mm
>> \begin{document}
>> \font\arabicfont="[./Amiri-Regular.ttf]"
>> 
>> \lefthyphenmin1
>> \righthyphenmin1
>> 
>> \makeatletter\language\l@arabic
>> \arabicfont
>> 
>> 0643200d200d0643200d200d0643200d200d0643200d200d0643
>> 0643200d200d0643200d200d0643200d200d0643200d200d0643
>> 06430643064306430643
>> 06430643064306430643
>> 
>> \end{document}
>> 
>> and still get that odd behavior: between 0643 I get flawless hyphenation 
>> but
>> not between 200d
>> 
>> My patterns are \patterns{0643200d1200d0643
>> 064310643
>> 200d1200d
>> }
>> 
>> I'm attaching my xelatex.log file, do you see any file that can have affected
>> the behavior of 200d?
>> 
>> 
>> 
>>> On 25 March 2021, at 12:55, Jonathan Kew wrote:
>>> 
>>> On 25/03/2021 11:37, Yannis Haralambous wrote:
>>>> OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I
>>>> get no hyphenation at all. I should get at least one hyphenation in each
>>>> word, no?
>>> 
>>> My guess is that polyglossia (or something else in latex?) thinks it knows
>>> best how to handle U+200D and is getting in your way.
>> 





Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Check \the\catcode"200D perhaps, it does not seem to be set in your example
document.  Note that the LaTeX format might reset that catcode.
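
A minimal sketch of such a check, assuming it is placed just after
\begin{document} in the test file (the wording of the message is only
illustrative). It writes both codes to the terminal and to the log:

\message{catcode of U+200D: \the\catcode"200D;
         lccode of U+200D: \the\lccode"200D}

TeX treats a character as a letter for hyphenation only if its \lccode is
non-zero, so a zero in the second value would on its own be enough to
suppress break points next to U+200D.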

The other mystery is why Arabic words seem to only be hyphenated once.

On 3/25/21 5:17 PM, Yannis Haralambous wrote:
> Well it is neither polyglossia nor fontspec, because I have run the following 
> file:
> 
> \documentclass{article}
> \textwidth1mm
> \begin{document}
> \font\arabicfont="[./Amiri-Regular.ttf]"
> 
> \lefthyphenmin1
> \righthyphenmin1
> 
> \makeatletter\language\l@arabic
> \arabicfont
> 
> 0643200d200d0643200d200d0643200d200d0643200d200d0643
> 0643200d200d0643200d200d0643200d200d0643200d200d0643
> 06430643064306430643
> 06430643064306430643
> 
> \end{document}
> 
> and still get that odd behavior: between 0643 I get flawless hyphenation 
> but
> not between 200d
> 
> My patterns are \patterns{0643200d1200d0643
> 064310643
> 200d1200d
> }
> 
> I'm attaching my xelatex.log file, do you see any file that can have affected
> the behavior of 200d?
> 
> 
> 
>> On 25 March 2021, at 12:55, Jonathan Kew wrote:
>>
>> On 25/03/2021 11:37, Yannis Haralambous wrote:
>>> OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I
>>> get no hyphenation at all. I should get at least one hyphenation in each
>>> word, no?
>>
>> My guess is that polyglossia (or something else in latex?) thinks it knows
>> best how to handle U+200D and is getting in your way.
> 



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Well it is neither polyglossia nor fontspec, because I have run the following file:

\documentclass{article}
\textwidth1mm
\begin{document}
\font\arabicfont="[./Amiri-Regular.ttf]"

\lefthyphenmin1
\righthyphenmin1

\makeatletter\language\l@arabic
\arabicfont

0643200d200d0643200d200d0643200d200d0643200d200d0643
0643200d200d0643200d200d0643200d200d0643200d200d0643
06430643064306430643
06430643064306430643

\end{document}

and still get that odd behavior: between 0643 I get flawless hyphenation but
not between 200d

My patterns are \patterns{0643200d1200d0643
064310643
200d1200d
}

I'm attaching my xelatex.log file, do you see any file that can have affected
the behavior of 200d?

xelatex.log
Description: Binary data
> On 25 March 2021, at 12:55, Jonathan Kew wrote:
> 
> On 25/03/2021 11:37, Yannis Haralambous wrote:
>> OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I
>> get no hyphenation at all. I should get at least one hyphenation in each
>> word, no?
> 
> My guess is that polyglossia (or something else in latex?) thinks it knows
> best how to handle U+200D and is getting in your way.


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

And now, thanks to Jonathan, only one page —

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\# = 6
\catcode `\^ = 7
\catcode `\^^I = 10
\catcode `\% = 14

\let \bgroup = {
\let \egroup = }
\let \endgraf = \par

\tracingonline = 1
\tracinglostchars = 2

\dimen 10 = 16383.8 pt

\input showhyphens
\message {\meaning \showhyphens}

\font \tenrm = "Amiri"
\tenrm
\hyphenchar \font = `\‐

\lccode "200D = "200D
\catcode "200D = 11
\patterns {200d1200d}

\def \text {foo200d200dbar}
\hsize = 0 pt
\vsize = \dimen 10
\noindent \ \text
\showhyphens \text
\end
--
/** Phil./
\def \loop #1\pool {\def \body {#1}\iterate}
\def \iterate {\body \let \next = \iterate \else \let \next = \relax \fi \next}
\let \pool = \fi

\def \showhyphenspace #1 
{%
\ifx \valign #1\valign
\else
#1\vadjust {\penalty 1 } \expandafter \showhyphenspace
\fi
}

\def \showhyphens #1%
{%
\begingroup
\showboxbreadth =\dimen 10 
\showboxdepth =\dimen 10 
\pretolerance = -1 
\tolerance = -1 
\setbox 2 = \hbox {}%
\setbox 0 = \vbox
\bgroup
\pretolerance = -1 
\tolerance = -1 
\hbadness = 0
\parfillskip = 0 pt 
\hsize = 1 sp
\noindent
\hskip 0 pt
\hfuzz =\dimen 10 
\hbadness =\dimen 10 
\showhyphenspace #1 {} \endgraf
\loop
\count 0 = 1 
\ifnum \lastnodetype = 1 
\setbox 4 = \lastbox
\setbox 2 = \hbox {\unhbox 4 
\unhbox 2}%
\count 0 = 0 
\fi
\ifnum \lastnodetype = 11 \unskip 
\count 0 = 0 \fi
\ifnum \lastnodetype = 13
 \count 2 = \lastpenalty
 \unpenalty
 \count 0 = 0 
 \ifnum \count 2 = 1 \setbox 2 
= \hbox { \unhbox 2 }\count 0 = 0 \fi
\fi
\ifnum \count 0 = 0
\pool
\hsize =\dimen 10 
\hfuzz = 0 pt
\hbadness 0
\par
\unhbox 2
\par
\egroup
\endgroup
}

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Jonathan Kew wrote:

> I don't see what \dimen 10 is doing in any of this. It never seems to 
> get set, so I assume it remains zero.


Ah, I took this from plain.tex's definition of \maxdimen, but it is 
possible (nay, probable, if not absolutely certain) that in the absence 
of plain.tex being \input, \dimen 10 will be zero as you say.  Thank 
you, Jonathan !
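
For reference, a sketch of what plain.tex would have supplied, written for
INITEX where \newdimen is not available. The register number 10 simply
follows the code in this thread, and 16383.99999pt is the value that
plain.tex assigns to \maxdimen:

\dimen 10 = 16383.99999pt % largest legal dimension, as in plain.tex
\message {\the \dimen 10} % report the value actually stored

With the register set before \vsize = \dimen 10 is executed, the page
height is no longer zero.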

--
/** Phil./


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
I don't see what \dimen 10 is doing in any of this. It never seems to 
get set, so I assume it remains zero.


If so, you have \vsize zero, so minimal pages must be expected.

On 25/03/2021 12:28, Philip Taylor wrote:

Jonathan Kew wrote:


See https://tug.org/pipermail/xetex/2014-January/025129.html


Thank you, Jonathan.  Now formatted and included in the next iteration 
(code below, with "showhyphens.tex" attached).  I /believe/ that 
"showhyphens.tex" is functionally equivalent to David's code, but as he 
appears to have had a broken space bar and an intermittent "=" key, I 
cannot be sure that they are functionally identical ... And I /still/ 
don't understand why the code generates two pages.


% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\# = 6
\catcode `\^ = 7
\catcode `\^^I = 10
\catcode `\% = 14

\let \bgroup = {
\let \egroup = }
\let \endgraf = \par

\tracingonline = 1
\tracinglostchars = 2

\input showhyphens
\message {\meaning \showhyphens}

\font \tenrm = "Amiri"
\tenrm
\hyphenchar \font = `\‐

\lccode "200D = "200D
\catcode "200D = 11
\patterns {200d1200d}

\def \text {foo200d200dbar}
\hsize = 0 pt
\vsize = \dimen 10
\noindent \ \text
\showhyphens \text
\end
--
/** Phil./





Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I checked the \catcode of 200d, it is 12.

As for \XeTeXcharclass and \XeTeXinterchartoks they are used in bidi.sty but 
not for 200d

I have found other pattern files where 200d is used, for example in Assamese:

\patterns{
% GENERAL RULE
% Do not break either side of ZERO-WIDTH JOINER  (U+200D)
2‍2
% Break on both sides of ZERO-WIDTH NON JOINER  (U+200C)
1‌1
% Break before or after any independent vowel.
অ1
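
For readers less used to Liang's pattern notation, a sketch of how the
digits are read: odd values allow a break at that position, even values
forbid one, and where patterns overlap the highest value wins. The
^^^^200d notation for U+200D is an assumption of this sketch (the Assamese
file contains the character itself), and \patterns is only accepted by
INITEX, for example under "xetex -ini":

\catcode "200D = 11
\lccode "200D = "200D % a non-zero \lccode makes U+200D a "letter"
\patterns {
  ^^^^200d1^^^^200d % 1 is odd: a break is allowed between two joiners
}

The Assamese rule 2<ZWJ>2 is the opposite case: the even digit 2 on each
side forbids a break next to the joiner.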


> On 25 March 2021, at 13:26, Zdenek Wagner wrote:
> 
> I do not see how polyglossia could do it, maybe it can change its
> \catcode and you could display the value. It might use it in
> \XeTeXcharclass and \XeTeXinterchartoks but I am not sure whether these
> can be displayed by \showthe






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Jonathan Kew wrote:


See https://tug.org/pipermail/xetex/2014-January/025129.html


Thank you, Jonathan.  Now formatted and included in the next iteration 
(code below, with "showhyphens.tex" attached).  I /believe/ that 
"showhyphens.tex" is functionally equivalent to David's code, but as he 
appears to have had a broken space bar and an intermittent "=" key, I 
cannot be sure that they are functionally identical ... And I /still/ 
don't understand why the code generates two pages.


% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\# = 6
\catcode `\^ = 7
\catcode `\^^I = 10
\catcode `\% = 14

\let \bgroup = {
\let \egroup = }
\let \endgraf = \par

\tracingonline = 1
\tracinglostchars = 2

\input showhyphens
\message {\meaning \showhyphens}

\font \tenrm = "Amiri"
\tenrm
\hyphenchar \font = `\‐

\lccode "200D = "200D
\catcode "200D = 11
\patterns {200d1200d}

\def \text {foo200d200dbar}
\hsize = 0 pt
\vsize = \dimen 10
\noindent \ \text
\showhyphens \text
\end
--
/** Phil./

\def \loop #1\pool {\def \body {#1}\iterate}
\def \iterate {\body \let \next = \iterate \else \let \next = \relax \fi \next}
\let \pool = \fi

\def \showhyphenspace #1 
{%
\ifx \valign #1\valign
\else
#1\vadjust {\penalty 1 } \expandafter \showhyphenspace
\fi
}

\def \showhyphens #1%
{%
\begingroup
\showboxbreadth =\dimen 10 
\showboxdepth =\dimen 10 
\pretolerance = -1 
\tolerance = -1 
\setbox 2 = \hbox {}%
\setbox 0 = \vbox
\bgroup
\pretolerance = -1 
\tolerance = -1 
\hbadness = 0
\parfillskip = 0 pt 
\hsize = 1 sp
\noindent
\hskip 0 pt
\hfuzz =\dimen 10 
\hbadness =\dimen 10 
\showhyphenspace #1 {} \endgraf
\loop
\count 0 = 1 
\ifnum \lastnodetype = 1 
\setbox 4 = \lastbox
\setbox 2 = \hbox {\unhbox 4 
\unhbox 2}%
\count 0 = 0 
\fi
\ifnum \lastnodetype = 11 \unskip 
\count 0 = 0 \fi
\ifnum \lastnodetype = 13
 \count 2 = \lastpenalty
 \unpenalty
 \count 0 = 0 
 \ifnum \count 2 = 1 \setbox 2 
= \hbox { \unhbox 2 }\count 0 = 0 \fi
\fi
\ifnum \count 0 = 0
\pool
\hsize =\dimen 10 
\hfuzz = 0 pt
\hbadness 0
\par
\unhbox 2
\par
\egroup
\endgroup
}

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Zdenek Wagner
On Thu, 25 March 2021 at 12:55, Jonathan Kew wrote:
>
> On 25/03/2021 11:37, Yannis Haralambous wrote:
> > OK, this may be a bug, but it doesn't explain why in
> > polyglossia+fontspec I get no hyphenation at all. I should get at least
> > one hyphenation in each word, no?
>
> My guess is that polyglossia (or something else in latex?) thinks it
> knows best how to handle U+200D and is getting in your way.
>
I do not see how polyglossia could do it, maybe it can change its
\catcode and you could display the value. It might use it in
\XeTeXcharclass and \XeTeXinterchartoks but I am not sure whether these
can be displayed by \showthe
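
A sketch of one way to inspect part of this, assuming the XeTeX primitives
\XeTeXcharclass and \XeTeXinterchartokenstate, both of which are integer
quantities that \the can read. It does not attempt to print the
\XeTeXinterchartoks token lists themselves:

\message{charclass of U+200D: \the\XeTeXcharclass"200D;
         interchar token state: \the\XeTeXinterchartokenstate}

If the state is reported as 0 at the point where the test text is typeset,
the inter-character token mechanism is switched off there and is unlikely
to be what interferes.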


> >
> >> On 25 March 2021, at 12:24, Jonathan Kew wrote:
> >>
> >> The attached sample (to be run with "xetex -ini -etex") seems to show
> >> that it's possible to get this working, at least with the (old-ish)
> >> xetex version I have on hand.
> >>
> >> It does appear to only want to use the first available hyphenation
> >> position in the Arabic "words"; offhand I'm not sure why this is.
> >> Could well be some kind of bug.
> >>

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew

On 25/03/2021 11:37, Yannis Haralambous wrote:
> OK, this may be a bug, but it doesn't explain why in 
> polyglossia+fontspec I get no hyphenation at all. I should get at least 
> one hyphenation in each word, no?


My guess is that polyglossia (or something else in latex?) thinks it 
knows best how to handle U+200D and is getting in your way.




>> On 25 March 2021, at 12:24, Jonathan Kew wrote:
>>
>> The attached sample (to be run with "xetex -ini -etex") seems to show 
>> that it's possible to get this working, at least with the (old-ish) 
>> xetex version I have on hand.
>>
>> It does appear to only want to use the first available hyphenation 
>> position in the Arabic "words"; offhand I'm not sure why this is. 
>> Could well be some kind of bug.










Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I get 
no hyphenation at all. I should get at least one hyphenation in each word, no?

> On 25 March 2021, at 12:24, Jonathan Kew wrote:
> 
> The attached sample (to be run with "xetex -ini -etex") seems to show that 
> it's possible to get this working, at least with the (old-ish) xetex version 
> I have on hand.
> 
> It does appear to only want to use the first available hyphenation position 
> in the Arabic "words"; offhand I'm not sure why this is. Could well be some 
> kind of bug.
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew

On 25/03/2021 11:24, Yannis Haralambous wrote:
> Well if the text width is 0pt it will only break once, no?


Normally, I'd expect a break at every opportunity if the width is 0pt.

In plain (xe)tex,

  \hsize 0pt
  \noindent \hskip 0pt supercalifragilisticexpialidocious \par

gives me essentially one syllable per line.

But in the Arabic-script Amiri example this doesn't work for me. I 
suspect this is a bug in xetex.




> I begin to believe that in plain TeX 200d is treated just like any other
> character, so the problem lies not in XeTeX; rather, somewhere in
> polyglossia+fontspec there is some special treatment of 200d, so that my
> LaTeX code doesn't work.
>
> I have opened an issue on the polyglossia github…
>
>> On 25 March 2021, at 12:18, Bruno Le Floch wrote:
>>
>> Pushing some more in this direction, my code below (with Amiri-Regular.ttf
>> being in the same directory) hyphenates the second and third "blabla"
>> (consistent with Yannis' reminder that the first word is not hyphenated)
>> but only hyphenates in a single place (the first one) in the sequence of
>> xyzt afterwards.








Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew
The attached sample (to be run with "xetex -ini -etex") seems to show 
that it's possible to get this working, at least with the (old-ish) 
xetex version I have on hand.


It does appear to only want to use the first available hyphenation 
position in the Arabic "words"; offhand I'm not sure why this is. Could 
well be some kind of bug.


On 25/03/2021 11:12, Yannis Haralambous wrote:
Maybe the fact that it is the first word and hence is not preceded by a 
glue?


On 25 March 2021, at 12:10, Philip Taylor wrote:


Jonathan Kew wrote:

The output you're getting shows that the hyphenation is in fact 
happening as expected. (To get a hyphen rather than a box, you need 
to set the font's \hyphenchar appropriately.)


So what am I missing in my code, Jonathan ?  I have added a 
\hyphenchar, removed the pointless (and dysfunctional) \showhyphens, 
but still get no hyphenation:


% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7

\tracingonline = 1
\tracinglostchars = 2

\font \tenrm = "Amiri"
\tenrm
\hyphenchar \font = `\‐

\lccode "200D = "200D
\catcode "200D = 11
\patterns {200d1200d}

\hsize = 0 pt
bla200d200dbla

\end

->


This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 
2021/W32TeX) (INITEX)

restricted \write18 enabled.
entering extended mode
(./untitled-4.tex
Overfull \hbox (23.1pt too wide) in paragraph at lines 21--22
[]\tenrm bla‍‍bla

\hbox(11.24+6.33998)x0.0 []

[0] )
Output written on untitled-4.pdf (1 page).
SyncTeX written on untitled-4.synctex.gz.

Transcript written on untitled-4.log.


--
/** Phil./










amiri-hyph.tex
Description: TeX document


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Well if the text width is 0pt it will only break once, no?

I am beginning to believe that in plain TeX 200d is treated just like any
other character, so the problem lies not in XeTeX but in polyglossia+fontspec:
somewhere in there is some special treatment of 200d that stops my LaTeX code
from working.

I have opened an issue on the polyglossia github…

> On 25 March 2021 at 12:18, Bruno Le Floch wrote:
> 
> Pushing some more in this direction, my code below (with Amiri-Regular.ttf
> being in the same directory) hyphenates the second and third "blabla"
> (consistent with Yannis' reminder that the first word is not hyphenated) but
> only hyphenates in a single place (the first one) in the sequence of xyzt
> afterwards.






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch


>> So what am I missing in my code, Jonathan?  I have added a \hyphenchar and
>> removed the pointless (and dysfunctional) \showhyphens, but still get no
>> hyphenation:
>>
>> % !TeX Program=Ini-XeTeX
>>
>> \catcode `\ = 10
>> \catcode `\ = 10
>> \catcode `\{ = 1
>> \catcode `\} = 2
>> \catcode `\^ = 7
>>
>> \tracingonline = 1
>> \tracinglostchars = 2
>>
>> \font \tenrm = "Amiri"
>> \tenrm
>> \hyphenchar \font = `\‐
>>
>> \lccode "200D = "200D
>> \catcode "200D = 11
>> \patterns {^^^^200d1^^^^200d}
>>
>> \hsize = 0 pt
>> bla^^^^200d^^^^200dbla
>>
>> \end

Pushing some more in this direction, my code below (with Amiri-Regular.ttf being
in the same directory) hyphenates the second and third "blabla" (consistent with
Yannis' reminder that the first word is not hyphenated) but only hyphenates in a
single place (the first one) in the sequence of xyzt afterwards.

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\catcode"200D=11
\lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}
\font \tenrm = [./Amiri-Regular.ttf]
\tenrm
\hyphenchar\tenrm=`-
\tracingonline 1
\tracinglostchars 2
\vsize=23cm

bla^^^^200d^^^^200dbla
bla^^^^200d^^^^200dbla
bla^^^^200d^^^^200dbla

\lefthyphenmin1
\righthyphenmin1

X
^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a%
^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a^^^^0643^^^^200d^^^^200d^^^^0643^^^^062a%
X

\end



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew

On 25/03/2021 10:52, Philip Taylor wrote:

Jonathan Kew wrote:

Indeed, that \showhyphens will not work with native opentype fonts. 
This is a known difference between using TFM-based vs OT fonts; it's 
been discussed (and an alternative shown) on the list somewhere in the 
distant past.


There is code, by Enrico Gregorio, for a replacement \showhyphens buried 
deep in the file "xltxtra.dtx" (attached).  However, the code is 
completely incomprehensible to me, I have no idea how to convert a .dtx 
file into anything usable, and I cannot find a plain XeTeX equivalent.

--
/Philip Taylor/


See https://tug.org/pipermail/xetex/2014-January/025129.html



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Maybe the fact that it is the first word and hence is not preceded by a glue?
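
A minimal sketch of that point, assuming the IniTeX setup used elsewhere in
this thread (the font file, the pattern and the test word are taken from the
other messages; the file name and the \hskip 0pt workaround are assumptions,
the latter being the usual plain TeX idiom rather than anything proposed
here). TeX looks for hyphenatable words after glue, so the first word of a
paragraph, which follows the indentation box, is not considered unless some
glue precedes it. U+200D is written in the ^^^^200d notation.

```tex
% Run with: xetex -ini -etex sketch.tex   (file name is hypothetical)
\catcode `\{ = 1  \catcode `\} = 2  \catcode `\^ = 7
\catcode "200D = 11  \lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}
\font \tenrm = "[./Amiri-Regular.ttf]"  \tenrm
\hyphenchar \tenrm = `\-
\hsize = 0 pt

% First word of the paragraph: it follows the \parindent box, not glue,
% so it is not a hyphenation candidate.
bla^^^^200d^^^^200dbla

% Same word preceded by zero-width glue: it should now be a candidate.
\noindent \hskip 0pt bla^^^^200d^^^^200dbla

\end
```

If the explanation is right, only the second paragraph should break between
the two joiners.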

> On 25 March 2021 at 12:10, Philip Taylor wrote:
> 
> Jonathan Kew wrote:
> 
>> The output you're getting shows that the hyphenation is in fact happening as 
>> expected. (To get a hyphen rather than a box, you need to set the font's 
>> \hyphenchar appropriately.)
> 
> So what am I missing in my code, Jonathan?  I have added a \hyphenchar and 
> removed the pointless (and dysfunctional) \showhyphens, but still get no 
> hyphenation:
> 
> % !TeX Program=Ini-XeTeX
> 
> \catcode `\ = 10
> \catcode `\ = 10
> \catcode `\{ = 1
> \catcode `\} = 2
> \catcode `\^ = 7
> 
> \tracingonline = 1
> \tracinglostchars = 2
> 
> \font \tenrm = "Amiri"
> \tenrm
> \hyphenchar \font = `\‐
> 
> \lccode "200D = "200D
> \catcode "200D = 11
> \patterns {^^^^200d1^^^^200d}
> 
> \hsize = 0 pt
> bla^^^^200d^^^^200dbla
> 
> \end
> 
> ->
> 
> 
>> This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021/W32TeX) 
>> (INITEX)
>> restricted \write18 enabled.
>> entering extended mode
>> (./untitled-4.tex
>> Overfull \hbox (23.1pt too wide) in paragraph at lines 21--22
>> []\tenrm bla‍‍bla
>> 
>> \hbox(11.24+6.33998)x0.0 []
>> 
>> [0] )
>> Output written on untitled-4.pdf (1 page).
>> SyncTeX written on untitled-4.synctex.gz.
>> 
>> Transcript written on untitled-4.log.
>> 
> -- 
> ** Phil.
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Jonathan Kew wrote:

The output you're getting shows that the hyphenation is in fact 
happening as expected. (To get a hyphen rather than a box, you need to 
set the font's \hyphenchar appropriately.)


So what am I missing in my code, Jonathan?  I have added a \hyphenchar and 
removed the pointless (and dysfunctional) \showhyphens, but still get 
no hyphenation:


% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7

\tracingonline = 1
\tracinglostchars = 2

\font \tenrm = "Amiri"
\tenrm
\hyphenchar \font = `\‐

\lccode "200D = "200D
\catcode "200D = 11
\patterns {^^^^200d1^^^^200d}

\hsize = 0 pt
bla^^^^200d^^^^200dbla

\end

->


This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021/W32TeX) (INITEX)
restricted \write18 enabled.
entering extended mode
(./untitled-4.tex
Overfull \hbox (23.1pt too wide) in paragraph at lines 21--22
[]\tenrm bla‍‍bla

\hbox(11.24+6.33998)x0.0 []

[0] )
Output written on untitled-4.pdf (1 page).
SyncTeX written on untitled-4.synctex.gz.
Transcript written on untitled-4.log.



--
/** Phil./



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Jonathan Kew wrote:

Indeed, that \showhyphens will not work with native opentype fonts. 
This is a known difference between using TFM-based vs OT fonts; it's 
been discussed (and an alternative shown) on the list somewhere in the 
distant past.


There is code, by Enrico Gregorio, for a replacement \showhyphens buried 
deep in the file "xltxtra.dtx" (attached).  However, the code is 
completely incomprehensible to me, I have no idea how to convert a .dtx 
file into anything usable, and I cannot find a plain XeTeX equivalent.

--
/Philip Taylor/

Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Here is my code:

\documentclass{article}
\usepackage{polyglossia,fontspec}
\setdefaultlanguage{arabic}
\newfontfamily{\arabicfont}[Script=Arabic,Extension=.ttf,Scale=1.2]{Amiri-Regular}
\textwidth1cm
\begin{document}
\large
\lefthyphenmin1
\righthyphenmin1

0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a
0643200d200d0643062a0643200d200d0643062a0643200d200d0643062a

\end{document}

and the patterns I have loaded for Arabic are 

\catcode"200D=11
\lccode"200D="200D
\patterns{
^^^^200d1^^^^200d
}

I don't get any hyphenation at all. When I remove the 200d and try a 
pattern between Arabic characters it works just fine,
so it is not a problem inherent in Arabic script.







Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew

On 25/03/2021 10:22, Yannis Haralambous wrote:

When I run the same file with Amiri:

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\         = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"
\tenrm

\lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}

\def \showhyphens
         {
                 \setbox 0 = \vbox
                         {
                                 \parfillskip = 0 pt
                                 \hsize = \dimen 10
                                 \tenrm
                                 \pretolerance = -1
                                 \tolerance = -1
                                 \hbadness = 0
                                 \showboxdepth = 0 \ #1
                         }
         }

\lccode "200D = "200D
\catcode "200D = 11
bla^^^^200d^^^^200dbla
\showhyphens {bla^^^^200d^^^^200dbla}

\end

I get three pages, one with blabla, one with bla and a box, and another 
with bla.

Maybe \showhyphens is not correctly defined in the code above?


Indeed, that \showhyphens will not work with native opentype fonts. This 
is a known difference between using TFM-based vs OT fonts; it's been 
discussed (and an alternative shown) on the list somewhere in the 
distant past.


The output you're getting shows that the hyphenation is in fact 
happening as expected. (To get a hyphen rather than a box, you need to 
set the font's \hyphenchar appropriately.)
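
For illustration, a minimal sketch of that setting, assuming the font is
loaded as \tenrm as in the test file above and that it contains a glyph for
U+002D (both are assumptions). Under IniTeX \defaulthyphenchar is 0, so a
newly loaded font starts with \hyphenchar 0, which is presumably why a box
shows up at the break.

```tex
\font \tenrm = "[./Amiri-Regular.ttf]"
\hyphenchar \tenrm = `\-      % use U+002D as the hyphen for this font
% Alternatively, set the default before loading any font:
% \defaulthyphenchar = `\-
\tenrm
```

The per-font assignment only affects \tenrm; the commented-out alternative
would apply to every font loaded after it.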





On 25 March 2021 at 11:20, Philip Taylor wrote:


Philip Taylor wrote:


And I see no significant font using Amiri —


"no significant /*difference */...".








Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Exactly the same as I do. But where does the U+ character come from?

And why isn't blabla hyphenated?

> On 25 March 2021 at 11:38, Philip Taylor wrote:
> 
>> \hbox(11.24+6.33998)x0.0, glue set - 1.0 []
>> 
>> Missing character: There is no 
>> (U+) in font [./Amiri-Regular!
>> 
>> Overfull \hbox (23.1pt too wide) in paragraph at lines 34--36
>> []\tenrm bla‍‍bla
>> 
>> \hbox(11.24+6.33998)x0.0 []
>> 
>> 
>> Overfull \hbox (15.19pt too wide) in paragraph at lines 34--36
>> \tenrm bla‍
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Yannis Haralambous wrote:


No, I have the font in the same directory


Still doesn't work for me after copying the four ttfs there — instead, 
if I don't want to use system font syntax, I have to use —



\font \tenrm = [./Amiri-Regular.ttf]


Then I get the following reports —

This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021/W32TeX) (INITEX)
restricted \write18 enabled.
entering extended mode
(./untitled-3.tex
Overfull \hbox (11.94667pt too wide) in paragraph at lines 35--35
[] \tenrm #1

\hbox(11.24+6.33998)x0.0, glue set - 1.0 []

Missing character: There is no
(U+) in font [./Amiri-Regular!

Overfull \hbox (23.1pt too wide) in paragraph at lines 34--36
[]\tenrm bla‍‍bla

\hbox(11.24+6.33998)x0.0 []

Overfull \hbox (15.19pt too wide) in paragraph at lines 34--36
\tenrm bla‍

\hbox(11.24+6.33998)x0.0 []

Overfull \hbox (11.55pt too wide) in paragraph at lines 34--36
\tenrm ‍bla

\hbox(11.24+6.33998)x0.0 []

[0] [0] [0] )
Output written on untitled-3.pdf (3 pages).
SyncTeX written on untitled-3.synctex.gz.
Transcript written on untitled-3.log.







Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Yannis Haralambous wrote:


No, I have the font in the same directory


OK, I installed it as a system font.  I will try copying the .ttfs to 
the source directory.


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
No, I have the font in the same directory

> On 25 March 2021 at 11:30, Philip Taylor wrote:
> 
> Yannis Haralambous wrote:
>> When I run the same file with Amiri:
>> 
>> % !TeX Program=Ini-XeTeX
>> 
>> \catcode `\ = 10
>> \catcode `\ = 10
>> \catcode `\{ = 1
>> \catcode `\} = 2
>> \catcode `\^ = 7
>> \font\tenrm="Amiri-Regular.ttf"
> 
> Does your hacked version of my code not generate this error for you, Yanni ?
> 
>> This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021/W32TeX) 
>> (INITEX)
>> restricted \write18 enabled.
>> entering extended mode
>> (./untitled-3.texname = Amiri-Regular, rootname = Amiri-Regular, pointsize =
>> mktexmf: empty or non-existent rootfile!
>> 
>> kpathsea: Running mktexmf Amiri-Regular.mf
>> 
>> The command name is C:\TeX\Live\2021\bin\win32\mktexmf
>> Cannot find Amiri-Regular.mf.
>> 
>> ! Font \tenrm=Amiri-Regular.ttf not loadable: Metric (TFM) file or installed 
>> fo
>> nt not found.
>> 
>> \tenrm
>> l.9 \tenrm
>> ?
> -- 
> ** Phil.
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Yannis Haralambous wrote:

When I run the same file with Amiri:

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\         = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"


Does your hacked version of my code not generate this error for you, Yanni ?

This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021/W32TeX) (INITEX)
restricted \write18 enabled.
entering extended mode
(./untitled-3.texname = Amiri-Regular, rootname = Amiri-Regular, pointsize =
mktexmf: empty or non-existent rootfile!

kpathsea: Running mktexmf Amiri-Regular.mf

The command name is C:\TeX\Live\2021\bin\win32\mktexmf
Cannot find Amiri-Regular.mf.

! Font \tenrm=Amiri-Regular.ttf not loadable: Metric (TFM) file or installed fo
nt not found.

\tenrm
l.9 \tenrm
?


--
/** Phil./



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
I get a missing character error for  although I never request that char

Overfull \hbox (11.96289pt too wide) in paragraph at lines 34--34
[] \tenrm #1

\hbox(11.23047+6.34764)x0.0, glue set - 1.0 []

Missing character: There is no 
 (U+) in font [Amiri-Regular.ttf]
!

Overfull \hbox (23.125pt too wide) in paragraph at lines 33--35
[]\tenrm bla‍‍bla

\hbox(11.23047+6.34764)x0.0 []


Overfull \hbox (15.21484pt too wide) in paragraph at lines 33--35
\tenrm bla‍


\hbox(11.23047+6.34764)x0.0 []


Overfull \hbox (11.5625pt too wide) in paragraph at lines 33--35
\tenrm ‍bla

\hbox(11.23047+6.34764)x0.0 []

[0] [0] [0] )
Output written on test-amiri.xdv (3 pages, 652 bytes).
Transcript written on test-amiri.log.


> On 25 March 2021 at 11:24, Bruno Le Floch wrote:
> 
> \tracingonline 1
> \tracinglostchars 2
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Could you add the following two lines to the start of your file and check for
"Missing character" messages?

\tracingonline 1
\tracinglostchars 2



On 3/25/21 11:22 AM, Yannis Haralambous wrote:
> When I run the same file with Amiri:
> 
> % !TeX Program=Ini-XeTeX
> 
> \catcode `\ = 10
> \catcode `\         = 10
> \catcode `\{ = 1
> \catcode `\} = 2
> \catcode `\^ = 7
> \font\tenrm="Amiri-Regular.ttf"
> \tenrm
> 
> \lccode "200D = "200D
> \patterns {^^^^200d1^^^^200d}
> 
> \def \showhyphens 
>         {
>                 \setbox 0 = \vbox 
>                         {
>                                 \parfillskip = 0 pt 
>                                 \hsize = \dimen 10
>                                 \tenrm 
>                                 \pretolerance = -1
>                                 \tolerance = -1
>                                 \hbadness = 0
>                                 \showboxdepth = 0 \ #1
>                         }
>         }
> 
> \lccode "200D = "200D
> \catcode "200D = 11
> bla^^^^200d^^^^200dbla
> \showhyphens {bla^^^^200d^^^^200dbla}
> 
> \end
> 
> I get three pages, one with blabla, one with bla and a box, and another with 
> bla.
> Maybe \showhyphens is not correctly defined in the code above?
> 
>> On 25 March 2021 at 11:20, Philip Taylor wrote:
>>
>> Philip Taylor wrote:
>>
>>> And I see no significant font using Amiri —
>>
>> "no significant /*difference */...".
> 
> 
> 



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
When I run the same file with Amiri:

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"
\tenrm

\lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}

\def \showhyphens 
{
\setbox 0 = \vbox 
{
\parfillskip = 0 pt 
\hsize = \dimen 10
\tenrm 
\pretolerance = -1
\tolerance = -1
\hbadness = 0
\showboxdepth = 0 \ #1
}
}

\lccode "200D = "200D
\catcode "200D = 11
bla^^^^200d^^^^200dbla
\showhyphens {bla^^^^200d^^^^200dbla}

\end

I get three pages, one with blabla, one with bla and a box, and another with 
bla.
Maybe \showhyphens is not correctly defined in the code above?

> On 25 March 2021 at 11:20, Philip Taylor wrote:
> 
> Philip Taylor wrote:
> 
>> And I see no significant font using Amiri —
> 
> "no significant difference ...".






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

Philip Taylor wrote:


And I see no significant font using Amiri —


"no significant /*difference */...".


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor

And I see no significant font using Amiri —

% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font \tenrm = "Amiri"
\tenrm

\lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}

\def \showhyphens
    {
    \setbox 0 = \vbox
    {
    \parfillskip = 0 pt
    \hsize = \dimen 10
    \tenrm
    \pretolerance = -1
    \tolerance = -1
    \hbadness = 0
    \showboxdepth = 0 \ #1
    }
    }

\lccode "200D = "200D
\catcode "200D = 11
bla^^^^200d^^^^200dbla
\showhyphens {bla^^^^200d^^^^200dbla}

\end


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Philip Taylor
Not entirely clear why this generates two pages of output, but it does 
seem to demonstrate the problem —


% !TeX Program=Ini-XeTeX

\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font \tenrm = cmr10
\tenrm

\lccode "200D = "200D
\patterns {^^^^200d1^^^^200d}

\def \showhyphens
    {
    \setbox 0 = \vbox
    {
    \parfillskip = 0 pt
    \hsize = \dimen 10
    \tenrm
    \pretolerance = -1
    \tolerance = -1
    \hbadness = 0
    \showboxdepth = 0 \ #1
    }
    }

\lccode "200D = "200D
\catcode "200D = 11
bla^^^^200d^^^^200dbla
\showhyphens {bla^^^^200d^^^^200dbla}

\end
--
/Philip Taylor/



Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Arthur Rosendahl
On Thu, Mar 25, 2021 at 11:02:37AM +0100, Yannis Haralambous wrote:
> (I would have prepared a minimal plain TeX example with Amiri, but I know
> neither how to load an OpenType font

\font\myfont="Font Name"

or

\font\myfont="[./path/to/font.ttf]"

  Note the syntax: double quotes for font names (as reported by
fontconfig); double quotes with square brackets when you specify the
path to the font file.

> in plain TeX nor how to change writing direction)

  \beginR ... \endR

Arthur
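
A minimal sketch combining the two answers above, assuming the plain format
and an installed font that fontconfig reports as "Amiri" (the font name, the
size, the file name and the sample letters are placeholders). Note that
\beginR and \endR only take effect once \TeXXeTstate has been set to 1.

```tex
% Run with: xetex sketch.tex   (plain format; file name is hypothetical)
\font\myfont="Amiri" at 12pt
\TeXXeTstate=1        % enable the TeX--XeT direction primitives
\myfont
% KAF TEH KAF, written in ^^^^hhhh notation and set right to left
\noindent \beginR ^^^^0643^^^^062a^^^^0643 \endR
\bye
```

To load the font from a file instead, the second syntax shown above
(\font\myfont="[./path/to/font.ttf]") can be substituted on the first line.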


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Hi Bruno,

when I test it with the Amiri font I get the same result, and Amiri contains a 
glyph for the 200d character.

(I would have prepared a minimal plain TeX example with Amiri, but I know 
neither how to load an OpenType font in plain TeX nor how to change the 
writing direction.)

> On 25 March 2021 at 10:58, Bruno Le Floch wrote:
> 
> Hello Yannis,
> 
> On 3/25/21 10:41 AM, Yannis Haralambous wrote:
>> This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021) (preloaded
>> format=plain)
>>  restricted \write18 enabled.
>> (./test.tex
>> Underfull \hbox (badness 1) in paragraph at lines 6--6
>> *[] \tenrm blabla*
>> [1] )
>> (see the transcript file for additional information)
>> Output written on test.xdv (1 page, 236 bytes).
>> Transcript written on test.log.
>> 
>> which means that the pattern has not been applied.
> 
> When running essentially your code, I get a missing character message.  So the
> issue here is simply that the box ends up reading "blabla", so the patterns
> involving 200D are not relevant.  Naively I would expect the pattern to work
> correctly if your font has this character.
> 
> Missing character: There is no ‍ in font cmr10!
> 
> Best regards,
> Bruno






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Bruno Le Floch
Hello Yannis,

On 3/25/21 10:41 AM, Yannis Haralambous wrote:
> This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021) (preloaded
> format=plain)
>  restricted \write18 enabled.
> (./test.tex
> Underfull \hbox (badness 1) in paragraph at lines 6--6
> *[] \tenrm blabla*
> [1] )
> (see the transcript file for additional information)
> Output written on test.xdv (1 page, 236 bytes).
> Transcript written on test.log.
> 
> which means that the pattern has not been applied.

When running essentially your code, I get a missing character message.  So the
issue here is simply that the box ends up reading "blabla", so the patterns
involving 200D are not relevant.  Naively I would expect the pattern to work
correctly if your font has this character.

Missing character: There is no ‍ in font cmr10!

Best regards,
Bruno


Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Yannis Haralambous
Thanks for your reply Jonathan,

whether with catcode 12 or 11, it still doesn't work.

Here is a minimal example:

\lccode"200D="200D
\patterns{
^^^^200d1^^^^200d
}

and then a file

\lccode"200D="200D
\catcode"200D=11
bla^^^^200d^^^^200dbla
\showhyphens{bla^^^^200d^^^^200dbla}
\end

The log file says:

This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021) (preloaded 
format=plain)
 restricted \write18 enabled.
(./test.tex
Underfull \hbox (badness 1) in paragraph at lines 6--6
[] \tenrm blabla
[1] )
(see the transcript file for additional information)
Output written on test.xdv (1 page, 236 bytes).
Transcript written on test.log.

which means that the pattern has not been applied.

> On 25 March 2021 at 10:37, Jonathan Kew wrote:
> 
> On 25/03/2021 09:27, Yannis Haralambous wrote:
>> I noticed that the problem does not come from mapping. When I write
>> ^^^^0643^^^^200d^^^^200d^^^^0643
>> in the text (with \lefthyphenmin=1) and \lccode"200D="200D
>> and when I load a pattern
>> ^^^^200d1^^^^200d
>> I get no hyphenation at all, as if the 200d character is not allowed in
>> hyphenation patterns, even though I took care to specify that it is of
>> \catcode 12 and of \lccode equal to itself…
> 
> Was that a typo? It would need to be \catcode 11 to be eligible for 
> hyphenation, IIRC.
> 
> (There may well be other issues involved, but let's start here.)
> 
>> And I did this in a file with no mapping at all.
>> Any idea where it comes from?
> 






Re: [XeTeX] Follow-up to previous mail about hyphenation

2021-03-25 Thread Jonathan Kew

On 25/03/2021 09:27, Yannis Haralambous wrote:

I noticed that the problem does not come from mapping. When I write

^^^^0643^^^^200d^^^^200d^^^^0643

in the text (with \lefthyphenmin=1) and \lccode"200D="200D

and when I load a pattern

^^^^200d1^^^^200d

I get no hyphenation at all, as if the 200d character is not allowed 
in hyphenation patterns,
even though I took care to specify that it is of \catcode 12 and of 
\lccode equal to itself…


Was that a typo? It would need to be \catcode 11 to be eligible for 
hyphenation, IIRC.


(There may well be other issues involved, but let's start here.)
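
As a starting point, a minimal sketch of the setup being discussed, assuming
it is run under IniTeX (where ^, { and } do not yet have their usual
category codes) and writing U+200D in the ^^^^200d notation. Whether
\catcode 11 is strictly required, as opposed to 12, is left open here; the
sketch simply uses the value suggested above.

```tex
% IniTeX: make U+200D (ZERO WIDTH JOINER) usable in patterns and in words.
\catcode `\^ = 7                 % needed for the ^^^^hhhh notation
\catcode `\{ = 1  \catcode `\} = 2
\catcode "200D = 11              % letter, as suggested above
\lccode  "200D = "200D           % nonzero \lccode: counts as part of a word
\patterns {^^^^200d1^^^^200d}    % allow a break between two adjacent joiners
```

The \lccode assignment has to come before \patterns, since pattern
characters need a nonzero \lccode when the patterns are read.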



And I did this in a file with no mapping at all.

Any idea where it comes from?
