Yannis Haralambous wrote:
In fact, Arabic is not hyphenated.
That is presumably because of the existence of the /kashida/, Yanni.
What is interesting is that the W3C notes that the Arabic /script/ (as
opposed to the /language/) may be hyphenated, and offers Uyghur as
example —
When
In fact, Arabic is not hyphenated.
> On 25 March 2021 at 21:14, Philip Taylor wrote:
Not being an Arabist, I have no idea whether the output from the
following is correct or not, but I thought that it would be fun to run
Bruno's code on some 'real' Arabic rather than just on sequences —
% !TeX Program = Ini-XeTeX
\let \dump = \relax
\input xelatex.ini
\begingroup
What is the reason for the \makeatletter at line~29, Bruno ? It appears
to behave identically without it.
--
/** Phil./
For those who want a complete working example based on Yannis' code, here is
one, to be compiled with "xetex -ini -etex test.tex", in which the Arabic
"word" is hyphenated at every letter.
Thank you Yannis and others.
Bruno
On 3/25/21 6:50 PM, Yannis Haralambous wrote:
Silly of me: once I also added the \lccode information in the TeX file… it works. And not only once in a word, but at all hyphenation points. I was convinced I had done so, but apparently I hadn't. Anyway, now everything suddenly seems to work. Totally
I posted a question on tex.stackexchange.com, let us see if somebody can give
us more input:
https://tex.stackexchange.com/questions/588952/zwj-not-working-in-hyphenation-patterns-in-xelatex
> On 25 March 2021 at 17:30, Bruno Le Floch wrote:
>
> Check \the\catcode"200D perhaps, it does not
Even when I reset \catcode to 11, the behavior is the same. Only Arabic words containing 200d are not hyphenated; others are hyphenated as expected.
test-idea.pdf
Description: Adobe PDF document
On 25 March 2021 at 17:30, Bruno Le Floch wrote: Check \the\catcode"200D
I checked it, \catcode"200D is 12
> On 25 March 2021 at 17:30, Bruno Le Floch wrote:
Check \the\catcode"200D perhaps, it does not seem to be set in your example
document. Note that the LaTeX format might reset that catcode.
The other mystery is why Arabic words seem to only be hyphenated once.
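A minimal sketch of the check suggested above. It assumes that { and } already have their usual category codes (as in the ini files quoted in this thread); \message and \catcode are primitives, so nothing else needs to be loaded. The second assignment only illustrates resetting the value: elsewhere in the thread Yannis reports a value of 12 and no change in behaviour after setting it to 11.

```tex
% Editorial sketch, not taken from the thread's attachments.
% Print the current category code of U+200D (ZERO WIDTH JOINER)
% to the terminal and the log file.
\message{catcode of U+200D = \the\catcode"200D}

% Assumption for illustration only: treat ZWJ as a letter (catcode 11).
\catcode"200D=11
\message{catcode of U+200D is now \the\catcode"200D}
```

Note that TeX's hyphenation routine decides what counts as a letter from \lccode rather than \catcode, which may be why changing the catcode alone made no difference.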
On 3/25/21 5:17 PM, Yannis Haralambous wrote:
Well it is neither polyglossia nor fontspec, because I have run the following
And now, thanks to Jonathan, only one page —
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\# = 6
\catcode `\^ = 7
\catcode `\^^I = 10
\catcode `\% = 14
\let \bgroup = {
\let \egroup = }
\let \endgraf = \par
\tracingonline = 1
\tracinglostchars = 2
Jonathan Kew wrote:
I don't see what \dimen 10 is doing in any of this. It never seems to
get set, so I assume it remains zero.
Ah, I took this from plain.tex's definition of \maxdimen, but it is
possible (nay, probable, if not absolutely certain) that in the absence
of plain.tex being
I don't see what \dimen 10 is doing in any of this. It never seems to
get set, so I assume it remains zero.
If so, you have \vsize zero, so minimal pages must be expected.
On 25/03/2021 12:28, Philip Taylor wrote:
Jonathan Kew wrote:
See
I checked the \catcode of 200d, it is 12.
As for \XeTeXcharclass and \XeTeXinterchartoks they are used in bidi.sty but
not for 200d
I have found other pattern files where 200d is used, for example in Assamese:
\patterns{
% GENERAL RULE
% Do not break either side of ZERO-WIDTH JOINER
Jonathan Kew wrote:
See https://tug.org/pipermail/xetex/2014-January/025129.html
Thank you, Jonathan. Now formatted and included in the next iteration
(code below, with "showhyphens.tex" attached). I /believe/ that
"showhyphens.tex" is functionally equivalent to David's code, but as he
Thu, 25 March 2021 at 12:55, Jonathan Kew wrote:
On 25/03/2021 11:37, Yannis Haralambous wrote:
OK, this may be a bug, but it doesn't explain why in
polyglossia+fontspec I get no hyphenation at all. I should get at least
one hyphenation in each word, no?
My guess is that polyglossia (or something else in latex?) thinks it
knows best how to
OK, this may be a bug, but it doesn't explain why in polyglossia+fontspec I get
no hyphenation at all. I should get at least one hyphenation in each word, no?
> On 25 March 2021 at 12:24, Jonathan Kew wrote:
>
> The attached sample (to be run with "xetex -ini -etex") seems to show that
>
On 25/03/2021 11:24, Yannis Haralambous wrote:
Well if the text width is 0pt it will only break once, no?
Normally, I'd expect a break at every opportunity if the width is 0pt.
In plain (xe)tex,
\hsize 0pt
\noindent \hskip 0pt supercalifragilisticexpialidocious \par
gives me essentially
The attached sample (to be run with "xetex -ini -etex") seems to show
that it's possible to get this working, at least with the (old-ish)
xetex version I have on hand.
It does appear to only want to use the first available hyphenation
position in the Arabic "words"; offhand I'm not sure why
Well if the text width is 0pt it will only break once, no?
I am beginning to believe that in plain TeX 200d is treated just like any other
character, so the problem lies not in XeTeX but in polyglossia+fontspec:
somewhere there is some special treatment of 200d so that my LaTeX code doesn't
>> So what am I missing in my code, Jonathan ? I have added a \hyphenchar,
>> removed the pointless (and dysfunctional) \showhyphens, but still get no
>> hyphenation —
>>
>> % !TeX Program=Ini-XeTeX
>>
>> \catcode `\ = 10
>> \catcode `\ = 10
>> \catcode `\{ = 1
>> \catcode `\} = 2
>>
On 25/03/2021 10:52, Philip Taylor wrote:
Jonathan Kew wrote:
Indeed, that \showhyphens will not work with native opentype fonts.
This is a known difference between using TFM-based vs OT fonts; it's
been discussed (and an alternative shown) on the list somewhere in the
distant past.
There
Maybe the fact that it is the first word and hence is not preceded by a glue?
> On 25 March 2021 at 12:10, Philip Taylor wrote:
>
> Jonathan Kew wrote:
>
>> The output you're getting shows that the hyphenation is in fact happening as
>> expected. (To get a hyphen rather than a box, you
Jonathan Kew wrote:
The output you're getting shows that the hyphenation is in fact
happening as expected. (To get a hyphen rather than a box, you need to
set the font's \hyphenchar appropriately.)
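A minimal sketch of setting the hyphen character, under two assumptions: the font is loaded with the system-font syntax used elsewhere in this thread ("Amiri"), and the run is under -ini, where \defaulthyphenchar starts at 0 so that newly loaded fonts do not get a usable hyphen unless one is set explicitly.

```tex
% Editorial sketch, not taken from the thread's attachments.
\font\tenrm="Amiri"     % system-font syntax, as used elsewhere in the thread
\hyphenchar\tenrm=`\-   % use U+002D HYPHEN-MINUS for discretionary hyphens
\tenrm
```

The assignment applies to the font itself, not to the current group, so it should be made once, after the \font declaration.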
So what am I missing in my code, Jonathan ? I have added a \hyphenchar,
removed the
Jonathan Kew wrote:
Indeed, that \showhyphens will not work with native opentype fonts.
This is a known difference between using TFM-based vs OT fonts; it's
been discussed (and an alternative shown) on the list somewhere in the
distant past.
There is code, by Enrico Gregorio, for a
Here is my code:
\documentclass{article}
\usepackage{polyglossia,fontspec}
\setdefaultlanguage{arabic}
\newfontfamily{\arabicfont}[Script=Arabic,Extension=.ttf,Scale=1.2]{Amiri-Regular}
\textwidth1cm
\begin{document}
\large
\lefthyphenmin1
\righthyphenmin1
On 25/03/2021 10:22, Yannis Haralambous wrote:
When I run the same file with Amiri:
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"
\tenrm
\lccode "200D = "200D
\patterns {200d1200d}
Exactly the same as I do. But where does the U+ character come from?
And why isn't blabla hyphenated?
> On 25 March 2021 at 11:38, Philip Taylor wrote:
>
>> \hbox(11.24+6.33998)x0.0, glue set - 1.0 []
>>
>> Missing character: There is no
>> (U+) in font [./Amiri-Regular!
>>
>>
Yannis Haralambous wrote:
No, I have the font in the same directory
Still doesn't work for me after copying the four ttfs there — instead,
if I don't want to use system font syntax, I have to use —
\font \tenrm = [./Amiri-Regular.ttf]
Then I get the following reports —
This is XeTeX,
Yannis Haralambous wrote:
No, I have the font in the same directory
OK, I installed it as a system font. I will try copying the .ttfs to
the source directory.
No, I have the font in the same directory
> On 25 March 2021 at 11:30, Philip Taylor wrote:
>
> Yannis Haralambous wrote:
>> When I run the same file with Amiri:
>>
>> % !TeX Program=Ini-XeTeX
>>
>> \catcode `\ = 10
>> \catcode `\ = 10
>> \catcode `\{ = 1
>> \catcode `\} = 2
>>
Yannis Haralambous wrote:
When I run the same file with Amiri:
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"
Does your hacked version of my code not generate this error for you, Yanni ?
I get a missing character error for although I never request that char
Overfull \hbox (11.96289pt too wide) in paragraph at lines 34--34
[] \tenrm #1
\hbox(11.23047+6.34764)x0.0, glue set - 1.0 []
Missing character: There is no
(U+) in font [Amiri-Regular.ttf]
!
Overfull \hbox
Could you add the following two lines to the start of your file and check for
"Missing character" messages?
\tracingonline 1
\tracinglostchars 2
On 3/25/21 11:22 AM, Yannis Haralambous wrote:
> When I run the same file with Amiri:
>
> % !TeX Program=Ini-XeTeX
>
> \catcode `\ = 10
> \catcode
When I run the same file with Amiri:
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font\tenrm="Amiri-Regular.ttf"
\tenrm
\lccode "200D = "200D
\patterns {200d1200d}
\def \showhyphens
{
Philip Taylor wrote:
And I see no significant font using Amiri —
"no significant /*difference*/ ...".
And I see no significant font using Amiri —
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font \tenrm = "Amiri"
\tenrm
\lccode "200D = "200D
\patterns {200d1200d}
\def \showhyphens
{
Not entirely clear why this generates two pages of output, but it does
seem to demonstrate the problem —
% !TeX Program=Ini-XeTeX
\catcode `\ = 10
\catcode `\ = 10
\catcode `\{ = 1
\catcode `\} = 2
\catcode `\^ = 7
\font \tenrm = cmr10
\tenrm
\lccode "200D = "200D
\patterns
On Thu, Mar 25, 2021 at 11:02:37AM +0100, Yannis Haralambous wrote:
> (I would have prepared a minimal plain TeX example with Amiri but I know
> neither how to load an OpenType font
\font\myfont="Font Name"
or
\font\myfont="[./path/to/font.ttf]"
Note the syntax: double
Hi Bruno,
when I test it with the Amiri font I get the same result, and Amiri contains a
glyph for the 200d character.
(I would have prepared a minimal plain TeX example with Amiri but I know
neither how to load an OpenType font
in plain TeX nor how to change writing direction)
> Le
Hello Yannis,
On 3/25/21 10:41 AM, Yannis Haralambous wrote:
> This is XeTeX, Version 3.141592653-2.6-0.93 (TeX Live 2021) (preloaded
> format=plain)
> restricted \write18 enabled.
> (./test.tex
> Underfull \hbox (badness 1) in paragraph at lines 6--6
> *[] \tenrm blabla*
> [1] )
> (see
Thanks for your reply Jonathan,
whether with catcode 12 or 11, it still doesn't work.
Here is a minimal example:
\lccode"200D="200D
\patterns{
200d1200d
}
and then a file
\lccode"200D="200D
\catcode"200D=11
bla200d200dbla
\showhyphens{bla200d200dbla}
\end
The log file
On 25/03/2021 09:27, Yannis Haralambous wrote:
I noticed that the problem does not come from mapping. When I write
0643200d200d0643
in the text (with \lefthyphenmin=1) and \lccode"200D="200D
and when I load a pattern
200d1200d
I get no hyphenation at all, as if the