[CC: lualatex-...@tug.org
Please reply to tex-hyphen@tug.org]
Hi,
in the following, I'm only considering LuaTeX with UTF-8 encoded input.
When a ligature character, e.g., ﬁ (U+FB01), is already present in the input
stream, LuaTeX won't hyphenate that word correctly.
\showhyphens{financial ﬁnancial}
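A minimal test file along these lines might look as follows. This is only a sketch, not the attachment from the original message: it assumes the file is saved as UTF-8, that the second word starts with the single character U+FB01, and that fontspec is loaded (the EU2 encoding in the log suggests it was). The hyphenation points end up in the .log file.

```latex
% Compile with lualatex; look for the \showhyphens result in the .log file.
\documentclass{article}
\usepackage{fontspec}   % assumption: OpenType font setup, as the EU2 log line suggests
\begin{document}
% First word: plain letters f + i. Second word: begins with U+FB01 (ﬁ).
\showhyphens{financial ﬁnancial}
\end{document}
```

No language is selected here, so the format's default patterns (normally US English) apply.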
On Wed, Jan 15, 2014 at 6:26 PM, Stephan Hennig wrote:
Hi,
in the following, I'm only considering LuaTeX with UTF-8 encoded input.
When a ligature character, e.g., ﬁ (U+FB01), is already present in the input
stream, LuaTeX won't
However, when activating UK hyphenation patterns the word containing the
ligature is also hyphenated (code attached at the end).
This is LuaTeX, Version beta-0.76.0-2013120414 (rev 4627) (format=lualatex
2013.12.11) 15 JAN 2014 18:20
[...]
[][] \EU2/lmr/m/n/10 fin-an-cial fin-an-cial
\hsize 2mm
financial \par
financial \par
That's more or less what Stephan did. My point is that trying to
hyphenate nancial with the British patterns would explain why
ﬁnancial is hyphenated the way it is -- OK, that's not exactly true,
one needs \lefthyphenmin=1 too.
Arthur
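To try this out, something like the following could be used. It is a sketch under assumptions: babel's british option is taken to select the UK patterns in the local installation, and Xnancial stands in for "unknown character followed by nancial", as elsewhere in this thread.

```latex
% Compile with lualatex; results are written to the .log file.
\documentclass{article}
\usepackage{fontspec}
\usepackage[british]{babel}   % assumption: selects the UK hyphenation patterns
\begin{document}
% Allow a break after a single leading character. This is set after
% \begin{document} because selecting a language resets the hyphenmins.
\lefthyphenmin=1
% The second word begins with U+FB01 (ﬁ), not with the letters f + i.
\showhyphens{Xnancial ﬁnancial}
\end{document}
```

If the explanation above holds, both words should show the same break points after their first character.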
(not addressing anything but what might
be permissible hyphenations.)
On Wed, 15 Jan 2014, Mojca Miklavec wrote:
[...] On
top of that, imagine that there exists a word in a language where
hyphenation between f and i is allowed. If a dummy user provides
text with ligatures, there
Mojca Miklavec wrote:
If a dummy user provides
text with ligatures, there is no way to hyphenate that word properly.
Well, yes; but what if a real user does the same ?! :-)
But nancial isn't hyphenated independently.
Actually it is, since ﬁ doesn't appear in any pattern. Hence
ﬁnancial is hyphenated exactly the same way as Xnancial would, in
both British English and American English (or any other language for
which we have patterns, for that matter).
Am 15.01.2014 19:52, schrieb Stephan Hennig:
Alternative explanation: It could be that incidentally some pattern 1ni
or similar matches. Will investigate.
Indeed (but it's 1na).
Best regards,
Stephan Hennig
boundary letter: '.'
spot mins: 1 3
pattern file:
input shouldn't contain such ligatures and if it does, it might be with a
purpose
I cannot agree with the first part of that statement, although I do with
the second.
You should. Characters such as U+FB01 are deprecated and shouldn't be
used in text.
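If such characters do turn up in the input, one conceivable workaround in LuaLaTeX is to replace them before TeX sees the line. The following is only a sketch: the function name decompose_fi is made up for illustration, only U+FB01 is handled, and it rewrites every occurrence, so it is not suitable when the ligature is in the input on purpose.

```latex
% Compile with lualatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode}
\begin{luacode}
-- Hypothetical helper: replace U+FB01 by the two letters f + i.
-- The parentheses drop gsub's second return value (the match count).
local function decompose_fi(line)
  return (line:gsub("ﬁ", "fi"))
end
-- process_input_buffer is called for every input line read after this point.
luatexbase.add_to_callback("process_input_buffer", decompose_fi, "decompose_fi")
\end{luacode}
\begin{document}
% This word is typed with U+FB01; TeX receives it as plain "financial".
\showhyphens{ﬁnancial}
\end{document}
```

With the replacement in place, the word goes through the ordinary patterns, and the font's own ligature mechanism can still form the ligature in the output.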
Arthur says
In American English Xnancial is hyphenated as X-nan-cial. In
British English it's hyphenated as Xn-an-cial. Hence ﬁ-nan-cial and
ﬁn-an-cial, respectively.
in u.s. english, no it's not, because
\lefthyphenmin=2 . to demonstrate:
*\showhyphens{iconography}
Because of the patterns, or because of \LHM \RHM ?
Because of the combination of all these settings. I just wanted to
demonstrate that you could hyphenate xnancial with the American
patterns.
Arthur
in u.s. english, no it's not, because
\lefthyphenmin=2 .
That's what it has been devised for, but by the same token you may
force the patterns to produce x-nan-cial, which is not an English word
anyway. That's what I was trying to say.
Arthur
Am 15.01.2014 19:30, schrieb Arthur Reutenauer:
My point is that trying to hyphenate nancial with the British
patterns would explain why ﬁnancial is hyphenated the way it is --
OK, that's not exactly true, one needs \lefthyphenmin=1 too.
But nancial isn't hyphenated independently. Then the
Right, but if you have the same values of \LHM and \RHM
for Am.E and Br.E, do you still get the hyphenation
you adduced earlier ?
** Phil.
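One way to check this would be to run both pattern sets with identical minimum values. The following is a sketch, assuming babel's american and british options map to the US and UK patterns in the local installation; the values 1 and 3 are chosen only for the experiment.

```latex
% Compile with lualatex; compare the two \showhyphens results in the .log file.
\documentclass{article}
\usepackage{fontspec}
\usepackage[american,british]{babel}
\begin{document}
\selectlanguage{american}
% \selectlanguage resets the hyphenmins, so they are set again afterwards.
\lefthyphenmin=1 \righthyphenmin=3
\showhyphens{Xnancial ﬁnancial}   % second word begins with U+FB01

\selectlanguage{british}
\lefthyphenmin=1 \righthyphenmin=3
\showhyphens{Xnancial ﬁnancial}
\end{document}
```

Any remaining difference between the two log entries is then due to the patterns alone.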
Arthur Reutenauer wrote:
Because of the patterns, or because of \LHM \RHM ?
Because of the combination of all these settings. I just wanted
Right, but if you have the same values of \LHM and \RHM
for Am.E and Br.E, do you still get the hyphenation
you adduced earlier ?
No, and it's not relevant. Again, xnancial isn't an English word,
and neither is ﬁnancial (when encoded using the Unicode ligature, as
here). I was just showing
So, ﬁ doesn't match any patterns,
but is there to satisfy \lefthyphenmin=1, and that's sufficient in this
case. Thanks all!
I'm glad that satisfies you. Now, do you understand that
1. It is a coincidence (a pretty common one, but a coincidence
Arthur Reutenauer wrote:
You should. Characters such as U+FB01 are deprecated and shouldn't be
used in text.
/Characters/ ... : yes. But consider a Unicode-in/Unicode-out
preprocessor; might it not generate ﬁ in the output stream,
since it thinks it is generating glyphs, yet in a
Am 15.01.2014 19:18, schrieb Arthur Reutenauer:
However, when activating UK hyphenation patterns the word containing the
ligature is also hyphenated (code attached at the end).
This is LuaTeX, Version beta-0.76.0-2013120414 (rev 4627)
(format=lualatex 2013.12.11) 15 JAN 2014 18:20
On Wed, Jan 15, 2014 at 06:22:22PM +, Philip Taylor wrote:
Hans Hagen wrote:
input shouldn't contain such ligatures and if it does, it might be with a
purpose
I cannot agree with the first part of that statement, although I do with
the second. In a Unicode world (into which we
/Characters/ ... : yes. But consider a Unicode-in/Unicode-out
preprocessor; might it not generate ﬁ in the output stream,
If it’s outputting characters, it shouldn’t generate ﬁ or such like.
These characters have been introduced in Unicode for compatibility with
existing standards (so as to
you were shewing the effects of the patterns /in combination
with values of \LHM and \RHM that were different for the two
dialects/ on those different non-words.
Yes, that's exactly what I wrote earlier.
Arthur
Am 15.01.2014 21:42, schrieb Arthur Reutenauer:
So, ﬁ doesn't match any patterns,
but is there to satisfy \lefthyphenmin=1, and that's sufficient in this
case. Thanks all!
I'm glad that satisfies you.
Not me, but LuaTeX.
My reply to Mojca got
b...@ams.org wrote:
(and a hyphen that ought to be there,
before the n, is missing. into the
exception list that goes.)
/Before/ the n ? You say Aye-koh-nogg-rə-phee and
not Aye-konn-ogg-rə-fee ?
** Phil.