Re: [luatex] searching a "marktobase" lookup example
> Looks good to me. One other tip is that you can fake a "continue"
> statement with "goto", which will let you reduce the amount of
> nested "if"s: [...]
Thanks, I will remember this for future code. For my snippet, though,
I think the current solution is legible enough.
> Another suggestion is to use a more descriptive name for your
> "luaotfload.patch_font" callback, since "patch-fonts" isn't the most
> informative. "add-to-mark" would probably be a good name.
Yeah, I've missed that. Will change locally.
Werner
Re: [luatex] searching a "marktobase" lookup example
> Lua does have a "debug" module
>
> https://www.lua.org/manual/5.3/manual.html#6.10
>
> (that you can enable in LuaTeX with "--luadebug"), but I've never
> found it helpful for debugging. Usually I just manually add
> "print(var)" and/or "inspect(var)" calls to the source code.
Yeah :-)
I've come up with the attached solution, which works fine for the
small example. Not sure whether this is the correct way, though.
Werner
\documentclass{article}
\usepackage{luacode}
\usepackage{fontspec}
\begin{luacode}
local report = luaotfload.log.report
local patch_functions = {}
local function glyph_to_unicode(glyph, unicodes, fontname)
local unicode = unicodes[glyph]
if not unicode then
report("both", 0, "add_to_mark",
"error: there is no glyph '%s' in font '%s'",
glyph, fontname)
end
return unicode
end
local function missing_base_glyph(base_glyph, fontname)
report("both", 0, "add_to_mark",
"error: base glyph '%s' not in 'mark' feature of font '%s'",
base_glyph, fontname)
end
local function missing_mark_glyph(mark_glyph, fontname)
report("both", 0, "add_to_mark",
"error: mark glyph '%s' not in 'mark' feature of font '%s'",
mark_glyph, fontname)
end
-- Add glyph with name ACC as a diacritic to an existing 'mark'
-- feature of font file FONT. The arguments BASE and MARK specify
-- names of glyphs that represent the desired anchor class that
-- connects base and mark glyphs, respectively, to which ACC should
-- be added. The coordinates X and Y give the position of the
-- anchor for ACC.
--
-- FONT should be the base name of an OpenType font, i.e., a file
-- name without a path (example: `foo.otf`).
--
-- BASE and MARK must exist in FONT, and there must be an entry in
-- the 'mark' feature that pairs them. ACC must exist in FONT, too.
--
-- This function can be used repeatedly. Note, however, that a mark
-- glyph can only be part of a single anchor class. As a
-- consequence, a second call to this function with the same
-- argument ACC that results in a different anchor class overrides
-- the result of the first call.
function add_to_mark_feature(font, acc, base, mark, x, y)
if not patch_functions[font] then
patch_functions[font] = {}
end
local function patch_function(fontdata)
local fd = fontdata
local path = fd.specification.filename
local fn = file.basename(path)
local uni = fd.resources.unicodes
if not uni then
report("both", 0, "add_to_mark",
"error: 'unicodes' subtable missing;"
.. " cannot map glyph names to Unicode")
return
end
local u_acc = glyph_to_unicode(acc, uni, fn)
local u_base = glyph_to_unicode(base, uni, fn)
local u_mark = glyph_to_unicode(mark, uni, fn)
if not (u_acc and u_base and u_mark) then
return
end
local i = 1
while true do
local seq = fd.resources.sequences[i]
if not seq then
break
elseif (seq.type == "gpos_mark2base"
and seq.features["mark"]) then
local st = seq.steps
if st then
for j = 1, #st do
local cov = st[j].coverage
if cov then
local cov_mark = cov[u_mark]
if cov_mark then
local cov_base = cov_mark[1][u_base]
if cov_base then
local bc = st[j].baseclasses
for k = 1, #bc do
local bck = bc[k]
if bck[u_base] then
report("log", 0, "add_to_mark",
"found base-mark glyph combination '%s+%s'"
.. " in 'mark' feature of font '%s'",
base, mark, fn)
-- Report OpenType value for anchor class, not
-- the one used in luatex.
report("log", 0, "add_to_mark",
"adding glyph '%s' to anchor class %d",
acc, k - 1)
cov[u_acc] = {bck, {x, y}}
return
end
end
else
missing_base_glyph(base, fn)
return
end
else
missing_mark_glyph(mark, fn)
return
end
end
end
end
end
i = i + 1
end
report("log", 0, "add_to_mark",
"no 'mark' feature in font '%s'",
fn)
return
end
table.insert(patch_functions[font], patch_function)
luatexbase.add_to_callback(
"luaotfload.patch_font",
function(fontdata)
local path = fontdata.specification.filename
Re: [luatex] searching a "marktobase" lookup example
> There's not enough context in the attached diff to actually see
> anything.
Oh, sorry, I thought it was clear that the diff is related to
`tfmdata.resources.sequences[50].steps[1].coverage[868]`.
> Can you try again but with "table.serialize" replaced with [...]
Thanks! [It took me a while to find out that I have to use the
`--luadebug` command-line option so that `debug.getinfo` is defined.]
The diff is attached (compressed this time, calling `diff` on sorted
input).
>> the tiny modification in the font's 'mark2base' lookup explodes,
>> causing four large, identical changes to `tfmdata` subtables.
>
> Lua tables are reference types, so those 4 subtables may actually be
> the same table, meaning that a single modification might change all
> of them.
Yes, it looks like that.
> Also, most of the subtables are unused, so you probably only need to
> change a single value anyways.
I don't think so. In `tfmdata`, the data from the 'mark2base' lookup
is no longer represented in a compact form. Instead, it is expanded
so that each accent glyph has all the necessary deltas for all base
glyphs. While this speeds up the processing of the 'mark2base'
lookup, it makes manipulation much more complicated. Of course I
could restrict the necessary changes of the diacritic for combinations
with base glyphs 'o' and 'u', say, omitting all other combinations,
but this is missing the point of what I want to do – for such a
restricted solution I could do what Hans has suggested, namely to
construct a simple 'kern' feature.
I think everything boils down to the question whether LuaTeX by
default loads a font internally with
```
f = fontloader.open("EBGaramond-Regular.otf")
fonttable = fontloader.to_table(f)
```
then converting `fonttable` to a `tfmdata` structure. I want to
access `fonttable` before this step happens. How can I do that?
>> In other words, it is unrealistic to modify `tfmdata` directly, and
>> I need to hook into LuaTeX's OpenType font handler one step
>> earlier, AFAICS. How can I do that?
>
> With the caveat that this depends on internal implementation
> details, and is therefore unsupported and could change at any time,
> "fonts.handlers.otf.readers.loadfont" is the earliest point that you
> can modify the font data: [...]
Thanks, but again, this code is manipulating `tfmdata` AFAICS, so
there is no advantage w.r.t. compactness of the 'mark2base' lookup
data.
Werner
ebgaramond.dump.diff.xz
Description: Binary data
Re: [luatex] searching a "marktobase" lookup example
>> With
>>
>> ```
>> f = fontloader.open("EBGaramond-Regular.otf")
>> fonttable = fontloader.to_table(f)
>> ```
>>
>> the OTF file modified as described contains the following data
>> added to `fonttable.glyphs[741]` (which is for glyph `uni0364`):
>>
>> ```
>> ["anchors"] = {
>> ["mark"] = {
>> ["Anchor-0"] = {
>> ["x"] = 115,
>> ["y"] = 440,
>> ["lig_index"] = 0 }
>> }
>> },
>> ["class"] = mark,
>> ```
>>
>> It seems that I have to do a brute-force approach by directly
>> modifying `fonttable`. My question is: How can I insert this data
>> to the `fonttable` structure whenever the (unmodified)
>> `EBGaramond-Regular.otf` gets loaded? Is there a hook for that?
>
> Something like the following should work (untested):
>
> local patch_functions = {}
>
> patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
> tfmdata.some_key = "some value"
> end
>
> luatexbase.add_to_callback(
> "luaotfload.patch_font",
> function(tfmdata, specification, font_id)
> local path = tfmdata.specification.filename
> local filename = file.basename(path)
> local patch_function = patch_functions[filename]
>
> if not patch_function then
> return
> end
> patch_function(tfmdata)
> end,
> "patch-fonts"
> )
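The dispatch idea in the quoted snippet can be tried outside LuaTeX. What follows is only a minimal sketch in plain Lua: `basename` is a stand-in for `file.basename`, `register_patch` and `dispatch` are made-up names, and `fake_tfmdata` is a hypothetical substitute for the table that luaotfload passes to the callback. None of these is luaotfload API.

```lua
-- Stand-in for file.basename (assumption: strip everything up to the last slash).
local function basename(path)
  return (path:gsub("^.*[/\\]", ""))
end

local patch_functions = {}

-- Register a patch function for a font file; several may be registered per font.
local function register_patch(fontfile, fn)
  patch_functions[fontfile] = patch_functions[fontfile] or {}
  table.insert(patch_functions[fontfile], fn)
end

-- Body one would hand to the "luaotfload.patch_font" callback.
-- Returns the number of patch functions that were applied.
local function dispatch(tfmdata)
  local filename = basename(tfmdata.specification.filename)
  local fns = patch_functions[filename]
  if not fns then
    return 0
  end
  for _, fn in ipairs(fns) do
    fn(tfmdata)
  end
  return #fns
end

register_patch("EBGaramond-Regular.otf", function(t)
  t.some_key = "some value"
end)

-- Hypothetical font table; the path is invented.
local fake_tfmdata = {
  specification = { filename = "/usr/share/fonts/EBGaramond-Regular.otf" },
}
local applied = dispatch(fake_tfmdata)
print(applied, fake_tfmdata.some_key)
```

Keeping a list of functions per file name, rather than a single function, allows more than one patch to be registered for the same font.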
Thanks! However, how can I access the `glyphs` table via `tfmdata`?
Or do I have to manipulate another structure? I looked into
`tfmdata.resources.sequences[50].steps[1].coverage[868]`, but this is
much more low-level and quite ugly to work with IMHO since I have to
explicitly compute the offsets between the base glyph and the
diacritic...
Werner
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Tue, 2025-09-09 at 05:16 +, Werner LEMBERG wrote:
> using the original and patched OpenType font produces the attached
> difference –
There's not enough context in the attached diff to actually see
anything. Can you try again but with "table.serialize" replaced with
"dot_inspect":
function dot_inspect(value, parents)
    if not parents then
        parents = {}
    end
    local literal = false
    if type(value) == "table" then
        local found = false
        for k, v in pairs(value) do
            local k = k
            found = true
            if type(k) ~= "string" then
                k = "[" .. tostring(k) .. "]"
            end
            dot_inspect(v, table.imerged(parents, {k}))
        end
        if found then
            return
        else
            value = "(empty table)"
            literal = true
        end
    elseif type(value) == "function" then
        local info = debug.getinfo(value)
        value = info.source .. ":" .. info.linedefined
        literal = true
    end
    if literal or (type(value) == "userdata") then
        value = tostring(value)
    else
        value = string.format("%q", value)
    end
    value = table.concat(parents, ".") .. " = " .. value
    print(value)
end
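For experimenting outside LuaTeX, where `table.imerged` is not available, the same idea can be sketched in plain Lua: flatten a nested table into one `path = value` line per leaf, then sort the lines so that `diff` output stays stable. The name `flatten` and the sample table are made up for illustration; this is not a drop-in replacement for the function above (it does not handle functions specially, for instance).

```lua
-- Flatten a nested table into "a.b.[1] = value" lines, one per leaf.
local function flatten(value, path, out)
  path = path or {}
  out = out or {}
  if type(value) == "table" then
    local empty = true
    for k, v in pairs(value) do
      empty = false
      local key = type(k) == "string" and k or ("[" .. tostring(k) .. "]")
      path[#path + 1] = key      -- descend
      flatten(v, path, out)
      path[#path] = nil          -- and come back up
    end
    if empty then
      out[#out + 1] = table.concat(path, ".") .. " = (empty table)"
    end
  elseif type(value) == "string" then
    out[#out + 1] = table.concat(path, ".") .. " = " .. string.format("%q", value)
  else
    out[#out + 1] = table.concat(path, ".") .. " = " .. tostring(value)
  end
  return out
end

-- Invented sample data, loosely shaped like a font table.
local sample = {
  resources = { sequences = { { type = "gpos_mark2base", steps = {} } } },
  some_key = "some value",
}
local lines = flatten(sample)
table.sort(lines)   -- pairs() order is unspecified, so sort before diffing
for _, line in ipairs(lines) do
  print(line)
end
```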
> the tiny modification in the font's 'mark2base' lookup
> explodes, causing four large, identical changes to `tfmdata`
> subtables.
Lua tables are reference types, so those 4 subtables may actually be the
same table, meaning that a single modification might change all of them.
This is just a guess though; I may be wrong here.
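A small plain-Lua illustration of that reference behaviour (not of the actual `tfmdata` layout; the table names are invented, and the anchor values are merely borrowed from the diff):

```lua
-- One anchor table, referenced from four places.
local shared = { [111] = { 254, 440 } }
local subtables = { shared, shared, shared, shared }

-- A single write through any one reference ...
subtables[1][117] = { 252, 440 }

-- ... is visible through all of them, because they are the same table.
for i, t in ipairs(subtables) do
  print(i, t[117][1], rawequal(t, shared))
end

-- A copy breaks the sharing at the top level only (shallow copy:
-- the inner anchor tables are still the same objects).
local copy = {}
for k, v in pairs(shared) do
  copy[k] = v
end
copy[97] = { 201, 440 }
print(shared[97], copy[111] == shared[111])
```

If a serialized dump shows the same change four times, this kind of sharing is one possible explanation, since a serializer may write out each reference in full.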
Also, most of the subtables are unused, so you probably only need to
change a single value anyways.
> In other words, it is unrealistic to modify `tfmdata`
> directly, and I need to hook into LuaTeX's OpenType font handler one
> step earlier, AFAICS. How can I do that?
With the caveat that this depends on internal implementation details,
and is therefore unsupported and could change at any time,
"fonts.handlers.otf.readers.loadfont" is the earliest point that you can
modify the font data:
\input{luaotfload.sty}
\directlua{
    --[[ Force the font to be reloaded every time; only needed for testing.
    ]]
    do
        local saved = luaotfload.fontloader.containers.read
        function luaotfload.fontloader.containers.read(...)
            local tfmdata = saved(...)
            if tfmdata and
               file.basename(tfmdata.resources.filename) ==
               "EBGaramond-Regular.otf"
            then
                tfmdata.tableversion = -1
            end
            return tfmdata
        end
    end
    do
        local saved = fonts.handlers.otf.readers.loadfont
        function fonts.handlers.otf.readers.loadfont(...)
            local tfmdata = saved(...)
            --[[ Do something with the font data here. ]]
            inspect {
                ["in"] = { ... },
                ["out"] = tfmdata,
            }
            return tfmdata
        end
    end
}
\font\ebgaramond={file:EBGaramond-Regular.otf} at 12pt
\ebgaramond Hello, world!
\bye
The table is actually quite a bit larger and more complicated than the
one in "luaotfload.patch_font", so this probably isn't very useful for
you.
Thanks,
-- Max
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Tue, 2025-09-16 at 08:02 +, Werner LEMBERG wrote:
> > Attached is an improved version.
>
> As usual: right after sending an improved version I find a problem :-)
>
> The old version only checked the first lookup of a 'mark' feature;
> this is now fixed.
Looks good to me. One other tip is that you can fake a "continue"
statement with "goto", which will let you reduce the amount of nested
"if"s:
    for _, step in ipairs(sequence.steps) do
        local coverage = step.coverage
        if not coverage then
            goto continue
        end
        local coverage_mark = coverage[uni_mark]
        if not coverage_mark then
            goto continue
        end
        mark_glyph_in_coverage = true
        local coverage_base = coverage_mark[1][uni_base]
        if not coverage_base then
            goto continue
        end
        base_glyph_in_coverage = true
        for i, baseclass in ipairs(step.baseclasses) do
            if baseclass[uni_base] then
                report("log", 0, "add_to_mark",
                       "found base-mark glyph combination '%s+%s'"
                       .. " in 'mark' feature of font '%s'",
                       base, mark, filename)
                -- Report OpenType value for anchor class, not
                -- the one used in luatex.
                report("log", 0, "add_to_mark",
                       "adding glyph '%s' to anchor class %d",
                       acc, i - 1)
                coverage[uni_acc] = {baseclass, {x, y}}
                return
            end
        end
        ::continue::
    end
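The same pattern can be tried in a standalone interpreter (Lua 5.2 or later, since "goto" is not available in 5.1). The following is only a sketch on mock data: the `steps` table imitates the layout used in the snippet above, and the code points and most anchor values are invented.

```lua
-- Mock "steps"; the layout mirrors the snippet above, the numbers are invented.
local steps = {
  { },                                          -- no coverage at all
  { coverage = { [999] = { {}, { 0, 0 } } } },  -- coverage, but not for our mark
  { coverage = { [776] = { { [111] = { 254, 440 } }, { 100, 440 } } } },
}

local uni_mark, uni_base = 776, 111
local found_in_step

for index, step in ipairs(steps) do
  local coverage = step.coverage
  if not coverage then
    goto continue
  end
  local coverage_mark = coverage[uni_mark]
  if not coverage_mark then
    goto continue
  end
  if coverage_mark[1][uni_base] then
    found_in_step = index
    break
  end
  -- The label must be the last statement of the loop body; only then may
  -- the jumps above skip over the "local" declarations.
  ::continue::
end

print(found_in_step)
```

The label has to sit at the very end of the loop body: Lua rejects a "goto" that jumps into the scope of a local variable, except when the label is followed only by the end of the block.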
Another suggestion is to use a more descriptive name for your
"luaotfload.patch_font" callback, since "patch-fonts" isn't the most
informative. "add-to-mark" would probably be a good name.
But otherwise, this looks great.
Thanks,
-- Max
Re: [luatex] searching a "marktobase" lookup example
> Attached is an improved version.
As usual: right after sending an improved version I find a problem :-)
The old version only checked the first lookup of a 'mark' feature;
this is now fixed.
Werner
% version 2025-Sep-16
\documentclass{article}
\usepackage{luacode}
\usepackage{fontspec}
\begin{luacode*}
local report = luaotfload.log.report
local function glyph_to_unicode(glyph, unicodes, fontname)
local unicode = unicodes[glyph]
if not unicode then
report("both", 0, "add_to_mark",
"error: there is no glyph '%s' in font '%s'",
glyph, fontname)
end
return unicode
end
local function missing_base_glyph(base_glyph, fontname)
report("both", 0, "add_to_mark",
"error: base glyph '%s' not in 'mark' feature of font '%s'",
base_glyph, fontname)
end
local function missing_mark_glyph(mark_glyph, fontname)
report("both", 0, "add_to_mark",
"error: mark glyph '%s' not in 'mark' feature of font '%s'",
mark_glyph, fontname)
end
local patch_functions = {}
-- Add glyph with name ACC as a diacritic to an existing 'mark'
-- feature of font file FONT. The arguments BASE and MARK specify
-- names of glyphs that represent the desired anchor class that
-- connects base and mark glyphs, respectively, to which ACC should
-- be added. The coordinates X and Y give the position of the
-- anchor for ACC.
--
-- FONT should be the base name of an OpenType font, i.e., a file
-- name without a path (example: `foo.otf`).
--
-- BASE and MARK must exist in FONT, and there must be an entry in
-- the 'mark' feature that pairs them. ACC must exist in FONT, too.
--
-- This function can be used repeatedly. Note, however, that a mark
-- glyph can only be part of a single anchor class. As a
-- consequence, a second call to this function with the same
-- argument ACC (for a particular font) that results in a different
-- anchor class overrides the result of the first call.
function add_to_mark_feature(font, acc, base, mark, x, y)
if not patch_functions[font] then
patch_functions[font] = {}
end
local function patch_function(fontdata)
local path = fontdata.specification.filename
local filename = file.basename(path)
local unicodes = fontdata.resources.unicodes
if not unicodes then
report("both", 0, "add_to_mark",
"error: 'unicodes' subtable missing;"
.. " cannot map glyph names to Unicode")
return
end
local uni_acc = glyph_to_unicode(acc, unicodes, filename)
local uni_base = glyph_to_unicode(base, unicodes, filename)
local uni_mark = glyph_to_unicode(mark, unicodes, filename)
if not (uni_acc and uni_base and uni_mark) then
return
end
local have_mark_feature = false
local base_glyph_in_coverage = false
local mark_glyph_in_coverage = false
for _, sequence in ipairs(fontdata.resources.sequences) do
if (sequence.type == "gpos_mark2base"
and sequence.features["mark"]) then
have_mark_feature = true
for _, step in ipairs(sequence.steps) do
local coverage = step.coverage
if coverage then
local coverage_mark = coverage[uni_mark]
if coverage_mark then
mark_glyph_in_coverage = true
local coverage_base = coverage_mark[1][uni_base]
if coverage_base then
base_glyph_in_coverage = true
for i, baseclass in ipairs(step.baseclasses) do
if baseclass[uni_base] then
report("log", 0, "add_to_mark",
"found base-mark glyph combination '%s+%s'"
.. " in 'mark' feature of font '%s'",
base, mark, filename)
-- Report OpenType value for anchor class, not
-- the one used in luatex.
report("log", 0, "add_to_mark",
"adding glyph '%s' to anchor class %d",
acc, i - 1)
coverage[uni_acc] = {baseclass, {x, y}}
return
end
end
end
end
end
end
end
end
if not have_mark_feature then
report("log", 0, "add_to_mark",
"no 'mark' feature in font '%s'",
filename)
else
if not base_glyph_in_coverage then
missing_base_glyph(base, filename)
end
if not mark_glyph_in_coverage then
missing_mark_glyph(mark, filename)
end
end
return
end
table.insert(patch_functions[font], patch_function)
luatexbase.add_to_callback(
"luaotfload.patch_font",
function(fontdata)
local path
Re: [luatex] searching a "marktobase" lookup example
>> I've come up with the attached solution, which works fine for the
>> small example. Not sure whether this is the correct way, though.
>
> A couple small comments: [...]
>
> Otherwise, it looks good to me.
Thanks a lot for the review! Attached is an improved version.
Werner
\documentclass{article}
\usepackage{luacode}
\usepackage{fontspec}
\begin{luacode*}
local report = luaotfload.log.report
local function glyph_to_unicode(glyph, unicodes, fontname)
local unicode = unicodes[glyph]
if not unicode then
report("both", 0, "add_to_mark",
"error: there is no glyph '%s' in font '%s'",
glyph, fontname)
end
return unicode
end
local function missing_base_glyph(base_glyph, fontname)
report("both", 0, "add_to_mark",
"error: base glyph '%s' not in 'mark' feature of font '%s'",
base_glyph, fontname)
end
local function missing_mark_glyph(mark_glyph, fontname)
report("both", 0, "add_to_mark",
"error: mark glyph '%s' not in 'mark' feature of font '%s'",
mark_glyph, fontname)
end
local patch_functions = {}
-- Add glyph with name ACC as a diacritic to an existing 'mark'
-- feature of font file FONT. The arguments BASE and MARK specify
-- names of glyphs that represent the desired anchor class that
-- connects base and mark glyphs, respectively, to which ACC should
-- be added. The coordinates X and Y give the position of the
-- anchor for ACC.
--
-- FONT should be the base name of an OpenType font, i.e., a file
-- name without a path (example: `foo.otf`).
--
-- BASE and MARK must exist in FONT, and there must be an entry in
-- the 'mark' feature that pairs them. ACC must exist in FONT, too.
--
-- This function can be used repeatedly. Note, however, that a mark
-- glyph can only be part of a single anchor class. As a
-- consequence, a second call to this function with the same
-- argument ACC (for a particular font) that results in a different
-- anchor class overrides the result of the first call.
function add_to_mark_feature(font, acc, base, mark, x, y)
if not patch_functions[font] then
patch_functions[font] = {}
end
local function patch_function(fontdata)
local path = fontdata.specification.filename
local filename = file.basename(path)
local unicodes = fontdata.resources.unicodes
if not unicodes then
report("both", 0, "add_to_mark",
"error: 'unicodes' subtable missing;"
.. " cannot map glyph names to Unicode")
return
end
local uni_acc = glyph_to_unicode(acc, unicodes, filename)
local uni_base = glyph_to_unicode(base, unicodes, filename)
local uni_mark = glyph_to_unicode(mark, unicodes, filename)
if not (uni_acc and uni_base and uni_mark) then
return
end
for _, sequence in ipairs(fontdata.resources.sequences) do
if (sequence.type == "gpos_mark2base"
and sequence.features["mark"]) then
for _, step in ipairs(sequence.steps) do
local coverage = step.coverage
if coverage then
local coverage_mark = coverage[uni_mark]
if coverage_mark then
local coverage_base = coverage_mark[1][uni_base]
if coverage_base then
for i, baseclass in ipairs(step.baseclasses) do
if baseclass[uni_base] then
report("log", 0, "add_to_mark",
"found base-mark glyph combination '%s+%s'"
.. " in 'mark' feature of font '%s'",
base, mark, filename)
-- Report OpenType value for anchor class, not
-- the one used in luatex.
report("log", 0, "add_to_mark",
"adding glyph '%s' to anchor class %d",
acc, i - 1)
coverage[uni_acc] = {baseclass, {x, y}}
return
end
end
else
missing_base_glyph(base, filename)
return
end
else
missing_mark_glyph(mark, filename)
return
end
end
end
end
end
report("log", 0, "add_to_mark",
"no 'mark' feature in font '%s'",
filename)
return
end
table.insert(patch_functions[font], patch_function)
luatexbase.add_to_callback(
"luaotfload.patch_font",
function(fontdata)
local path = fontdata.specification.filename
local filename = file.basename(path)
local patch_functions = patch_functions[filename]
if not patch_functions then
return
end
for _, v in pairs(patch_functions) do
v(fontda
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Mon, 2025-09-15 at 20:53 +, Werner LEMBERG wrote:
> I've come up with the attached solution, which works fine for the
> small example. Not sure whether this is the correct way, though.
A couple small comments:
1. You usually want "luacode*" instead of just "luacode", since only
   the "*" environment makes "\" have "other" catcodes.
2. You can freely change the function parameter names, so you can
   just do
       local function patch_function(fd)
   instead of
       local function patch_function(fontdata)
           local fd = fontdata
3. Lua lets you loop over a table directly, so instead of
       local i = 1
       while true do
           local seq = fd.resources.sequences[i]
           local st = seq.steps
           for j = 1, #st do
               local cov = st[j].coverage
               local bc = st[j].baseclasses
               for k = 1, #bc do
                   local bck = bc[k]
                   some_function(bck)
               end
           end
           i = i + 1
       end
   you can do
       for i, seq in ipairs(fd.resources.sequences) do
           for j, step in ipairs(seq.steps) do
               for k, bck in ipairs(step.baseclasses) do
                   some_function(bck)
               end
           end
       end
Otherwise, it looks good to me.
Thanks,
-- Max
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Sat, 2025-09-13 at 10:30 +, Werner LEMBERG wrote:
> > The current Lua font loading code essentially goes directly from the
> > binary font files to the "tfmdata" table. Specifically, mark-to-base
> > is handled by lines 1906--1908 of "fontloader-font-dsp.lua".
>
> Thanks again, very helpful. BTW, how could I debug this code? What's
> the right way to use, say, `debugger.lua` while loading a font?
Lua does have a "debug" module
https://www.lua.org/manual/5.3/manual.html#6.10
(that you can enable in LuaTeX with "--luadebug"), but I've never found
it helpful for debugging. Usually I just manually add "print(var)"
and/or "inspect(var)" calls to the source code. You can use
"debug.traceback()" without "--luadebug", so sometimes
"print(debug.traceback())" can be helpful. Another common trick to trace
a function is the following:
    -- Assuming that a function "some_function" exists
    do
        local saved = some_function
        function some_function(...)
            local out = { saved(...) }
            inspect {
                ["in"] = { ... },
                out = out,
            }
            return unpack(out)
        end
    end
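A variant of this wrapper that runs in a standalone Lua 5.2+ interpreter, where "inspect" is not defined, might look like the sketch below. The names `traced` and `log` and the body of `some_function` are invented for the example. It records calls in a table instead of printing them, and uses `table.pack` so that `nil` results are not lost when returning.

```lua
-- Hypothetical function to trace.
local function some_function(a, b)
  return a + b, a * b
end

-- Wrap a function so that every call records its arguments and results.
local function traced(fn, log)
  return function(...)
    local args = table.pack(...)       -- keeps the argument count in args.n
    local out = table.pack(fn(...))    -- same for the results
    log[#log + 1] = { ["in"] = args, out = out }
    return table.unpack(out, 1, out.n)
  end
end

local log = {}
some_function = traced(some_function, log)

local sum, product = some_function(3, 4)
print(sum, product, #log, log[1]["in"][1], log[1].out[2])
```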
The ConTeXt fonts manual ("texdoc fonts-mkiv") and the CLD manual
("texdoc cld") might also be useful; although they only claim to
describe ConTeXt, 90% of both manuals apply to LuaLaTeX as well.
Also, luaotfload only loads the "fontloader-2023-12-28.lua" file, but it
does this at runtime, so you shouldn't need to rebuild the formats for
testing. However, this file has all the comments and indentation
stripped, so it's fairly challenging to read. So it's best to read the
"$TEXMFDIST/tex/luatex/luaotfload/fontloader-font-*.lua" files to figure
out what's going on, but then apply any changes to
"fontloader-2023-12-28.lua" directly.
Also, depending on what you're doing, the caches might cause problems,
so if something isn't working right, delete
"$TEXMFVAR/luatex-cache/generic/".
Thanks,
-- Max
Re: [luatex] searching a "marktobase" lookup example
> [...] you should be safe to ignore anything with "rawdata" or
> "unscaled" in its name, which just leaves the
> "resources.sequences[i].steps[1].coverage" stuff.
OK, thanks.
> The current Lua font loading code essentially goes directly from the
> binary font files to the "tfmdata" table. Specifically, mark-to-base
> is handled by lines 1906--1908 of "fontloader-font-dsp.lua".
Thanks again, very helpful. BTW, how could I debug this code? What's
the right way to use, say, `debugger.lua` while loading a font?
Werner
Re: [luatex] searching a "marktobase" lookup example
Hello Hans,
> Also, one can - as I posted before - apply some positioning feature,
> so I don't see what use a mark related extension one would add even
> if I would do that just because it's easy.
I repeat: The change in the 'mark2base' table is just a few lines – I
have to add a single anchor position for the diacritic and to add this
diacritic to the lookup's coverage – six lines of code or so in total
in the `glyphs` table. In `tfmdata`, as shown in the diff attached to
a previous mail, this corresponds to 1670(!) lines of code. It is
this enormous discrepancy that irks me.
This is *not* a criticism of LuaTeX. I fully understand why `tfmdata`
is constructed the way it is. It's just frustrating that there isn't
a simple and elegant solution to get the desired effect. I started
with reading Paul Isambert's excellent TUGboat article, not being
aware that the described machinery (involving the `glyphs` table)
isn't (any longer?) actively used for loading fonts in luatex (and
LuaLaTeX in particular).
> Also, one can - as I posted before - apply some positioning feature,
You can do that for selected pairs, but to do that *in general*, as
the 'mark2base' feature does, you need many, many such pairs, which I
consider not elegant.
> so I don't see what use a mark related extension one would add even
> if I would do that just because it's easy.
It's only easy if you want to handle selected pairs.
> Maybe if someone asked on the context list I'd bother.
Well, I'm not using ConTeXt...
Anyway, I'll try to find a programmatic solution based on `tfmdata`:
* Reconstruct the `mark2base` coverage table.
* Reconstruct a table of top anchor points of all related base
  characters.
* Construct a proper entry for `tfmdata` based on those two tables.
Werner
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Wed, 2025-09-10 at 06:00 +, Werner LEMBERG wrote:
> > Can you try again but with "table.serialize" replaced with [...]
>
> Thanks! [It took me a while to find out that I have to use the
> `--luadebug` command-line option so that `debug.getinfo` is defined.]
Whoops, sorry about that.
> The diff is attached (compressed this time, calling `diff` on sorted
> input).
Ok, you should be safe to ignore anything with "rawdata" or "unscaled"
in its name, which just leaves the
"resources.sequences[i].steps[1].coverage" stuff.
> I think everything boils down to the question whether LuaTeX by
> default loads a font internally with
>
> ```
> f = fontloader.open("EBGaramond-Regular.otf")
> fonttable = fontloader.to_table(f)
> ```
No, nothing (that I'm aware of) uses the builtin "fontloader" library;
the current parser is 100% Lua.
> then converting `fonttable` to a `tfmdata` structure. I want to
> access `fonttable` before this step happens. How can I do that?
The current Lua font loading code essentially goes directly from the
binary font files to the "tfmdata" table. Specifically, mark-to-base is
handled by lines 1906--1908 of "fontloader-font-dsp.lua".
> > With the caveat that this depends on internal implementation
> > details, and is therefore unsupported and could change at any time,
> > "fonts.handlers.otf.readers.loadfont" is the earliest point that you
> > can modify the font data: [...]
>
> Thanks, but again, this code is manipulating `tfmdata` AFAICS, so
> there is no advantage w.r.t. compactness of the 'mark2base' lookup
> data.
If you look at the function that "fonts.handlers.otf.readers.loadfont"
calls internally ("loadfontdata", "fontloader-font-otr.lua" lines
2240--2306), you can see that the function is parsing the font data
directly ("readulong" and any of the "*cardinal*" functions are for
parsing binary data), so there's really nowhere earlier that you can
hook into.
> I don't think so. In `tfmdata`, the data from the 'mark2base' lookup
> is no longer represented in a compact form. Instead, it is expanded
> so that each accent glyph has all the necessary deltas for all base
> glyphs. While this speeds up the processing of the 'mark2base'
> lookup, it makes manipulation much more complicated.
Internally, the code directly adds the data in the expanded form by
looping over the characters. So since there aren't any helper functions,
you'll have to loop over all the characters yourself. Which I agree is
annoying, but even if Hans were to add a helper function for this, the
LaTeX team has stopped importing new font loader code, so you wouldn't
actually be able to use it.
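To make the "expanded form" concrete, here is a plain-Lua sketch of a single coverage entry shaped like the one in the dump diff from earlier in this thread: the entry for a mark holds a table of base-glyph anchors plus the mark's own anchor. The anchor values are taken from that diff and from the 'Anchor-0' data quoted before. The helper name `mark_offset`, and the assumption that the handler places the mark at "base anchor minus mark anchor", are illustrative only and have not been checked against the font loader source.

```lua
-- coverage[mark] = { base_anchors, mark_anchor }; 868 is U+0364 (uni0364).
local base_anchors = {
  [97]  = { 201, 440 },   -- a
  [111] = { 254, 440 },   -- o
  [117] = { 252, 440 },   -- u
}
local coverage = {}
coverage[868] = { base_anchors, { 115, 440 } }

-- Assumed semantics: shift the mark so that its anchor lands on the
-- base glyph's anchor. Returns nil if the pair is not covered.
local function mark_offset(coverage, mark, base)
  local entry = coverage[mark]
  local base_anchor = entry and entry[1][base]
  if not base_anchor then
    return nil
  end
  local mark_anchor = entry[2]
  return base_anchor[1] - mark_anchor[1], base_anchor[2] - mark_anchor[2]
end

for _, base in ipairs { 97, 111, 117 } do
  print(base, mark_offset(coverage, 868, base))
end
```

Because the base-anchor table is an ordinary Lua table, one entry can point at a table that already exists for another mark of the same anchor class instead of rebuilding it character by character.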
Thanks,
-- Max
Re: [luatex] searching a "marktobase" lookup example
On 9/10/2025 9:41 AM, Max Chernoff via luatex wrote:
> Internally, the code directly adds the data in the expanded form by
> looping over the characters. So since there aren't any helper
> functions, you'll have to loop over all the characters yourself.
> Which I agree is annoying, but even if Hans were to add a helper
> function for this, the LaTeX team has stopped importing new font
> loader code, so you wouldn't actually be able to use it.
I'm not sure what you mean with 'expanded' here but your last sentence
is a good reason for me to not waste time explaining what really
happens deep down and why one gets mentioned tables.
Also, one can - as I posted before - apply some positioning feature,
so I don't see what use a mark related extension one would add even
if I would do that just because it's easy.
Maybe if someone asked on the context list I'd bother.
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
Re: [luatex] searching a "marktobase" lookup example
> Thanks! However, how can I access the `glyphs` table via `tfmdata`?
> Or do I have to manipulate another structure? I looked into
> `tfmdata.resources.sequences[50].steps[1].coverage[868]`, but this
> is much more low-level and quite ugly to work with IMHO since I have
> to explicitly compute the offsets between the base glyph and the
> diacritic...
To give more details: what I would like to add, as mentioned earlier,
is this for glyph 'uni0364' in the `glyphs` table, which is the
complete difference between the original and patched 'EB Garamond'
font file.
```
anchors = {
mark = {
['Anchor-0'] = {
x = 115,
y = 440,
lig_index = 0,
}
}
}
class = mark
```
However, calling the following LaTeX file to dump the `tfmdata` table
```
\documentclass{article}
\usepackage{fontspec}
\directlua{
local patch_functions = {}
patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
local data = table.serialize(tfmdata)
io.savedata("ebgaramond.dump", data)
end
luatexbase.add_to_callback(
"luaotfload.patch_font",
function(tfmdata, specification, font_id)
local path = tfmdata.specification.filename
local filename = file.basename(path)
local patch_function = patch_functions[filename]
if not patch_function then
return
end
patch_function(tfmdata)
end,
"patch-fonts"
)
}
\setmainfont{EB Garamond}
\begin{document}
schoͤn
\end{document}
```
using the original and patched OpenType font produces the attached
difference – the tiny modification in the font's 'mark2base' lookup
explodes, causing four large, identical changes to `tfmdata`
subtables. In other words, it is unrealistic to modify `tfmdata`
directly, and I need to hook into LuaTeX's OpenType font handler one
step earlier, AFAICS. How can I do that?
Werner
--- ebgaramond.dump 2025-09-09 06:55:06.565403863 +0200
+++ ebgaramond.dump.fixed 2025-09-09 06:51:49.856353700 +0200
@@ -189845,6 +189845,848 @@
},
{ -63, 440 },
},
+ [868]={
+{
+ [71]={ 403, 675 },
+ [97]={ 201, 440 },
+ [99]={ 237, 440 },
+ [101]={ 222, 440 },
+ [103]={ 215, 440 },
+ [105]={ 124, 440 },
+ [106]={ 116, 440 },
+ [109]={ 392, 440 },
+ [110]={ 271, 440 },
+ [111]={ 254, 440 },
+ [112]={ 263, 440 },
+ [114]={ 191, 440 },
+ [115]={ 176, 440 },
+ [116]={ 136, 440 },
+ [117]={ 252, 440 },
+ [118]={ 218, 440 },
+ [119]={ 343, 440 },
+ [120]={ 237, 440 },
+ [121]={ 242, 440 },
+ [122]={ 201, 440 },
+ [224]={ 201, 440 },
+ [225]={ 201, 440 },
+ [226]={ 201, 440 },
+ [227]={ 201, 440 },
+ [228]={ 201, 440 },
+ [229]={ 201, 440 },
+ [230]={ 302, 440 },
+ [231]={ 237, 440 },
+ [232]={ 222, 440 },
+ [233]={ 222, 440 },
+ [234]={ 222, 440 },
+ [235]={ 222, 440 },
+ [236]={ 124, 440 },
+ [237]={ 124, 440 },
+ [238]={ 124, 440 },
+ [239]={ 124, 440 },
+ [241]={ 271, 440 },
+ [242]={ 254, 440 },
+ [243]={ 254, 440 },
+ [244]={ 254, 440 },
+ [245]={ 254, 440 },
+ [246]={ 254, 440 },
+ [248]={ 254, 440 },
+ [249]={ 252, 440 },
+ [250]={ 252, 440 },
+ [251]={ 252, 440 },
+ [252]={ 252, 440 },
+ [253]={ 242, 440 },
+ [255]={ 242, 440 },
+ [257]={ 201, 440 },
+ [259]={ 201, 440 },
+ [261]={ 201, 440 },
+ [263]={ 237, 440 },
+ [265]={ 237, 440 },
+ [267]={ 237, 440 },
+ [269]={ 237, 440 },
+ [275]={ 222, 440 },
+ [277]={ 222, 440 },
+ [279]={ 222, 440 },
+ [283]={ 222, 440 },
+ [284]={ 403, 675 },
+ [285]={ 215, 440 },
+ [286]={ 403, 675 },
+ [287]={ 215, 440 },
+ [288]={ 403, 675 },
+ [289]={ 215, 440 },
+ [290]={ 403, 675 },
+ [291]={ 215, 440 },
+ [297]={ 124, 440 },
+ [299]={ 124, 440 },
+ [301]={ 124, 440 },
+ [303]={ 124, 440 },
+ [305]={ 124, 440 },
+ [309]={ 116, 440 },
+ [324]={ 271, 440 },
+ [326]={ 271, 440 },
+ [328]={ 271, 440 },
+ [329]={ 352, 440 },
+ [333]={ 254, 440 },
+ [335]={ 254, 440 },
+ [337]={ 254, 440 },
+ [341]={ 191, 440 },
+ [343]={ 191, 440 },
+ [345]={ 191, 440 },
+ [347]={ 176, 440 },
+ [349]={ 176, 440 },
+ [351]={ 176, 440 },
+ [353]={ 176, 440 },
+ [355]={ 136, 440 },
+ [357]={ 136, 440 },
+ [359]={ 147, 440 },
+ [361]={ 252, 440 },
+ [363]={ 252, 440 },
+ [365]={ 252, 440 },
+ [367]={ 252, 440 },
+ [369]={ 252, 440 },
+ [371]={ 252, 440 },
+ [373]={ 343
Re: [luatex] searching a "marktobase" lookup example
Hi Werner,
On Mon, 2025-09-08 at 06:02 +, Werner LEMBERG wrote:
> It seems that I have to do a brute-force approach by directly
> modifying `fonttable`. My question is: How can I insert this data to
> the `fonttable` structure whenever the (unmodified)
> `EBGaramond-Regular.otf` gets loaded? Is there a hook for that?
Something like the following should work (untested):
    -- Per-font patch functions, keyed by the font's file name.
    local patch_functions = {}
    patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
        tfmdata.some_key = "some value"
    end
    -- The callback runs for every font that luaotfload loads;
    -- dispatch on the file name and skip fonts without a patch.
    luatexbase.add_to_callback(
        "luaotfload.patch_font",
        function(tfmdata, specification, font_id)
            local path = tfmdata.specification.filename
            local filename = file.basename(path)
            local patch_function = patch_functions[filename]
            if not patch_function then
                return
            end
            patch_function(tfmdata)
        end,
        "patch-fonts"
    )
Thanks,
-- Max
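Since the snippet above is marked as untested, the dispatch logic can be tried outside LuaTeX with mock tables. The following is a minimal sketch under stated assumptions: `basename` is a plain-Lua stand-in for `file.basename`, `patch_font` is a hypothetical wrapper around the callback body, and the `patched` field and the font paths are made up for illustration.

```lua
-- Plain-Lua stand-in for `file.basename`, so the sketch runs without LuaTeX.
local function basename(path)
  -- Strip everything up to and including the last '/' or '\'.
  return (path:gsub("^.*[/\\]", ""))
end

local patch_functions = {}

patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
  tfmdata.patched = true  -- hypothetical marker field, illustration only
end

-- Same logic as the callback body: look up a patch by file name and
-- apply it.  Returns true if a patch ran, false otherwise.
local function patch_font(tfmdata)
  local patch = patch_functions[basename(tfmdata.specification.filename)]
  if not patch then
    return false
  end
  patch(tfmdata)
  return true
end

-- Mock `tfmdata` tables with made-up paths.
local garamond = { specification = { filename = "/fonts/EBGaramond-Regular.otf" } }
local other    = { specification = { filename = "/fonts/Other-Regular.otf" } }

print(patch_font(garamond), garamond.patched)
print(patch_font(other), other.patched)
```

Only the first mock table is touched; the second falls through the early return, which mirrors what the callback does for fonts without an entry in `patch_functions`.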
Re: [luatex] searching a "marktobase" lookup example
> BTW, it's very easy to patch the original `EBGaramond-Regular.otf`
> file using the XML dump as produced by the `ttx` font
> compiler/decompiler: For the 'mark' lookup you add glyph 'uni0364'
> to its 'MarkCoverage' table (as the 28th entry counting from zero),
> then add an entry in the 'MarkArray' table:
>
> ```
>
>
>
>
>
>
>
> ```
>
> However, I consider this as a last-resort solution that I would like
> to avoid.
>
> So I ask again: Is there a solution to construct a tiny 'marktobase'
> feature with `fonts.handlers.otf.addfeature` (or something else)?
I had a closer look into the files of 'luaotfload', and my conclusion
is that there is no 'fonts.handlers.otf.addfeature' support for the
GPOS `mark2base` lookup type at all. Sigh.
With
```
f = fontloader.open("EBGaramond-Regular.otf")
fonttable = fontloader.to_table(f)
```
the OTF file modified as described contains the following data added
to `fonttable.glyphs[741]` (which is for glyph `uni0364`):
```
["anchors"] = {
  ["mark"] = {
    ["Anchor-0"] = {
      ["x"] = 115,
      ["y"] = 440,
      ["lig_index"] = 0 }
  }
},
["class"] = mark,
```
It seems that I have to do a brute-force approach by directly
modifying `fonttable`. My question is: How can I insert this data to
the `fonttable` structure whenever the (unmodified)
`EBGaramond-Regular.otf` gets loaded? Is there a hook for that?
Werner
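To make the intended modification concrete, here is a minimal plain-Lua sketch of inserting that data into a glyph entry. `add_mark_anchor` is a hypothetical helper, the glyph table is a mock rather than something read from the font, and storing the class as the string "mark" is an assumption (the dump prints it unquoted). Whether any hook exposes such a table at load time is exactly the open question above.

```lua
-- Hypothetical helper: attach a mark anchor to a glyph entry shaped like
-- the `fonttable.glyphs[...]` excerpt above, creating subtables as needed.
local function add_mark_anchor(glyph, anchor_name, x, y)
  glyph.anchors = glyph.anchors or {}
  glyph.anchors.mark = glyph.anchors.mark or {}
  glyph.anchors.mark[anchor_name] = { x = x, y = y, lig_index = 0 }
  glyph.class = "mark"  -- assumption: the class is stored as a string
  return glyph
end

-- Mock glyph entry standing in for the one of 'uni0364'.
local glyph = { name = "uni0364" }
add_mark_anchor(glyph, "Anchor-0", 115, 440)

local anchor = glyph.anchors.mark["Anchor-0"]
print(glyph.class, anchor.x, anchor.y, anchor.lig_index)
```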
Re: [luatex] searching a "marktobase" lookup example
> As a not-font-feature solution, I have found this [...] Thanks! Werner
Re: [luatex] searching a "marktobase" lookup example
On 9/1/2025 2:09 PM, Werner LEMBERG wrote:
>>> I have a problem with the EB Garamond family: it contains the glyph
>>> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
>>> neither an anchor point nor is it part of the font's "mark" table –
>>> in particular, I need support for the combination "oͤ", as used in
>>> old German texts.
>>>
>>> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
>>> Is there an example somewhere how to do that? I could only find
>>> samples for other, simpler GSUB and GPOS lookup types but nothing
>>> for "marktobase", which is needed here. In case there is
>>> documentation already available please give me a link.
>>
>> it should be helpful if you add small example (even if it does not
>> do what you need) to have a bit of context.
> Here it is, using current git of TeXLive. The attached images show
> the current and desired results.
> ```
> \documentclass{article}
> \usepackage{ebgaramond}
> \begin{document}
> gehoͤrt
> \end{document}
> ```
>> Perhaps the pair feature as in
>> https://articles.contextgarden.net/journal/2017/27-76.pdf
> I've seen this already, thanks, but the structure of the 'marktobase'
> feature is completely different.
attached is how i'd do it in context ...
\startluacode
fonts.handlers.otf.addfeature {
    name = "kern",
    type = "pair",
    data = {
        ["o"] = { [0x364] = { false, { -150, 0, 0, 0 } } },
    }
}
\stopluacode
\setupbodyfont[ebgaramond]
\startTEXpage[offset=1TS]
mswoͤrd
\stopTEXpage
... i have no clue if it works out in latex
Hans
-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
extensions-012.pdf
Description: Adobe PDF document
Re: [luatex] searching a "marktobase" lookup example
> attached is how i'd do it in context ... [...]
Thanks, this solution also came to my mind, and it certainly works, see
https://tex.stackexchange.com/questions/312154/how-to-adjust-font-features-in-luatex
which has quite a comprehensive description of some LuaTeX font
features.
However, I consider 'kern' not as a correct replacement for a
'marktobase' feature, since the latter makes U+0364 work with all base
characters defined in the 'mark' feature. In particular, AFAICS, your
solution fails for uppercase combinations like "Oͤ" because standard
kerning doesn't have vertical offsets.
BTW, it's very easy to patch the original `EBGaramond-Regular.otf`
file using the XML dump as produced by the `ttx` font
compiler/decompiler: For the 'mark' lookup you add glyph 'uni0364'
to its 'MarkCoverage' table (as the 28th entry counting from zero),
then add an entry in the 'MarkArray' table:
```
```
However, I consider this as a last-resort solution that I would like
to avoid.
So I ask again: Is there a solution to construct a tiny 'marktobase'
feature with `fonts.handlers.otf.addfeature` (or something else)?
Werner
Re: [luatex] searching a "marktobase" lookup example
On Mon, 1 Sept 2025 at 14:10, Werner LEMBERG wrote:
>
> >> I have a problem with the EB Garamond family: it contains the glyph
> >> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
> >> neither an anchor point nor is it part of the font's "mark" table –
> >> in particular, I need support for the combination "oͤ", as used in
> >> old German texts.
> >>
> >> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
> >> Is there an example somewhere how to do that? I could only find
> >> samples for other, simpler GSUB and GPOS lookup types but nothing
> >> for "marktobase", which is needed here. In case there is
> >> documentation already available please give me a link.
> >
> > it should be helpful if you add small example (even if it does not
> > do what you need) to have a bit of context.
>
> Here it is, using current git of TeXLive. The attached images show
> the current and desired results.
>
> ```
> \documentclass{article}
>
> \usepackage{ebgaramond}
>
> \begin{document}
> gehoͤrt
> \end{document}
> ```
>
> > Perhaps the pair feature as in
> > https://articles.contextgarden.net/journal/2017/27-76.pdf
>
> I've seen this already, thanks, but the structure of the 'marktobase'
> feature is completely different.
>
>
> Werner
>
As a not-font-feature solution, I have found this
https://tex.stackexchange.com/questions/694650/increase-size-of-superscript-letter-diacritics
(method=pdfstringdef; I guess that one needs to fix the macro \foo to
get the correct actualtext for gehoͤrt)
See
https://ctan.org/pkg/accsupp
and
https://latex3.github.io/tagging-project/tagging-status/
for the accsupp status.
--
luigi
Re: [luatex] searching a "marktobase" lookup example
>> I have a problem with the EB Garamond family: it contains the glyph
>> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
>> neither an anchor point nor is it part of the font's "mark" table –
>> in particular, I need support for the combination "oͤ", as used in
>> old German texts.
>>
>> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
>> Is there an example somewhere how to do that? I could only find
>> samples for other, simpler GSUB and GPOS lookup types but nothing
>> for "marktobase", which is needed here. In case there is
>> documentation already available please give me a link.
>
> it should be helpful if you add small example (even if it does not
> do what you need) to have a bit of context.
Here it is, using current git of TeXLive. The attached images show
the current and desired results.
```
\documentclass{article}
\usepackage{ebgaramond}
\begin{document}
gehoͤrt
\end{document}
```
> Perhaps the pair feature as in
> https://articles.contextgarden.net/journal/2017/27-76.pdf
I've seen this already, thanks, but the structure of the 'marktobase'
feature is completely different.
Werner
