Re: [luatex] searching a "marktobase" lookup example

2025-09-21 Thread Werner LEMBERG


> Looks good to me. One other tip is that you can fake a "continue"
> statement with "goto", which will let you reduce the amount of
> nested "if"s: [...]

Thanks, I will remember this for future code.  For my snippet, though,
I think the current solution is legible enough.

> Another suggestion is to use a more descriptive name for your
> "luaotfload.patch_font" callback, since "patch-fonts" isn't the most
> informative. "add-to-mark" would probably be a good name.

Yeah, I've missed that.  Will change locally.


Werner


Re: [luatex] searching a "marktobase" lookup example

2025-09-20 Thread Werner LEMBERG

> Lua does have a "debug" module
> 
> https://www.lua.org/manual/5.3/manual.html#6.10
> 
> (that you can enable in LuaTeX with "--luadebug"), but I've never
> found it helpful for debugging. Usually I just manually add
> "print(var)" and/or "inspect(var)" calls to the source code.

Yeah :-)

I've come up with the attached solution, which works fine for the
small example.  Not sure whether this is the correct way, though.


Werner
\documentclass{article}

\usepackage{luacode}
\usepackage{fontspec}

\begin{luacode}
  local report = luaotfload.log.report

  local patch_functions = {}

  local function glyph_to_unicode(glyph, unicodes, fontname)
    local unicode = unicodes[glyph]
    if not unicode then
      report("both", 0, "add_to_mark",
             "error: there is no glyph '%s' in font '%s'",
             glyph, fontname)
    end
    return unicode
  end

  local function missing_base_glyph(base_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: base glyph '%s' not in 'mark' feature of font '%s'",
           base_glyph, fontname)
  end

  local function missing_mark_glyph(mark_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: mark glyph '%s' not in 'mark' feature of font '%s'",
           mark_glyph, fontname)
  end


  -- Add glyph with name ACC as a diacritic to an existing 'mark'
  -- feature of font file FONT.  The arguments BASE and MARK specify
  -- names of glyphs that represent the desired anchor class that
  -- connects base and mark glyphs, respectively, to which ACC should
  -- be added.  The coordinates X and Y give the position of the
  -- anchor for ACC.
  --
  -- FONT should be the base name of an OpenType font, i.e., a file
  -- name without a path (example: `foo.otf`).
  --
  -- BASE and MARK must exist in FONT, and there must be an entry in
  -- the 'mark' feature that pairs them.  ACC must exist in FONT, too.
  --
  -- This function can be used repeatedly.  Note, however, that a mark
  -- glyph can only be part of a single anchor class.  As a
  -- consequence, a second call to this function with the same
  -- argument ACC that results in a different anchor class overrides
  -- the result of the first call.
  function add_to_mark_feature(font, acc, base, mark, x, y)
    if not patch_functions[font] then
      patch_functions[font] = {}
    end

    local function patch_function(fontdata)
      local fd = fontdata
      local path = fd.specification.filename
      local fn = file.basename(path)

      local uni = fd.resources.unicodes
      if not uni then
        report("both", 0, "add_to_mark",
               "error: 'unicodes' subtable missing;"
               .. " cannot map glyph names to Unicode")
        return
      end

      local u_acc = glyph_to_unicode(acc, uni, fn)
      local u_base = glyph_to_unicode(base, uni, fn)
      local u_mark = glyph_to_unicode(mark, uni, fn)
      if not (u_acc and u_base and u_mark) then
        return
      end

      local i = 1
      while true do
        local seq = fd.resources.sequences[i]
        if not seq then
          break
        elseif (seq.type == "gpos_mark2base"
                and seq.features["mark"]) then
          local st = seq.steps
          if st then
            for j = 1, #st do
              local cov = st[j].coverage
              if cov then
                local cov_mark = cov[u_mark]
                if cov_mark then
                  local cov_base = cov_mark[1][u_base]
                  if cov_base then
                    local bc = st[j].baseclasses
                    for k = 1, #bc do
                      local bck = bc[k]
                      if bck[u_base] then
                        report("log", 0, "add_to_mark",
                               "found base-mark glyph combination '%s+%s'"
                               .. " in 'mark' feature of font '%s'",
                               base, mark, fn)
                        -- Report OpenType value for anchor class, not
                        -- the one used in luatex.
                        report("log", 0, "add_to_mark",
                               "adding glyph '%s' to anchor class %d",
                               acc, k - 1)
                        cov[u_acc] = {bck, {x, y}}
                        return
                      end
                    end
                  else
                    missing_base_glyph(base, fn)
                    return
                  end
                else
                  missing_mark_glyph(mark, fn)
                  return
                end
              end
            end
          end
        end
        i = i + 1
      end

      report("log", 0, "add_to_mark",
             "no 'mark' feature in font '%s'",
             fn)
      return
    end

    table.insert(patch_functions[font], patch_function)

    luatexbase.add_to_callback(
      "luaotfload.patch_font",
      function(fontdata)
        local path = fontdata.specification.filename

Re: [luatex] searching a "marktobase" lookup example

2025-09-18 Thread Werner LEMBERG

> There's not enough context in the attached diff to actually see
> anything.

Oh, sorry, I thought it was clear that the diff is related to
`tfmdata.resources.sequences[50].steps[1].coverage[868]`.

> Can you try again but with "table.serialize" replaced with [...]

Thanks!  [It took me a while to find out that I have to use the
`--luadebug` command-line option so that `debug.getinfo` is defined.]

The diff is attached (compressed this time, calling `diff` on sorted
input).

>> the tiny modification in the font's 'mark2base' lookup explodes,
>> causing four large, identical changes to `tfmdata` subtables.
> 
> Lua tables are reference types, so those 4 subtables may actually be
> the same table, meaning that a single modification might change all
> of them.

Yes, it looks like that.

> Also, most of the subtables are unused, so you probably only need to
> change a single value anyways.

I don't think so.  In `tfmdata`, the data from the 'mark2base' lookup
is no longer represented in a compact form.  Instead, it is expanded
so that each accent glyph has all the necessary deltas for all base
glyphs.  While this speeds up the processing of the 'mark2base'
lookup, it makes manipulation much more complicated.  Of course I
could restrict the necessary changes of the diacritic for combinations
with base glyphs 'o' and 'u', say, omitting all other combinations,
but this is missing the point of what I want to do – for such a
restricted solution I could do what Hans has suggested, namely to
construct a simple 'kern' feature.

I think everything boils down to the question whether LuaTeX by
default loads a font internally with

```
f = fontloader.open("EBGaramond-Regular.otf")
fonttable = fontloader.to_table(f)
```

then converting `fonttable` to a `tfmdata` structure.  I want to
access `fonttable` before this step happens.  How can I do that?

>> In other words, it is unrealistic to modify `tfmdata` directly, and
>> I need to hook into LuaTeX's OpenType font handler one step
>> earlier, AFAICS.  How can I do that?
> 
> With the caveat that this depends on internal implementation
> details, and is therefore unsupported and could change at any time,
> "fonts.handlers.otf.readers.loadfont" is the earliest point that you
> can modify the font data: [...]

Thanks, but again, this code is manipulating `tfmdata` AFAICS, so
there is no advantage w.r.t. compactness of the 'mark2base' lookup
data.


Werner


ebgaramond.dump.diff.xz
Description: Binary data


Re: [luatex] searching a "marktobase" lookup example

2025-09-17 Thread Werner LEMBERG


>> With
>> 
>> ```
>> f = fontloader.open("EBGaramond-Regular.otf")
>> fonttable = fontloader.to_table(f)
>> ```
>> 
>> the OTF file modified as described contains the following data
>> added to `fonttable.glyphs[741]` (which is for glyph `uni0364`):
>> 
>> ```
>> ["anchors"] = {
>>   ["mark"] = {
>>     ["Anchor-0"] = {
>>       ["x"] = 115,
>>       ["y"] = 440,
>>       ["lig_index"] = 0 }
>>   }
>> },
>> ["class"] = mark,
>>
>> It seems that I have to do a brute-force approach by directly
>> modifying `fonttable`.  My question is: How can I insert this data
>> to the `fonttable` structure whenever the (unmodified)
>> `EBGaramond-Regular.otf` gets loaded?  Is there a hook for that?
> 
> Something like the following should work (untested):
> 
>     local patch_functions = {}
>
>     patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
>         tfmdata.some_key = "some value"
>     end
>
>     luatexbase.add_to_callback(
>         "luaotfload.patch_font",
>         function(tfmdata, specification, font_id)
>             local path = tfmdata.specification.filename
>             local filename = file.basename(path)
>             local patch_function = patch_functions[filename]
>
>             if not patch_function then
>                 return
>             end
>             patch_function(tfmdata)
>         end,
>         "patch-fonts"
>     )

Thanks!  However, how can I access the `glyphs` table via `tfmdata`?
Or do I have to manipulate another structure?  I looked into
`tfmdata.resources.sequences[50].steps[1].coverage[868]`, but this is
much more low-level and quite ugly to work with IMHO since I have to
explicitly compute the offsets between the base glyph and the
diacritic...


Werner


Re: [luatex] searching a "marktobase" lookup example

2025-09-17 Thread Max Chernoff via luatex
Hi Werner,

On Tue, 2025-09-09 at 05:16 +, Werner LEMBERG wrote:
> using the original and patched OpenType font produces the attached
> difference –

There's not enough context in the attached diff to actually see
anything. Can you try again but with "table.serialize" replaced with
"dot_inspect":

    function dot_inspect(value, parents)
        if not parents then
            parents = {}
        end
        local literal = false

        if type(value) == "table" then
            local found = false
            for k, v in pairs(value) do
                local k = k
                found = true
                if type(k) ~= "string" then
                    k = "[" .. tostring(k) .. "]"
                end
                dot_inspect(v, table.imerged(parents, {k}))
            end
            if found then
                return
            else
                value = "(empty table)"
                literal = true
            end
        elseif type(value) == "function" then
            local info = debug.getinfo(value)
            value = info.source .. ":" .. info.linedefined
            literal = true
        end

        if literal or (type(value) == "userdata") then
            value = tostring(value)
        else
            value = string.format("%q", value)
        end

        value = table.concat(parents, ".") .. " = " .. value
        print(value)
    end

> the tiny modification in the font's 'mark2base' lookup
> explodes, causing four large, identical changes to `tfmdata`
> subtables.

Lua tables are reference types, so those 4 subtables may actually be the
same table, meaning that a single modification might change all of them.
This is just a guess though; I may be wrong here.
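
A minimal plain-Lua sketch of the aliasing effect described here. The table names and values are made up for illustration and are not actual `tfmdata` fields:

```lua
-- Hypothetical data; these are not real tfmdata fields.
local shared = { x = 201, y = 440 }
local subtables = { shared, shared, shared, shared }

-- A single write through one reference ...
subtables[1].x = 115

-- ... is visible through every name that refers to the same table.
for i, t in ipairs(subtables) do
    print(i, t.x, rawequal(t, shared))
end

-- A field-by-field copy, by contrast, is independent of the original.
local copy = { x = shared.x, y = shared.y }
copy.x = 0
print(shared.x)
```

If the four subtables in the dump are one table reached by four paths, a serialized dump would still show the change four times.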

Also, most of the subtables are unused, so you probably only need to
change a single value anyways.

> In other words, it is unrealistic to modify `tfmdata`
> directly, and I need to hook into LuaTeX's OpenType font handler one
> step earlier, AFAICS.  How can I do that?

With the caveat that this depends on internal implementation details,
and is therefore unsupported and could change at any time,
"fonts.handlers.otf.readers.loadfont" is the earliest point that you can
modify the font data:

    \input{luaotfload.sty}

    \directlua{
        --[[ Force the font to be reloaded every time; only needed for testing. ]]
        do
            local saved = luaotfload.fontloader.containers.read
            function luaotfload.fontloader.containers.read(...)
                local tfmdata = saved(...)
                if  tfmdata and
                    file.basename(tfmdata.resources.filename) == "EBGaramond-Regular.otf"
                then
                    tfmdata.tableversion = -1
                end
                return tfmdata
            end
        end
        do
            local saved = fonts.handlers.otf.readers.loadfont
            function fonts.handlers.otf.readers.loadfont(...)
                local tfmdata = saved(...)
                --[[ Do something with the font data here. ]]
                inspect {
                    ["in"] = { ... },
                    ["out"] = tfmdata,
                }
                return tfmdata
            end
        end
    }

    \font\ebgaramond={file:EBGaramond-Regular.otf} at 12pt

    \ebgaramond Hello, world!

    \bye

The table is actually quite a bit larger and more complicated than the
one in "luaotfload.patch_font", so this probably isn't very useful for
you.

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-17 Thread Max Chernoff via luatex
Hi Werner,

On Tue, 2025-09-16 at 08:02 +, Werner LEMBERG wrote:
> > Attached is an improved version.
>
> As usual: right after sending an improved version I find a problem :-)
>
> The old version only checked the first lookup of a 'mark' feature;
> this is now fixed.

Looks good to me. One other tip is that you can fake a "continue"
statement with "goto", which will let you reduce the amount of nested
"if"s:

    for _, step in ipairs(sequence.steps) do
        local coverage = step.coverage
        if not coverage then
            goto continue
        end

        local coverage_mark = coverage[uni_mark]
        if not coverage_mark then
            goto continue
        end

        mark_glyph_in_coverage = true
        local coverage_base = coverage_mark[1][uni_base]
        if not coverage_base then
            goto continue
        end

        base_glyph_in_coverage = true
        for i, baseclass in ipairs(step.baseclasses) do
            if baseclass[uni_base] then
                report("log", 0, "add_to_mark",
                    "found base-mark glyph combination '%s+%s'"
                    .. " in 'mark' feature of font '%s'",
                    base, mark, filename)
                -- Report OpenType value for anchor class, not
                -- the one used in luatex.
                report("log", 0, "add_to_mark",
                    "adding glyph '%s' to anchor class %d",
                    acc, i - 1)
                coverage[uni_acc] = {baseclass, {x, y}}
                return
            end
        end
        ::continue::
    end
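
The same pattern in a self-contained form that runs in any Lua 5.2 or later interpreter. The loop and variable names are invented for this toy example. The label has to be the last statement of the loop body, which is what allows the jump past the `local` declaration:

```lua
-- Collect the squares of the odd numbers from 1 to 6, skipping the
-- even ones with a faked "continue".
local squares = {}
for i = 1, 6 do
    if i % 2 == 0 then
        goto continue
    end
    local square = i * i
    squares[#squares + 1] = square
    ::continue::
end
print(table.concat(squares, " "))  -- prints "1 9 25"
```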

Another suggestion is to use a more descriptive name for your
"luaotfload.patch_font" callback, since "patch-fonts" isn't the most
informative. "add-to-mark" would probably be a good name.

But otherwise, this looks great.

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-16 Thread Werner LEMBERG

> Attached is an improved version.

As usual: right after sending an improved version I find a problem :-)

The old version only checked the first lookup of a 'mark' feature;
this is now fixed.


Werner
% version 2025-Sep-16

\documentclass{article}

\usepackage{luacode}
\usepackage{fontspec}

\begin{luacode*}
  local report = luaotfload.log.report

  local function glyph_to_unicode(glyph, unicodes, fontname)
    local unicode = unicodes[glyph]
    if not unicode then
      report("both", 0, "add_to_mark",
             "error: there is no glyph '%s' in font '%s'",
             glyph, fontname)
    end
    return unicode
  end

  local function missing_base_glyph(base_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: base glyph '%s' not in 'mark' feature of font '%s'",
           base_glyph, fontname)
  end

  local function missing_mark_glyph(mark_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: mark glyph '%s' not in 'mark' feature of font '%s'",
           mark_glyph, fontname)
  end

  local patch_functions = {}

  -- Add glyph with name ACC as a diacritic to an existing 'mark'
  -- feature of font file FONT.  The arguments BASE and MARK specify
  -- names of glyphs that represent the desired anchor class that
  -- connects base and mark glyphs, respectively, to which ACC should
  -- be added.  The coordinates X and Y give the position of the
  -- anchor for ACC.
  --
  -- FONT should be the base name of an OpenType font, i.e., a file
  -- name without a path (example: `foo.otf`).
  --
  -- BASE and MARK must exist in FONT, and there must be an entry in
  -- the 'mark' feature that pairs them.  ACC must exist in FONT, too.
  --
  -- This function can be used repeatedly.  Note, however, that a mark
  -- glyph can only be part of a single anchor class.  As a
  -- consequence, a second call to this function with the same
  -- argument ACC (for a particular font) that results in a different
  -- anchor class overrides the result of the first call.
  function add_to_mark_feature(font, acc, base, mark, x, y)
    if not patch_functions[font] then
      patch_functions[font] = {}
    end

    local function patch_function(fontdata)
      local path = fontdata.specification.filename
      local filename = file.basename(path)

      local unicodes = fontdata.resources.unicodes
      if not unicodes then
        report("both", 0, "add_to_mark",
               "error: 'unicodes' subtable missing;"
               .. " cannot map glyph names to Unicode")
        return
      end

      local uni_acc = glyph_to_unicode(acc, unicodes, filename)
      local uni_base = glyph_to_unicode(base, unicodes, filename)
      local uni_mark = glyph_to_unicode(mark, unicodes, filename)
      if not (uni_acc and uni_base and uni_mark) then
        return
      end

      local have_mark_feature = false
      local base_glyph_in_coverage = false
      local mark_glyph_in_coverage = false

      for _, sequence in ipairs(fontdata.resources.sequences) do
        if (sequence.type == "gpos_mark2base"
            and sequence.features["mark"]) then
          have_mark_feature = true
          for _, step in ipairs(sequence.steps) do
            local coverage = step.coverage
            if coverage then
              local coverage_mark = coverage[uni_mark]
              if coverage_mark then
                mark_glyph_in_coverage = true
                local coverage_base = coverage_mark[1][uni_base]
                if coverage_base then
                  base_glyph_in_coverage = true
                  for i, baseclass in ipairs(step.baseclasses) do
                    if baseclass[uni_base] then
                      report("log", 0, "add_to_mark",
                             "found base-mark glyph combination '%s+%s'"
                             .. " in 'mark' feature of font '%s'",
                             base, mark, filename)
                      -- Report OpenType value for anchor class, not
                      -- the one used in luatex.
                      report("log", 0, "add_to_mark",
                             "adding glyph '%s' to anchor class %d",
                             acc, i - 1)
                      coverage[uni_acc] = {baseclass, {x, y}}
                      return
                    end
                  end
                end
              end
            end
          end
        end
      end

      if not have_mark_feature then
        report("log", 0, "add_to_mark",
               "no 'mark' feature in font '%s'",
               filename)
      else
        if not base_glyph_in_coverage then
          missing_base_glyph(base, filename)
        end
        if not mark_glyph_in_coverage then
          missing_mark_glyph(mark, filename)
        end
      end

      return
    end

    table.insert(patch_functions[font], patch_function)

    luatexbase.add_to_callback(
      "luaotfload.patch_font",
      function(fontdata)
        local path

Re: [luatex] searching a "marktobase" lookup example

2025-09-15 Thread Werner LEMBERG

>> I've come up with the attached solution, which works fine for the
>> small example.  Not sure whether this is the correct way, though.
> 
> A couple small comments: [...]
>
> Otherwise, it looks good to me.

Thanks a lot for the review!  Attached is an improved version.


 Werner
\documentclass{article}

\usepackage{luacode}
\usepackage{fontspec}

\begin{luacode*}
  local report = luaotfload.log.report

  local function glyph_to_unicode(glyph, unicodes, fontname)
    local unicode = unicodes[glyph]
    if not unicode then
      report("both", 0, "add_to_mark",
             "error: there is no glyph '%s' in font '%s'",
             glyph, fontname)
    end
    return unicode
  end

  local function missing_base_glyph(base_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: base glyph '%s' not in 'mark' feature of font '%s'",
           base_glyph, fontname)
  end

  local function missing_mark_glyph(mark_glyph, fontname)
    report("both", 0, "add_to_mark",
           "error: mark glyph '%s' not in 'mark' feature of font '%s'",
           mark_glyph, fontname)
  end

  local patch_functions = {}

  -- Add glyph with name ACC as a diacritic to an existing 'mark'
  -- feature of font file FONT.  The arguments BASE and MARK specify
  -- names of glyphs that represent the desired anchor class that
  -- connects base and mark glyphs, respectively, to which ACC should
  -- be added.  The coordinates X and Y give the position of the
  -- anchor for ACC.
  --
  -- FONT should be the base name of an OpenType font, i.e., a file
  -- name without a path (example: `foo.otf`).
  --
  -- BASE and MARK must exist in FONT, and there must be an entry in
  -- the 'mark' feature that pairs them.  ACC must exist in FONT, too.
  --
  -- This function can be used repeatedly.  Note, however, that a mark
  -- glyph can only be part of a single anchor class.  As a
  -- consequence, a second call to this function with the same
  -- argument ACC (for a particular font) that results in a different
  -- anchor class overrides the result of the first call.
  function add_to_mark_feature(font, acc, base, mark, x, y)
    if not patch_functions[font] then
      patch_functions[font] = {}
    end

    local function patch_function(fontdata)
      local path = fontdata.specification.filename
      local filename = file.basename(path)

      local unicodes = fontdata.resources.unicodes
      if not unicodes then
        report("both", 0, "add_to_mark",
               "error: 'unicodes' subtable missing;"
               .. " cannot map glyph names to Unicode")
        return
      end

      local uni_acc = glyph_to_unicode(acc, unicodes, filename)
      local uni_base = glyph_to_unicode(base, unicodes, filename)
      local uni_mark = glyph_to_unicode(mark, unicodes, filename)
      if not (uni_acc and uni_base and uni_mark) then
        return
      end

      for _, sequence in ipairs(fontdata.resources.sequences) do
        if (sequence.type == "gpos_mark2base"
            and sequence.features["mark"]) then
          for _, step in ipairs(sequence.steps) do
            local coverage = step.coverage
            if coverage then
              local coverage_mark = coverage[uni_mark]
              if coverage_mark then
                local coverage_base = coverage_mark[1][uni_base]
                if coverage_base then
                  for i, baseclass in ipairs(step.baseclasses) do
                    if baseclass[uni_base] then
                      report("log", 0, "add_to_mark",
                             "found base-mark glyph combination '%s+%s'"
                             .. " in 'mark' feature of font '%s'",
                             base, mark, filename)
                      -- Report OpenType value for anchor class, not
                      -- the one used in luatex.
                      report("log", 0, "add_to_mark",
                             "adding glyph '%s' to anchor class %d",
                             acc, i - 1)
                      coverage[uni_acc] = {baseclass, {x, y}}
                      return
                    end
                  end
                else
                  missing_base_glyph(base, filename)
                  return
                end
              else
                missing_mark_glyph(mark, filename)
                return
              end
            end
          end
        end
      end

      report("log", 0, "add_to_mark",
             "no 'mark' feature in font '%s'",
             filename)
      return
    end

    table.insert(patch_functions[font], patch_function)

    luatexbase.add_to_callback(
      "luaotfload.patch_font",
      function(fontdata)
        local path = fontdata.specification.filename
        local filename = file.basename(path)

        local patch_functions = patch_functions[filename]
        if not patch_functions then
          return
        end

        for _, v in pairs(patch_functions) do
          v(fontda

Re: [luatex] searching a "marktobase" lookup example

2025-09-15 Thread Max Chernoff via luatex
Hi Werner,

On Mon, 2025-09-15 at 20:53 +, Werner LEMBERG wrote:
> I've come up with the attached solution, which works fine for the
> small example.  Not sure whether this is the correct way, though.

A couple small comments:

1.  You usually want "luacode*" instead of just "luacode", since only the
"*" environment makes "\" have "other" catcodes.

2.  You can freely change the function parameter names, so you can just
do

    local function patch_function(fd)

instead of

    local function patch_function(fontdata)
        local fd = fontdata

3.  Lua lets you loop over a table directly, so instead of

    local i = 1
    while true do
        local seq = fd.resources.sequences[i]
        local st = seq.steps
        for j = 1, #st do
            local cov = st[j].coverage
            local bc = st[j].baseclasses
            for k = 1, #bc do
                local bck = bc[k]
                some_function(bck)
            end
        end
        i = i + 1
    end

you can do

    for i, seq in ipairs(fd.resources.sequences) do
        for j, step in ipairs(seq.steps) do
            for k, bck in ipairs(step.baseclasses) do
                some_function(bck)
            end
        end
    end
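
A runnable sketch of the `ipairs` form on mock data. The table `fd` below only imitates the nesting of the loops above, and its contents are invented. `ipairs` stops at the first `nil` entry, which is what replaces the explicit `if not seq then break` test of the `while` loop:

```lua
-- Mock data with the same nesting as the loops above; the values are
-- placeholders, not real font data.
local fd = {
    resources = {
        sequences = {
            { steps = { { baseclasses = { "a", "b" } } } },
            { steps = { { baseclasses = { "c" } }, { baseclasses = { "d" } } } },
        },
    },
}

local seen = {}
local function some_function(bck)
    seen[#seen + 1] = bck
end

for _, seq in ipairs(fd.resources.sequences) do
    for _, step in ipairs(seq.steps) do
        for _, bck in ipairs(step.baseclasses) do
            some_function(bck)
        end
    end
end

print(table.concat(seen, " "))  -- prints "a b c d"
```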

Otherwise, it looks good to me.

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-13 Thread Max Chernoff via luatex
Hi Werner,

On Sat, 2025-09-13 at 10:30 +, Werner LEMBERG wrote:
> > The current Lua font loading code essentially goes directly from the
> > binary font files to the "tfmdata" table. Specifically, mark-to-base
> > is handled by lines 1906--1908 of "fontloader-font-dsp.lua".
>
> Thanks again, very helpful.  BTW, how could I debug this code?  What's
> the right way to use, say, `debugger.lua` while loading a font?

Lua does have a "debug" module

https://www.lua.org/manual/5.3/manual.html#6.10

(that you can enable in LuaTeX with "--luadebug"), but I've never found
it helpful for debugging. Usually I just manually add "print(var)"
and/or "inspect(var)" calls to the source code. You can use
"debug.traceback()" without "--luadebug", so sometimes
"print(debug.traceback())" can be helpful. Another common trick to trace
a function is the following:

    -- Assuming that a function "some_function" exists
    do
        local saved = some_function
        function some_function(...)
            local out = { saved(...) }
            inspect {
                ["in"] = { ... },
                out = out,
            }
            return unpack(out)
        end
    end
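
To try the trick outside LuaTeX, the following sketch wraps a toy function. Both the toy `some_function` and the tiny `inspect` stand-in are hypothetical; the stand-in only records argument and result counts, whereas the real `inspect` prints the whole table:

```lua
-- Stand-ins so that the sketch runs in a plain Lua interpreter.
local unpack = table.unpack or unpack
local log = {}
local function inspect(t)
    log[#log + 1] = ("in=%d out=%d"):format(#t["in"], #t.out)
end

-- A toy function to trace.
function some_function(a, b)
    return a + b, a * b
end

-- The wrapper from above: call the saved original, record its
-- arguments and results, then pass the results through unchanged.
do
    local saved = some_function
    function some_function(...)
        local out = { saved(...) }
        inspect {
            ["in"] = { ... },
            out = out,
        }
        return unpack(out)
    end
end

local sum, product = some_function(3, 4)
print(sum, product, log[1])
```

One caveat of this pattern is that collecting results with `{ saved(...) }` drops trailing `nil` return values.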

The ConTeXt fonts manual ("texdoc fonts-mkiv") and the CLD manual
("texdoc cld") might also be useful; although they only claim to
describe ConTeXt, 90% of both manuals apply to LuaLaTeX as well.

Also, luaotfload only loads the "fontloader-2023-12-28.lua" file, but it
does this at runtime, so you shouldn't need to rebuild the formats for
testing. However, this file has all the comments and indentation
stripped, so it's fairly challenging to read. So it's best to read the
"$TEXMFDIST/tex/luatex/luaotfload/fontloader-font-*.lua" files to figure
out what's going on, but then apply any changes to
"fontloader-2023-12-28.lua" directly.

Also, depending on what you're doing, the caches might cause problems,
so if something isn't working right, delete
"$TEXMFVAR/luatex-cache/generic/".

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-13 Thread Werner LEMBERG


> [...] you should be safe to ignore anything with "rawdata" or
> "unscaled" in its name, which just leaves the
> "resources.sequences[i].steps[1].coverage" stuff.

OK, thanks.

> The current Lua font loading code essentially goes directly from the
> binary font files to the "tfmdata" table. Specifically, mark-to-base
> is handled by lines 1906--1908 of "fontloader-font-dsp.lua".

Thanks again, very helpful.  BTW, how could I debug this code?  What's
the right way to use, say, `debugger.lua` while loading a font?


Werner


Re: [luatex] searching a "marktobase" lookup example

2025-09-11 Thread Werner LEMBERG


Hello Hans,


> Also, one can - as I posted before - apply some positioning feature,
> so I don't see what use a mark related extension one would add even
> if I would do that just because it's easy.

I repeat: The change in the 'mark2base' table is just a few lines – I
have to add a single anchor position for the diacritic and to add this
diacritic to the lookup's coverage – six lines of code or so in total
in the `glyphs` table.  In `tfmdata`, as shown in the diff attached to
a previous mail, this corresponds to 1670(!) lines of code.  It is
this enormous discrepancy that irks me.

This is *not* a criticism of LuaTeX.  I fully understand why `tfmdata`
is constructed the way it is.  It's just frustrating that there isn't
a simple and elegant solution to get the desired effect.  I started
with reading Paul Isambert's excellent TUGboat article, not being
aware that the described machinery (involving the `glyphs` table)
isn't (any longer?) actively used for loading fonts in luatex (and
LuaLaTeX in particular).

> Also, one can - as I posted before - apply some positioning feature,

You can do that for selected pairs, but to do that *in general*, as
the 'mark2base' feature does, you need many, many such pairs, which I
consider not elegant.

> so I don't see what use a mark related extension one would add even
> if I would do that just because it's easy.

It's only easy if you want to handle selected pairs.

> Maybe if someone asked on the context list I'd bother.

Well, I'm not using ConTeXt...

Anyway, I'll try to find a programmatic solution based on `tfmdata`:

* Reconstruct the `mark2base` coverage table.
* Reconstruct a table of top anchor points of all related base
  characters.
* Construct a proper entry for `tfmdata` based on those two tables.
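
A rough sketch of the last step on mock data, assuming the `coverage`/`baseclasses` layout that appears in the attachments elsewhere in this thread. The helper name `add_mark` is hypothetical, the anchor of the existing mark is a placeholder, and reusing 115/440 as the new mark's anchor is only illustrative:

```lua
-- Mock of one step of a 'gpos_mark2base' sequence.  Keys are Unicode
-- code points; each coverage entry is { baseclass, { x, y } }.
local class0 = { [0x6F] = { 254, 440 }, [0x75] = { 252, 440 } }
local step = {
    baseclasses = { class0 },
    coverage = {
        [0x0308] = { class0, { -63, 440 } },  -- existing mark (placeholder)
    },
}

-- Hypothetical helper: give ACC the anchor class that pairs MARK and
-- BASE, with (X, Y) as the anchor of ACC.
local function add_mark(step, acc, mark, base, x, y)
    local entry = step.coverage[mark]
    if not (entry and entry[1][base]) then
        return false
    end
    for _, baseclass in ipairs(step.baseclasses) do
        if baseclass[base] then
            -- Share the base class table by reference instead of
            -- copying every base glyph anchor.
            step.coverage[acc] = { baseclass, { x, y } }
            return true
        end
    end
    return false
end

local ok = add_mark(step, 0x0364, 0x0308, 0x6F, 115, 440)
print(ok, step.coverage[0x0364][1] == class0)
```

Because the base class table is shared by reference, the new entry stays small even though a serialized dump repeats all base glyph anchors.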


Werner



Re: [luatex] searching a "marktobase" lookup example

2025-09-11 Thread Max Chernoff via luatex
Hi Werner,

On Wed, 2025-09-10 at 06:00 +, Werner LEMBERG wrote:
> > Can you try again but with "table.serialize" replaced with [...]
>
> Thanks!  [It took me a while to find out that I have to use the
> `--luadebug` command-line option so that `debug.getinfo` is defined.]

Whoops, sorry about that.

> The diff is attached (compressed this time, calling `diff` on sorted
> input).

Ok, you should be safe to ignore anything with "rawdata" or "unscaled"
in its name, which just leaves the
"resources.sequences[i].steps[1].coverage" stuff.

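One way to apply this filter to a line-oriented dump is sketched below. The sample lines are invented and only imitate the `key.path = value` shape of the output; they are not taken from the real diff:

```lua
-- Hypothetical sample lines in "key.path = value" form.
local lines = {
    'shared.rawdata.resources.sequences.[50].type = "gpos_mark2base"',
    'resources.sequences.[50].steps.[1].coverage.[868].[2].[1] = 115',
    'unscaled.parameters.units = 1000',
}

-- Keep a line only if it mentions neither "rawdata" nor "unscaled"
-- (plain substring search, no patterns).
local function relevant(line)
    return not (line:find("rawdata", 1, true)
                or line:find("unscaled", 1, true))
end

local kept = {}
for _, line in ipairs(lines) do
    if relevant(line) then
        kept[#kept + 1] = line
    end
end

print(#kept, kept[1])
```
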
> I think everything boils down to the question whether LuaTeX by
> default loads a font internally with
>
> ```
> f = fontloader.open("EBGaramond-Regular.otf")
> fonttable = fontloader.to_table(f)
> ```

No, nothing (that I'm aware of) uses the builtin "fontloader" library;
the current parser is 100% Lua.

> then converting `fonttable` to a `tfmdata` structure.  I want to
> access `fonttable` before this step happens.  How can I do that?

The current Lua font loading code essentially goes directly from the
binary font files to the "tfmdata" table. Specifically, mark-to-base is
handled by lines 1906--1908 of "fontloader-font-dsp.lua".

> > With the caveat that this depends on internal implementation
> > details, and is therefore unsupported and could change at any time,
> > "fonts.handlers.otf.readers.loadfont" is the earliest point that you
> > can modify the font data: [...]
>
> Thanks, but again, this code is manipulating `tfmdata` AFAICS, so
> there is no advantage w.r.t. compactness of the 'mark2base' lookup
> data.

If you look at the function that "fonts.handlers.otf.readers.loadfont"
calls internally ("loadfontdata", "fontloader-font-otr.lua" lines
2240--2306), you can see that the function is parsing the font data
directly ("readulong" and any of the "*cardinal*" functions are for
parsing binary data), so there's really nowhere earlier that you can
hook into.

> I don't think so.  In `tfmdata`, the data from the 'mark2base' lookup
> is no longer represented in a compact form.  Instead, it is expanded
> so that each accent glyph has all the necessary deltas for all base
> glyphs.  While this speeds up the processing of the 'mark2base'
> lookup, it makes manipulation much more complicated.

Internally, the code directly adds the data in the expanded form by
looping over the characters. So since there aren't any helper functions,
you'll have to loop over all the characters yourself. Which I agree is
annoying, but even if Hans were to add a helper function for this, the
LaTeX team has stopped importing new font loader code, so you wouldn't
actually be able to use it.

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-10 Thread Hans Hagen via luatex

On 9/10/2025 9:41 AM, Max Chernoff via luatex wrote:


> Internally, the code directly adds the data in the expanded form by
> looping over the characters. So since there aren't any helper functions,
> you'll have to loop over all the characters yourself. Which I agree is
> annoying, but even if Hans were to add a helper function for this, the
> LaTeX team has stopped importing new font loader code, so you wouldn't
> actually be able to use it.


I'm not sure what you mean with 'expanded' here but your last sentence 
is a good reason for me to not waste time explaining what really happens 
deep down and why one gets the mentioned tables. Also, one can - as I posted 
before - apply some positioning feature, so I don't see what use a mark 
related extension one would add even if I would do that just because 
it's easy. Maybe if someone asked on the context list I'd bother.


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


Re: [luatex] searching a "marktobase" lookup example

2025-09-08 Thread Werner LEMBERG

> Thanks!  However, how can I access the `glyphs` table via `tfmdata`?
> Or do I have to manipulate another structure?  I looked into
> `tfmdata.resources.sequences[50].steps[1].coverage[868]`, but this
> is much more low-level and quite ugly to work with IMHO since I have
> to explicitly compute the offsets between the base glyph and the
> diacritic...

To give more details: what I would like to add, as mentioned earlier,
is this for glyph 'uni0364' in the `glyphs` table, which is the
complete difference between the original and patched 'EB Garamond'
font file.

```
anchors = {
  mark = {
    ['Anchor-0'] = {
      x = 115,
      y = 440,
      lig_index = 0,
    }
  }
}
class = mark
```

However, calling the following LaTeX file to dump the `tfmdata` table

```
\documentclass{article}

\usepackage{fontspec}

\directlua{
  local patch_functions = {}

  patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
    local data = table.serialize(tfmdata)
    io.savedata("ebgaramond.dump", data)
  end

  luatexbase.add_to_callback(
    "luaotfload.patch_font",
    function(tfmdata, specification, font_id)
      local path = tfmdata.specification.filename
      local filename = file.basename(path)
      local patch_function = patch_functions[filename]

      if not patch_function then
        return
      end
      patch_function(tfmdata)
    end,
    "patch-fonts"
  )
}

\setmainfont{EB Garamond}

\begin{document}
schoͤn
\end{document}
```

using the original and patched OpenType font produces the attached
difference – the tiny modification in the font's 'mark2base' lookup
explodes, causing four large, identical changes to `tfmdata`
subtables.  In other words, it is unrealistic to modify `tfmdata`
directly, and I need to hook into LuaTeX's OpenType font handler one
step earlier, AFAICS.  How can I do that?
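
For illustration only, here is a minimal sketch in plain Lua (no LuaTeX
needed) of what one such expanded entry amounts to.  It assumes, based
solely on the diff attached to this message, that a `coverage` entry for
a mark is a two-element table: a map from base code point to that base's
anchor, followed by the mark's own anchor.  The helper name
`add_mark_entry` is hypothetical, only three of the base anchors are
shown, and this is not a documented luaotfload interface.

```lua
-- Hypothetical helper: build one expanded mark-to-base coverage entry.
-- ASSUMPTION: the entry layout is { base_anchors, mark_anchor }, as the
-- diff in this message suggests; it is not a documented API.
local function add_mark_entry(coverage, mark_code, mark_anchor, base_anchors)
  local bases = {}
  for base_code, anchor in pairs(base_anchors) do
    -- Copy each anchor so later edits do not alias the input table.
    bases[base_code] = { anchor[1], anchor[2] }
  end
  coverage[mark_code] = { bases, { mark_anchor[1], mark_anchor[2] } }
  return coverage[mark_code]
end

-- Three sample base anchors taken from the diff: G, a, o.
local base_anchors = {
  [0x47] = { 403, 675 },
  [0x61] = { 201, 440 },
  [0x6F] = { 254, 440 },
}

local coverage = {}
local entry = add_mark_entry(coverage, 0x364, { 115, 440 }, base_anchors)
print(entry[1][0x6F][1], entry[1][0x6F][2], entry[2][1], entry[2][2])
```

The loop is the tedious part: every base glyph of the mark class needs
its own entry, and the same block would have to be repeated in each
affected subtable.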


Werner
--- ebgaramond.dump	2025-09-09 06:55:06.565403863 +0200
+++ ebgaramond.dump.fixed	2025-09-09 06:51:49.856353700 +0200
@@ -189845,6 +189845,848 @@
 },
 { -63, 440 },
},
+   [868]={
+{
+ [71]={ 403, 675 },
+ [97]={ 201, 440 },
+ [99]={ 237, 440 },
+ [101]={ 222, 440 },
+ [103]={ 215, 440 },
+ [105]={ 124, 440 },
+ [106]={ 116, 440 },
+ [109]={ 392, 440 },
+ [110]={ 271, 440 },
+ [111]={ 254, 440 },
+ [112]={ 263, 440 },
+ [114]={ 191, 440 },
+ [115]={ 176, 440 },
+ [116]={ 136, 440 },
+ [117]={ 252, 440 },
+ [118]={ 218, 440 },
+ [119]={ 343, 440 },
+ [120]={ 237, 440 },
+ [121]={ 242, 440 },
+ [122]={ 201, 440 },
+ [224]={ 201, 440 },
+ [225]={ 201, 440 },
+ [226]={ 201, 440 },
+ [227]={ 201, 440 },
+ [228]={ 201, 440 },
+ [229]={ 201, 440 },
+ [230]={ 302, 440 },
+ [231]={ 237, 440 },
+ [232]={ 222, 440 },
+ [233]={ 222, 440 },
+ [234]={ 222, 440 },
+ [235]={ 222, 440 },
+ [236]={ 124, 440 },
+ [237]={ 124, 440 },
+ [238]={ 124, 440 },
+ [239]={ 124, 440 },
+ [241]={ 271, 440 },
+ [242]={ 254, 440 },
+ [243]={ 254, 440 },
+ [244]={ 254, 440 },
+ [245]={ 254, 440 },
+ [246]={ 254, 440 },
+ [248]={ 254, 440 },
+ [249]={ 252, 440 },
+ [250]={ 252, 440 },
+ [251]={ 252, 440 },
+ [252]={ 252, 440 },
+ [253]={ 242, 440 },
+ [255]={ 242, 440 },
+ [257]={ 201, 440 },
+ [259]={ 201, 440 },
+ [261]={ 201, 440 },
+ [263]={ 237, 440 },
+ [265]={ 237, 440 },
+ [267]={ 237, 440 },
+ [269]={ 237, 440 },
+ [275]={ 222, 440 },
+ [277]={ 222, 440 },
+ [279]={ 222, 440 },
+ [283]={ 222, 440 },
+ [284]={ 403, 675 },
+ [285]={ 215, 440 },
+ [286]={ 403, 675 },
+ [287]={ 215, 440 },
+ [288]={ 403, 675 },
+ [289]={ 215, 440 },
+ [290]={ 403, 675 },
+ [291]={ 215, 440 },
+ [297]={ 124, 440 },
+ [299]={ 124, 440 },
+ [301]={ 124, 440 },
+ [303]={ 124, 440 },
+ [305]={ 124, 440 },
+ [309]={ 116, 440 },
+ [324]={ 271, 440 },
+ [326]={ 271, 440 },
+ [328]={ 271, 440 },
+ [329]={ 352, 440 },
+ [333]={ 254, 440 },
+ [335]={ 254, 440 },
+ [337]={ 254, 440 },
+ [341]={ 191, 440 },
+ [343]={ 191, 440 },
+ [345]={ 191, 440 },
+ [347]={ 176, 440 },
+ [349]={ 176, 440 },
+ [351]={ 176, 440 },
+ [353]={ 176, 440 },
+ [355]={ 136, 440 },
+ [357]={ 136, 440 },
+ [359]={ 147, 440 },
+ [361]={ 252, 440 },
+ [363]={ 252, 440 },
+ [365]={ 252, 440 },
+ [367]={ 252, 440 },
+ [369]={ 252, 440 },
+ [371]={ 252, 440 },
+ [373]={ 343

Re: [luatex] searching a "marktobase" lookup example

2025-09-07 Thread Max Chernoff via luatex
Hi Werner,

On Mon, 2025-09-08 at 06:02 +, Werner LEMBERG wrote:
> It seems that I have to do a brute-force approach by directly
> modifying `fonttable`.  My question is: How can I insert this data to
> the `fonttable` structure whenever the (unmodified)
> `EBGaramond-Regular.otf` gets loaded?  Is there a hook for that?

Something like the following should work (untested):

    local patch_functions = {}

    patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
        tfmdata.some_key = "some value"
    end

    luatexbase.add_to_callback(
        "luaotfload.patch_font",
        function(tfmdata, specification, font_id)
            local path = tfmdata.specification.filename
            local filename = file.basename(path)
            local patch_function = patch_functions[filename]

            if not patch_function then
                return
            end
            patch_function(tfmdata)
        end,
        "patch-fonts"
    )
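
The dispatch logic can be tried outside LuaTeX.  The following is only a
sketch: `basename` is a stand-in for `file.basename`, `patch_font` is a
hypothetical name for the callback body, the `patched` field is a
placeholder, and the two mock `tfmdata` tables carry nothing but the
`specification.filename` field used above.

```lua
-- Stand-in for LuaTeX's file.basename (ASSUMPTION: present only so the
-- sketch runs in a plain Lua interpreter).
local function basename(path)
  return (path:gsub("^.*[/\\]", ""))
end

local patch_functions = {}

patch_functions["EBGaramond-Regular.otf"] = function(tfmdata)
  tfmdata.patched = true   -- placeholder for the real modification
end

-- Same logic as the callback body, minus the luatexbase registration.
local function patch_font(tfmdata)
  local filename = basename(tfmdata.specification.filename)
  local patch_function = patch_functions[filename]
  if patch_function then
    patch_function(tfmdata)
  end
  return tfmdata
end

-- Mock font tables: only the field the dispatcher reads is filled in.
local garamond = { specification = { filename = "/fonts/EBGaramond-Regular.otf" } }
local other    = { specification = { filename = "/fonts/lmroman10-regular.otf" } }

patch_font(garamond)
patch_font(other)
print(garamond.patched, other.patched)
```

Fonts without an entry in `patch_functions` pass through untouched, so
the callback is safe to register globally.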

Thanks,
-- Max



Re: [luatex] searching a "marktobase" lookup example

2025-09-07 Thread Werner LEMBERG


> BTW, it's very easy to patch the original `EBGaramond-Regular.otf`
> file using the XML dump as produced by the `ttx` font
> compiler/decompiler: For the 'mark' lookup you add glyph 'uni0364'
> to its 'MarkCoverage' table (as the 28th entry counting from zero),
> then adding an entry in the 'MarkArray' table:
> 
> ```
> 
>   
>   
> 
> 
>   
> 
> ```
> 
> However, I consider this as a last-resort solution that I would like
> to avoid.
> 
> So I ask again: Is there a solution to construct a tiny 'marktobase'
> feature with `fonts.handlers.otf.addfeature` (or something else)?

I had a closer look into the files of 'luaotfload', and my conclusion
is that there is no 'fonts.handlers.otf.addfeature' support for the
GPOS `mark2base` lookup type at all.  Sigh.

With

```
f = fontloader.open("EBGaramond-Regular.otf")
fonttable = fontloader.to_table(f)
```

the OTF file modified as described contains the following data added
to `fonttable.glyphs[741]` (which is for glyph `uni0364`):

```
["anchors"] = {
  ["mark"] = {
    ["Anchor-0"] = {
      ["x"] = 115,
      ["y"] = 440,
      ["lig_index"] = 0 }
  }
},
["class"] = mark,
```

It seems that I have to do a brute-force approach by directly
modifying `fonttable`.  My question is: How can I insert this data to
the `fonttable` structure whenever the (unmodified)
`EBGaramond-Regular.otf` gets loaded?  Is there a hook for that?
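
The table edit itself would be small.  As a rough sketch in plain Lua,
with a mock `fonttable` instead of real `fontloader.to_table` output and
a hypothetical helper `add_mark_anchor` (the string value "mark" for
`class` is an assumption), it could look like this; it says nothing
about where to hook the edit in:

```lua
-- Hypothetical helper: attach a mark anchor to one glyph record.
-- ASSUMPTION: field names follow the dump quoted above; `class` is
-- taken to be the string "mark".
local function add_mark_anchor(glyph, anchor_name, x, y)
  glyph.anchors = glyph.anchors or {}
  glyph.anchors.mark = glyph.anchors.mark or {}
  glyph.anchors.mark[anchor_name] = { x = x, y = y, lig_index = 0 }
  glyph.class = "mark"
  return glyph
end

-- Mock table standing in for the result of fontloader.to_table().
local fonttable = { glyphs = { [741] = { name = "uni0364" } } }

add_mark_anchor(fonttable.glyphs[741], "Anchor-0", 115, 440)
print(fonttable.glyphs[741].class,
      fonttable.glyphs[741].anchors.mark["Anchor-0"].x,
      fonttable.glyphs[741].anchors.mark["Anchor-0"].y)
```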


Werner


Re: [luatex] searching a "marktobase" lookup example

2025-09-03 Thread Werner LEMBERG
> As a not-font-feature solution, I have found this [...]

Thanks!


Werner


Re: [luatex] searching a "marktobase" lookup example

2025-09-03 Thread Hans Hagen via luatex

On 9/1/2025 2:09 PM, Werner LEMBERG wrote:



>>> I have a problem with the EB Garamond family: it contains the glyph
>>> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
>>> neither an anchor point nor is it part of the font's "mark" table –
>>> in particular, I need support for the combination "oͤ", as used in
>>> old German texts.
>>>
>>> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
>>> Is there an example somewhere how to do that?  I could only find
>>> samples for other, simpler GSUB and GPOS lookup types but nothing
>>> for "marktobase", which is needed here.  In case there is
>>> documentation already available please give me a link.
>>
>> it should be helpful if you add small example (even if it does not
>> do what you need) to have a bit of context.
>
> Here it is, using current git of TeXLive.  The attached images show
> the current and desired results.
>
> ```
> \documentclass{article}
>
> \usepackage{ebgaramond}
>
> \begin{document}
> gehoͤrt
> \end{document}
> ```
>
>> Perhaps the pair feature as in
>> https://articles.contextgarden.net/journal/2017/27-76.pdf
>
> I've seen this already, thanks, but the structure of the 'marktobase'
> feature is completely different.

attached is how i'd do it in context ...

\startluacode
    fonts.handlers.otf.addfeature {
        name = "kern",
        type = "pair",
        data = {
            ["o"] = { [0x364] = { false, { -150, 0, 0, 0 } } },
        }
    }
\stopluacode

\setupbodyfont[ebgaramond]

\startTEXpage[offset=1TS]
mswoͤrd
\stopTEXpage

... i have no clue if it works out in latex

Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-


extensions-012.pdf
Description: Adobe PDF document


Re: [luatex] searching a "marktobase" lookup example

2025-09-01 Thread Werner LEMBERG


> attached is how i'd do it in context ...  [...]

Thanks, this solution also came to my mind, and it certainly works,
see

  https://tex.stackexchange.com/questions/312154/how-to-adjust-font-features-in-luatex

which has quite a comprehensive description of some LuaTeX font
features.

However, I consider 'kern' not as a correct replacement for a
'marktobase' feature, since the latter makes U+0364 work with all base
characters defined in the 'mark' feature.  In particular, AFAICS, your
solution fails for uppercase combinations like "Oͤ" because standard
kerning doesn't have vertical offsets.
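
To make the difference concrete, here is a small sketch of the anchor
arithmetic in plain Lua.  The function name `mark_offset` is
hypothetical, and the numbers are illustrative: a mark anchor of
(115, 440), a lowercase base anchor of (254, 440), and a taller base
anchor of (403, 675).  Mark attachment moves the mark so that its anchor
lands on the base's anchor, which needs a separate offset per base.

```lua
-- Offset that makes the mark's anchor coincide with the base's anchor,
-- measured from the base glyph's origin (font units).
local function mark_offset(base_anchor, mark_anchor)
  return base_anchor.x - mark_anchor.x, base_anchor.y - mark_anchor.y
end

-- ASSUMPTION: illustrative anchor values, not read from a font file here.
local mark  = { x = 115, y = 440 }
local lower = { x = 254, y = 440 }  -- a lowercase base
local upper = { x = 403, y = 675 }  -- a taller base

print(mark_offset(lower, mark))  --> 139  0
print(mark_offset(upper, mark))  --> 288  235
```

The taller base needs a vertical shift of 235 units that a single
horizontal kern value cannot express.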

BTW, it's very easy to patch the original `EBGaramond-Regular.otf`
file using the XML dump as produced by the `ttx` font
compiler/decompiler: For the 'mark' lookup you add glyph 'uni0364' to
its 'MarkCoverage' table (as the 28th entry counting from zero), then
adding an entry in the 'MarkArray' table:

```

  
  


  

```

However, I consider this as a last-resort solution that I would like
to avoid.

So I ask again: Is there a solution to construct a tiny 'marktobase'
feature with `fonts.handlers.otf.addfeature` (or something else)?


Werner



Re: [luatex] searching a "marktobase" lookup example

2025-09-01 Thread luigi scarso
On Mon, 1 Sept 2025 at 14:10, Werner LEMBERG wrote:

>
> >> I have a problem with the EB Garamond family: it contains the glyph
> >> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
> >> neither an anchor point nor is it part of the font's "mark" table –
> >> in particular, I need support for the combination "oͤ", as used in
> >> old German texts.
> >>
> >> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
> >> Is there an example somewhere how to do that?  I could only find
> >> samples for other, simpler GSUB and GPOS lookup types but nothing
> >> for "marktobase", which is needed here.  In case there is
> >> documentation already available please give me a link.
> >
> > it should be helpful if you add small example (even if it does not
> > do what you need) to have a bit of context.
>
> Here it is, using current git of TeXLive.  The attached images show
> the current and desired results.
>
> ```
> \documentclass{article}
>
> \usepackage{ebgaramond}
>
> \begin{document}
> gehoͤrt
> \end{document}
> ```
>
> > Perhaps the pair feature as in
> > https://articles.contextgarden.net/journal/2017/27-76.pdf
>
> I've seen this already, thanks, but the structure of the 'marktobase'
> feature is completely different.
>
>
> Werner
>

As a non-font-feature solution, I have found this

https://tex.stackexchange.com/questions/694650/increase-size-of-superscript-letter-diacritics
(method=pdfstringdef; I guess that one needs to fix the macro \foo to get
the correct ActualText for gehoͤrt)

See
https://ctan.org/pkg/accsupp
and
https://latex3.github.io/tagging-project/tagging-status/
for the accsupp status.

--
luigi


Re: [luatex] searching a "marktobase" lookup example

2025-09-01 Thread Werner LEMBERG

>> I have a problem with the EB Garamond family: it contains the glyph
>> "ͤ" (U+0364, COMBINING LATIN SMALL LETTER E), however, it has
>> neither an anchor point nor is it part of the font's "mark" table –
>> in particular, I need support for the combination "oͤ", as used in
>> old German texts.
>>
>> To fix this, I would like to use `fonts.handlers.otf.addfeature`.
>> Is there an example somewhere how to do that?  I could only find
>> samples for other, simpler GSUB and GPOS lookup types but nothing
>> for "marktobase", which is needed here.  In case there is
>> documentation already available please give me a link.
>
> it should be helpful if you add small example (even if it does not
> do what you need) to have a bit of context.

Here it is, using current git of TeXLive.  The attached images show
the current and desired results.

```
\documentclass{article}

\usepackage{ebgaramond}

\begin{document}
gehoͤrt
\end{document}
```

> Perhaps the pair feature as in
> https://articles.contextgarden.net/journal/2017/27-76.pdf

I've seen this already, thanks, but the structure of the 'marktobase'
feature is completely different.


Werner