I might have some time next week for a quick Frankenstein proof of
concept. I should still have something I wrote a while back in
PostScript (it was originally meant to convert PostScript to TeX for a
specific style of dvips output: it interprets the PostScript and
produces a sidecar file with various kinds of information), which might
serve this purpose reasonably well.

One thing you said about fonts earlier caught my eye: you said that if
a glyph definition changes, the image might change. I agree, but isn't
that an orthogonal problem to this one? In my mind the font content
(the look of the glyphs, their metrics, what have you) and the work the
layout engine does to decide where those glyphs and other graphic
elements should be placed are separate things to test.

Also, I expect changes to the font definition are (a) comparatively
rare, and (b) made by different people than the layout-algorithm folks.
But more importantly, isn't it the case that if you alter the glyphs,
you normally alter the metrics as well?

It seems to me that would perturb all layouts a bit anyway, no? I'd
expect it to invalidate every regression test using that font, and in a
way "for the wrong reason": the layout engine would just be following
the new metric definitions, without any real algorithmic change.

Am I thinking about this all upside down?

L


On Fri, Jun 20, 2025 at 10:06 PM Han-Wen Nienhuys <hanw...@gmail.com> wrote:

> Sorry for the delay in getting back.
>
> To answer your questions: the convolution is a box filter, because it
> can be implemented efficiently: for an NxM image, you do a single N*M
> precomputation, and can then compute each convolution in N*M
> operations.
>
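> If that precomputation is a summed-area (integral image) table, the
> idea is roughly the following. A sketch only, with illustrative names
> rather than the actual cmd/compare code (min/max are the Go 1.21
> builtins):
>
>     // sumTable[y][x] holds the sum of all pixels above and to the
>     // left of (x, y), so any rectangle sum is four table lookups.
>     func sumTable(img [][]float64) [][]float64 {
>         h, w := len(img), len(img[0])
>         s := make([][]float64, h+1)
>         for y := range s {
>             s[y] = make([]float64, w+1)
>         }
>         for y := 0; y < h; y++ {
>             for x := 0; x < w; x++ {
>                 s[y+1][x+1] = img[y][x] + s[y][x+1] + s[y+1][x] - s[y][x]
>             }
>         }
>         return s
>     }
>
>     // boxFilter averages over a (2r+1)x(2r+1) window, clamped at the
>     // borders, reusing a precomputed table s; each pixel is O(1).
>     func boxFilter(s [][]float64, r int) [][]float64 {
>         h, w := len(s)-1, len(s[0])-1
>         out := make([][]float64, h)
>         for y := 0; y < h; y++ {
>             out[y] = make([]float64, w)
>             for x := 0; x < w; x++ {
>                 x0, y0 := max(x-r, 0), max(y-r, 0)
>                 x1, y1 := min(x+r+1, w), min(y+r+1, h)
>                 sum := s[y1][x1] - s[y0][x1] - s[y1][x0] + s[y0][x0]
>                 out[y][x] = sum / float64((x1-x0)*(y1-y0))
>             }
>         }
>         return out
>     }
>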
> Regarding performance, consider (CPU @ 2 GHz):
>
> MAE:
>
>     $ go run ./cmd/compare/ -algorithm=mae -gs_jobs=1 -cmp_jobs=1
> -file_regexp 'eps'  ../lilypond/{d1,d2} /tmp/tmp.UULz7I9RKh/output/
>     2025/06/20 14:12:38 Convert 2 EPS files using 1 cores (batch=true)
> to PNG in 369.2581ms (184.62905ms/file)
>     2025/06/20 14:12:38 compared 1 PNG image pairs using 1 cores
> (imagemagick=false) in 51.97695ms (51.97695ms / pair)
>
>     $ file ../lilypond/les-nereides.png
>     ../lilypond/les-nereides.png: PNG image data, 835 x 1181,
> 8-bit/color RGB, non-interlaced
>
> There are ~1M pixels in the image, so that's about 100 cycles per
> pixel, which seems a bit high, but a significant share of the time is
> actually spent decoding the PNG.
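>
> (Roughly: 52 ms at 2 GHz is ~1.0e8 cycles, over 835 x 1181 ~= 9.9e5
> pixels, i.e. ~105 cycles per pixel.)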
>
> The filter version convolves at 10 resolutions (window = 3 ... 2049),
> and is roughly 10 times slower:
>
>     $ go run ./cmd/compare/ -algorithm=filter -gs_jobs=1 -cmp_jobs=1
> -file_regexp 'eps'  ../lilypond/{d1,d2} /tmp/tmp.UULz7I9RKh/output/
>     2025/06/20 14:14:55 Convert 2 EPS files using 1 cores (batch=true)
> to PNG in 378.991521ms (189.49576ms/file)
>     2025/06/20 14:14:55 compared 1 PNG image pairs using 1 cores
> (imagemagick=false) in 448.789022ms (448.789022ms / pair)
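>
> Schematically, assuming sumTable/boxFilter as in the sketch above and
> a per-pixel MAE of the blurred pairs (the real window schedule and any
> per-scale weighting are in cmd/compare, so treat this as illustrative
> only):
>
>     // meanAbsDiff: mean absolute per-pixel difference of two images.
>     func meanAbsDiff(a, b [][]float64) float64 {
>         total, n := 0.0, 0
>         for y := range a {
>             for x := range a[y] {
>                 d := a[y][x] - b[y][x]
>                 if d < 0 {
>                     d = -d
>                 }
>                 total += d
>                 n++
>             }
>         }
>         return total / float64(n)
>     }
>
>     // filterScore blurs both images at several window sizes and sums
>     // the per-scale differences; each image's table is built once.
>     func filterScore(a, b [][]float64) float64 {
>         sa, sb := sumTable(a), sumTable(b)
>         score := 0.0
>         for r := 1; r <= 1024; r *= 2 { // windows of size 2r+1
>             score += meanAbsDiff(boxFilter(sa, r), boxFilter(sb, r))
>         }
>         return score
>     }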
>
> I am sure there are numeric libraries that could compute the result
> more quickly, but extra dependencies are a nightmare for packaging,
> so we should avoid them if possible.
>
> Our pixels are mostly blank, so this isn't terribly efficient. If you
> convert the image to vectors instead, you get:
>
>     $ time ps2ps -dNoOutputFonts les-nereides.ps vec.ps
>     real 0m0.181s
>     $ time inkscape --export-filename=out.dxf vec.ps
>     real 0m3.139s
>     $ grep VERTEX out.dxf  | wc -l
>     9998
>
> 10k points probably represents ~80 kB of data? I agree that you'll be
> able to process that data more quickly than the 1 MB of pixels we
> process using pixmaps.
>
> There are some considerations though:
>
> * Inkscape seems pretty slow for the ps -> dxf conversion. This can
>   probably be sped up, but how? It would need some nontrivial hacking
>   to interpret the PS and make it generate something that a vector
>   algorithm can handle.
>
> * If you convert to a vector format, changes in the representation of
>   the score (e.g. how drawing routines are called) can generate
>   spurious differences. It is also not guaranteed that the Cairo and
>   PS outputs generate something you can compare. This is also why
>   "for font-based elements the font/glyph id pair" is suspect:
>   changes to glyph shapes are also in scope for regtesting, and unless
>   you expand the glyphs into their curves, you'd miss them.
>
> So in short, you are right that a vector-based approach is potentially much
> faster. But without a prototype, it's hard to evaluate either
> performance or how well it works.
>
> Bitmaps are straightforward to manipulate, and we already have the
> dependencies (zlib, libpng) to read them. Altogether, this was about
> half a day of work, and seems like it could be an improvement over
> what we have.
>
> HTH
>


-- 
Luca Fascione
