Ok, so I whipped up a quick test based on that old PS conversion project of mine.
Here are some partial timings, taken on my Mac laptop (2019, 2.3 GHz 8-Core Intel Core i9), all code single-threaded. Here's the execution log:

Parsing scene '/Users/lukes/Documents/src/undvips/tex/schintro_outline.udps'
Parse complete 1156.21 ms
Scene contains 468180 glyphs (2.46958 us/glyph)
Scene contains 16 fonts
AS build complete 182.568 ms (0.389953 us/glyph)

Explanation:
- the program loads a document in my custom UDPS format (which is produced by those PostScript scripts). At the moment it's an entire book (this one: https://docs.scheme.org/schintro/, but in PostScript form). See below for an example of what this looks like. There are 321 pages in the document. I expect the parsing to be about linear in the number of lines in the UDPS file (this file has about 141k lines)
- "scene" means the document I guess, sorry, force of habit
- AS is the acceleration structure. Our builds would be superlinearly faster: the builder algorithm is O(n log(n)), so building 321 little scenes is faster than building one "all together" scene. The new time is proportional to n log(n) - n log(321), where n is the number of objects (~468k), so for this example it's something like 40% faster, ~110 ms. These still average about 1500 objects per page, so they're small scenes, but not _that_ small

For the comparison method I have in mind, I think the build time is a good proxy for the whole test execution. The idea is pretty much that you load two scenes, assume they are about the same, and then go ahead and build the AS on both at the same time (this does not need the primitives to be in the same order). Where the scenes are different, the builds will start to diverge in topology. In the end there will be a bunch of identical stuff, and we just need to build a report on what we'd like to know about the difference between the two.

Missing bits: if people think this idea has merit, we need something to stand in the role of my little UDPS file format. I see a few different avenues:
1. Rewrite this PS extraction thing so that it outputs more about the graphics in the PS
   - Measure its performance (it seems to fail on GPL Ghostscript 10.05.1, I don't know why at the moment). Regrettably I have no notes on how long this step took when I was last working on this
2. Alternatively, and possibly much better, pivot and implement a similar idea using the MuPDF library, working on the PDF directly and sidestepping the PostScript entirely
3. Alternatively still, instrument some late layer of ... "Cairo" (say) to emit UDPS directly from the C++

I think that I would do 1 only if PostScript was our true source of truth. I'm leaning towards 2 instead because, in my mind at least, what LilyPond produces for the final users is PDF, so we should verify that this specific product is healthy. This introduces a new dependency on MuPDF (or some other thing we can use to traverse the objects in a PDF; maybe Poppler would be more palatable).

All the same, avenue 3 is potentially the fastest/cleanest (and it would allow us to instrument the test backend a little, to guide the comparison better). However, the worry is that, depending on how this "tap" is injected into the code, problems downstream of it might be missed, which would be ungood.

Lastly, my little test is already built to support multiple pages, and this could be advantageous in that it would save us from running all the warmup code over and over again. I guess I'm saying: if we can make it work for the whole set at once, we could be looking at ~1.5 seconds to produce the comparison result for the entire test suite. If I remember right, it takes longer than this to generate the artifacts themselves, is that right?

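The build-time estimate in the AS bullet can be checked with a quick calculation. This is only a back-of-the-envelope sketch: it assumes the build cost is exactly c * n * log(n) with the same constant c for big and small builds, and that the glyphs are spread evenly over the 321 pages. Neither assumption was measured, and the function name is made up for illustration.

```python
import math


def split_build_time_ms(total_ms: float, n_objects: int, n_pages: int) -> float:
    """Estimate the time to build n_pages equal-sized structures instead of one.

    Assumes cost = c * n * log(n) with the same c in both cases (an
    assumption, not a measurement) and an even spread of objects per page.
    """
    whole = n_objects * math.log(n_objects)
    per_page = n_objects / n_pages
    # n_pages builds of per_page objects each: n * log(n / n_pages)
    split = n_pages * per_page * math.log(per_page)
    return total_ms * split / whole


measured_ms = 182.568  # "AS build complete" from the execution log
glyphs = 468180        # "Scene contains 468180 glyphs"
pages = 321

estimate = split_build_time_ms(measured_ms, glyphs, pages)
saving = 1.0 - estimate / measured_ms
print(f"estimated per-page builds: {estimate:.1f} ms ({saving:.0%} faster)")
```

Under those assumptions the estimate comes out at roughly 102 ms, about 44% below the measured 182.568 ms, which is in line with the "something like 40%" figure. The base of the logarithm does not matter here, since only the ratio of the two costs is used.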
Cheers,
Luca

UDPS Sample for page 1:
This is just to give you a sense of what's in the file I'm parsing.

Command explanation:
- bop - beginning of page
- font - change font
- txt - a short run of text
- eop - end of page
- shwpg - PostScript's showpage command (does nothing for us)

bop 0
font CMBX12 [0.0860938,0.0,0.0,-0.0860938,0.0,0.0] 20.74 spc 43.0339 xheight 38.151
txt org [-117.0,927.0] bbox [-113.672,866.846]-[7.56836,927.002] end [9.98567,927.0] 2 An
txt org [42.9857,927.0] bbox [45.6543,868.018]-[130.534,927.002] end [132.96,927.0] 2 In
txt org [129.96,927.0] bbox [131.787,872.445]-[253.516,927.523] end [255.936,927.0] 3 tro
txt org [258.936,927.0] bbox [262.191,867.236]-[574.43,927.523] end [576.856,927.0] 7 duction
txt org [608.856,927.0] bbox [610.677,872.445]-[692.415,927.523] end [694.842,927.0] 2 to
txt org [726.842,927.0] bbox [732.178,866.976]-[821.061,928.044] end [823.815,927.0] 2 Sc
txt org [820.815,927.0] bbox [824.398,867.367]-[1041.13,927.523] end [1043.78,927.0] 4 heme
txt org [1074.78,927.0] bbox [1077.72,867.367]-[1226.42,927.523] end [1229.74,927.0] 3 and
txt org [1261.74,927.0] bbox [1265.53,867.236]-[1361.56,927.523] end [1364.71,927.0] 3 its
txt org [1397.71,927.0] bbox [1400.37,867.367]-[1816.19,943.669] end [1818.62,927.0] 8 Implemen
txt org [1815.62,927.0] bbox [1817.45,867.236]-[2065.14,927.523] end [2067.56,927.0] 6 tation
font CMR10 [0.0454545,0.0,0.0,-0.0454545,0.0,0.0] 10.95 spc 22.7213 xheight 19.4662
txt org [617.0,1114.0] bbox [618.555,1083.14]-[645.182,1114.0] end [647.99,1114.0] 1 P
txt org [646.99,1114.0] bbox [648.877,1082.62]-[706.51,1114.52] end [707.976,1114.0] 3 aul
txt org [722.976,1114.0] bbox [724.544,1083.14]-[764.632,1114.97] end [768.956,1114.0] 2 R.
txt org [783.956,1114.0] bbox [784.733,1082.62]-[932.08,1122.72] end [935.893,1114.0] 7 Wilson,
txt org [951.893,1114.0] bbox [953.385,1083.14]-[1046.84,1114.97] end [1047.86,1114.0] 4 Univ
txt org [1046.86,1114.0] bbox [1048.1,1083.72]-[1130.84,1114.52] end [1133.79,1114.0] 5 ersit
txt org [1132.79,1114.0] bbox [1133.63,1094.53]-[1155.76,1123.24] end [1156.78,1114.0] 1 y
txt org [1172.78,1114.0] bbox [1174.01,1082.1]-[1211.91,1114.52] end [1209.77,1114.0] 2 of
txt org [1224.77,1114.0] bbox [1226.4,1083.4]-[1255.76,1114.0] end [1257.76,1114.0] 1 T
txt org [1253.76,1114.0] bbox [1255.0,1093.75]-[1337.01,1114.52] end [1338.73,1114.0] 4 exas
font CMTT10 [0.0454545,0.0,0.0,-0.0454545,0.0,0.0] 10.95 spc 23.8607 xheight 19.4662
txt org [736.0,1176.0] bbox [736.719,1148.13]-[1215.01,1176.25] end [1215.82,1176.0] 20 wil...@cs.utexas.edu
txt org [534.0,1239.0] bbox [534.521,1207.62]-[1084.98,1249.02] end [1085.79,1239.0] 23 http://www.cs.utexas.ed
txt org [1084.79,1239.0] bbox [1085.32,1207.62]-[1249.51,1242.77] end [1252.73,1239.0] 7 u/users
txt org [1251.73,1239.0] bbox [1254.33,1207.62]-[1418.85,1242.77] end [1419.66,1239.0] 7 /wilson
eop
shwpg

--
Luca Fascione
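
For anyone who wants to poke at the sample above, here is a minimal sketch of reading a single txt line. It is not the parser used for the timings; the field layout is inferred from the sample alone, the TextRun type and parse_txt function are hypothetical names, and treating the integer before the text as the length of the run is an assumption (it matches every line in the sample, e.g. "7 duction").

```python
import re
from dataclasses import dataclass


@dataclass
class TextRun:
    """Hypothetical record; field names mirror the keywords in the sample."""
    org: tuple
    bbox_min: tuple
    bbox_max: tuple
    end: tuple
    text: str


# A bracketed coordinate pair such as [-117.0,927.0]
_PT = r"\[\s*([-+0-9.eE]+)\s*,\s*([-+0-9.eE]+)\s*\]"
_TXT = re.compile(rf"txt org {_PT} bbox {_PT}-{_PT} end {_PT} (\d+) (.*)")


def parse_txt(line: str) -> TextRun:
    """Parse one 'txt' line, assuming one command per line."""
    m = _TXT.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a txt line: {line!r}")
    nums = [float(v) for v in m.groups()[:8]]
    # Assumption: the integer is the number of characters in the run.
    count = int(m.group(9))
    text = m.group(10)[:count]
    return TextRun(
        org=(nums[0], nums[1]),
        bbox_min=(nums[2], nums[3]),
        bbox_max=(nums[4], nums[5]),
        end=(nums[6], nums[7]),
        text=text,
    )


sample = ("txt org [-117.0,927.0] bbox [-113.672,866.846]-[7.56836,927.002] "
          "end [9.98567,927.0] 2 An")
run = parse_txt(sample)
print(run.text, run.org, run.bbox_max)
```

Using the length prefix rather than splitting on whitespace keeps the sketch safe if a run ever contains spaces, though the sample does not show such a case.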