At 15:11 19/09/2017 +0000, William Bader wrote:
>It would be possible to write a tool which could reliably detect
>identical fonts in a PDF file,
There are already libraries that can read PDFs into a data structure and
then write a new PDF, for example, pdfsizeopt in Python, poppler
<https://poppler.freedesktop.org/> and PoDoFo
<http://podofo.sourceforge.net/about.html> in C++, pdfclown
<https://sourceforge.net/projects/clown/> in .NET, PDFBox
<http://pdfbox.apache.org/> in Java, iText <https://itextpdf.com/> in
Java and C#, pdfsam <http://www.pdfsam.org/> in Java. Maybe one of them
would be suitable as a starting point for writing a font merging tool.
Indeed, and if I were going to do this I would use MuPDF. Note that it will
likely be a slow job to run. You can't do the job until you have all the
PDF files collected into one, then you need to check each instance of each
font to see if it's the same as any other font, and remove the other font,
updating the relevant Resources dictionaries. Fortunately you don't need to
alter any of the content streams. Finally you'd need to rewrite the PDF
file with a modified xref and the relevant font streams removed.
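
A minimal sketch of the steps above, in Python, working on a toy in-memory
model rather than a real PDF library. Everything here is an assumption made
for illustration: font_streams (object number -> embedded font bytes),
resources (page number -> font name -> object number) and the function name
are hypothetical, and "identical" is taken to mean byte-identical font
streams, which a real tool would probably need to refine.

```python
import hashlib


def merge_duplicate_fonts(font_streams, resources):
    """Collapse byte-identical font streams onto a single object.

    font_streams: dict of object number -> bytes (hypothetical model)
    resources:    dict of page number -> {font name: object number}
    Returns (kept_streams, new_resources, removed_object_numbers).
    """
    canonical = {}  # content digest -> first object number with that content
    remap = {}      # every object number -> the object number that survives

    # Check every font instance exhaustively, not just "second and later" ones.
    for obj_num in sorted(font_streams):
        digest = hashlib.sha256(font_streams[obj_num]).digest()
        remap[obj_num] = canonical.setdefault(digest, obj_num)

    # Only the Resources dictionaries change. The font names (/F1, /F2, ...)
    # are kept, so the content streams that use them need no edits.
    new_resources = {
        page: {name: remap.get(ref, ref) for name, ref in fonts.items()}
        for page, fonts in resources.items()
    }

    kept = {n: data for n, data in font_streams.items() if remap[n] == n}
    removed = sorted(n for n in font_streams if remap[n] != n)
    return kept, new_resources, removed


if __name__ == "__main__":
    streams = {10: b"FONT-A", 11: b"FONT-B", 12: b"FONT-A"}
    res = {1: {"F1": 10, "F2": 11}, 2: {"F1": 12}}
    kept, new_res, removed = merge_duplicate_fonts(streams, res)
    print(sorted(kept), new_res, removed)
```

In this model, rewriting the file would amount to writing out only the kept
streams and the updated resources, and building a fresh xref for them.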
Of course, because you have a fixed workflow you *could* simply look for
the second and following instances of any font rather than checking them
all exhaustively, but I think it would be better to do the job right.
Firstly you'd be protected against any further changes in your workflow,
and secondly you would have a genuinely useful tool in its own right.
Ghostscript is entirely the wrong tool for that job. It's possible, but I
wouldn't want to write the PostScript program for it.
Ken
_______________________________________________
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel