Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 17:01, d...@gnu.org wrote:


PDFTeX apparently does merge subsetted fonts,


Which version?


so I don't think we should
need to include the complete fonts in order to get font merging. But we
probably should work with coding vectors so that we can use identical
font names, just sparsely populated.


ghostscript, called by lilypond to produce pdfs, needs three encoding vectors
for the emmentaler glyphs, and writes three copies of the font to the pdf file.
I don't see a way to avoid that.



I would then expect PDFTeX to
merge the sparsely populated fonts of identical name unless
\pdfinclusioncopyfonts is set.

Redundant coding vectors should have much less of an impact on the
intermediate file size than the full Emmentaler fonts.  Right?


That would mean changing, e.g.,

/Emmentaler-18 findfont dup length dict copy begin
/Encoding LilyNoteHeadEncoding def
/Emmentaler-18-N currentdict definefont pop end
/Emmentaler-18 findfont dup length dict copy begin
/Encoding LilyScriptEncoding def
/Emmentaler-18-S currentdict definefont pop end
/Emmentaler-18 findfont dup length dict copy begin
/Encoding LilyOtherEncoding def
/Emmentaler-18-O currentdict definefont pop end

in a way that gs does not include three copies of the Emmentaler-18 font.
I don't think that is possible.
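
A minimal sketch of why more than one encoding vector is needed, assuming only that an Encoding array has 256 slots. The glyph count of 600 and the names `split_into_encodings` and `glyphs` are made up for illustration. The sequential chunking is also an assumption; the font definitions above split the glyphs by category (note heads, scripts, other) instead.

```python
ENCODING_SIZE = 256  # a PostScript Encoding array addresses 256 character codes


def split_into_encodings(glyph_names, size=ENCODING_SIZE):
    """Partition glyph names into consecutive vectors of at most `size` entries."""
    return [glyph_names[i:i + size] for i in range(0, len(glyph_names), size)]


# Hypothetical glyph inventory; the real Emmentaler glyph count is not
# stated in this thread.
glyphs = [f"glyph{i:03d}" for i in range(600)]
vectors = split_into_encodings(glyphs)

# Each vector would become one re-encoded copy of the font.
print(len(vectors), [len(v) for v in vectors])
```

Any glyph inventory between 513 and 768 entries ends up in three vectors under this scheme, which matches the three font copies mentioned above.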

cu,
 Knut


https://codereview.appspot.com/194090043/




___
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel


Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

Knut Petersen knut_peter...@t-online.de writes:


On 15.01.2015 17:01, d...@gnu.org wrote:



PDFTeX apparently does merge subsetted fonts,



Which version?


Those versions having the \pdfinclusioncopyfonts setting?


so I don't think we should
need to include the complete fonts in order to get font merging. But we
probably should work with coding vectors so that we can use identical
font names, just sparsely populated.



ghostscript, called by lilypond to produce pdfs, needs three encoding
vectors for the emmentaler glyphs, and writes three copies of the font
to the pdf file. I don't see a way to avoid that.


There must be some workable solution for Asian fonts I should hope.
I can't believe that some 1 character font would be included
1/256 times in a PDF file when thoroughly used.

I don't have any workable experience myself.  It's just a "this can't
possibly be the whole truth" feeling.  Sometimes it just overtaxes my
imagination what kind of thing people are willing to put up with.  But I
sure hope this is not another such case.

https://codereview.appspot.com/194090043/


Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 14:47, d...@gnu.org wrote:

On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:

On 15.01.2015 13:15, d...@gnu.org wrote:
 On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:

 Ghostscript does the font merging.

 Any idea whether something could be done to make PDFTeX do the font
 merging instead when including all the PDF files?



No, not really. That would require a lot of work.


Judging from the documentation, that should be the default (namely, when
\pdfinclusioncopyfonts is at its default value of 0 and we are talking
about Type1 fonts).  Cf
URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726
for example.  So the question is what is keeping this from happening.
Maybe we need to call ps2pdf (when converting the fragments for
inclusion) with some particular options to keep the fonts in a mergeable


Current lilypond uses glyphshow to draw glyphs in postscript,
encoding vectors are not present.

As there is no direct equivalent of the postscript glyphshow operator
in the pdf language, ghostscript constructs a _new_ font with an
encoding vector, including only the subset of glyphs used in the document.
Ghostscript then uses the glyphs in that font indexed by the new encoding
vector. The name of that new font is derived from the original.

It's pretty unlikely that two lilypond scores use exactly the same
subset of glyphs in exactly the same order, so it's pretty likely that
the two new fonts are not identical. But they share (aside from the
prefix) their name.
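
A minimal sketch of that effect, assuming character codes are handed out in order of first use. How ghostscript really assigns codes is not stated in this thread, so `subset_encoding` is a hypothetical stand-in, and the glyph names are only illustrative.

```python
def subset_encoding(glyphs_in_drawing_order):
    """Assign character codes in order of first use (assumed scheme)."""
    encoding = {}
    for name in glyphs_in_drawing_order:
        if name not in encoding:
            encoding[name] = len(encoding) + 1  # code 0 is left unused here
    return encoding


# Two hypothetical scores drawing overlapping glyphs in different order.
score_a = ["clefs.G", "noteheads.s2", "rests.2", "noteheads.s2"]
score_b = ["noteheads.s2", "clefs.G", "accidentals.sharp"]

enc_a = subset_encoding(score_a)
enc_b = subset_encoding(score_b)

# The same glyph ends up at different codes, and the glyph sets differ,
# so the two subset fonts are not interchangeable despite the shared name.
print(enc_a)
print(enc_b)
```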

pdflatex would need to inspect all glyphs in those fonts, detect which are
identical and construct up to three fonts (remember the size limit of
encoding vectors and the number of emmentaler glyphs) with encodings
from the fonts found in the lilypond pdfs. It then would need to recode
the data stream and everything would be fine.
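
The work described above can be sketched roughly as follows. This is not how pdftex works. The sketch assumes that glyphs with the same name are identical, whereas a real implementation would have to compare the glyph programs themselves. `merge_subsets` and `recode` are hypothetical names, and each "data stream" is reduced to a plain list of character codes.

```python
SLOTS = 256  # size limit of one encoding vector


def merge_subsets(subsets):
    """Merge per-PDF subset fonts into as few fonts as possible.

    subsets: list of dicts mapping character code -> glyph name.
    Returns (merged, remaps): merged is a list of glyph-name lists, one
    per merged font; remaps[i] maps an old code of subset i to a
    (font index, new code) pair.
    """
    position = {}  # glyph name -> (font index, new code)
    merged = [[]]
    remaps = []
    for subset in subsets:
        remap = {}
        for code, name in sorted(subset.items()):
            if name not in position:
                if len(merged[-1]) == SLOTS:
                    merged.append([])  # current font is full, start another
                position[name] = (len(merged) - 1, len(merged[-1]))
                merged[-1].append(name)
            remap[code] = position[name]
        remaps.append(remap)
    return merged, remaps


def recode(stream, remap):
    """Rewrite a list of old character codes into (font index, new code) pairs."""
    return [remap[code] for code in stream]


pdf_a = {1: "clefs.G", 2: "noteheads.s2"}
pdf_b = {1: "noteheads.s2", 2: "clefs.G", 3: "rests.2"}
merged, remaps = merge_subsets([pdf_a, pdf_b])

print(merged)
print(recode([1, 2, 2], remaps[0]))
print(recode([1, 3], remaps[1]))
```

Even in this toy form, every included PDF needs its content rewritten, which is the part that makes the approach expensive.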

Identical fonts could be expected to be merged by default and without
problems (my interpretation of the pdftex documentation is different).
But without --bigpdfs you do not have identical fonts in the lilypond
pdfs, even if you instruct ghostscript not to subset fonts. It ignores that
order because it cannot obey.

cu,
 Knut






Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

On 2015/01/15 15:06:06, Knut_Petersen_t-online.de wrote:

On 15.01.2015 14:47, d...@gnu.org wrote:
 On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:
 On 15.01.2015 13:15, d...@gnu.org wrote:
  On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:

  Ghostscript does the font merging.

  Any idea whether something could be done to make PDFTeX do the font
  merging instead when including all the PDF files?

 No, not really. That would require a lot of work.

 Judging from the documentation, that should be the default (namely, when
 \pdfinclusioncopyfonts is at its default value of 0 and we are talking
 about Type1 fonts).  Cf
 URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726
 for example.  So the question is what is keeping this from happening.
 Maybe we need to call ps2pdf (when converting the fragments for
 inclusion) with some particular options to keep the fonts in a mergeable
 state?


Current lilypond uses glyphshow to draw glyphs in postscript,
encoding vectors are not present.



As there is no direct equivalent of the postscript glyphshow operator
in the pdf language, ghostscript constructs a _new_ font with an
encoding vector, including only the subset of glyphs used in the document.
Ghostscript then uses the glyphs in that font indexed by the new encoding
vector. The name of that new font is derived from the original.



It's pretty unlikely that two lilypond scores use exactly the same
subset of glyphs in exactly the same order, so it's pretty likely that
the two new fonts are not identical. But they share (aside from the
prefix) their name.



pdflatex would need to inspect all glyphs in those fonts, detect which are
identical and construct up to three fonts (remember the size limit of
encoding vectors and the number of emmentaler glyphs) with encodings
from the fonts found in the lilypond pdfs. It then would need to recode
the data stream and everything would be fine.



Identical fonts could be expected to be merged by default and without
problems (my interpretation of the pdftex documentation is different).


PDFTeX apparently does merge subsetted fonts, so I don't think we should
need to include the complete fonts in order to get font merging.  But we
probably should work with coding vectors so that we can use identical
font names, just sparsely populated.  I would then expect PDFTeX to
merge the sparsely populated fonts of identical name unless
\pdfinclusioncopyfonts is set.

Redundant coding vectors should have much less of an impact on the
intermediate file size than the full Emmentaler fonts.  Right?

https://codereview.appspot.com/194090043/



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

On 2015/01/15 07:08:33, lemzwerg wrote:

David's concerns are very specific to the Lilypond documentation, not covering
the general case.  Many programs simply can't process PS output at all, so the
suggestion to collect PS data that gets reduced later on is not applicable.

The only valid alternative is to make Lilypond natively produce PDF, but this is
a long-term solution.  And it seems to me that even then we will need a
'--bigpdf' option (but implemented in a different way) to allow optimal PDF
merging later on by post-processing tools.

For this reason I vote to include Knut's work right now, since it quickly solves
the given issue in a reliable way, with the only ugliness of having very large
intermediate files.


Reliable?  If I remember correctly, the tool used for combining the
fonts (pdfsizeopt.py) fails on the PDF files from PDFLaTeX, so there
must be an additional iteration through Ghostscript.  This additional
iteration will reencode and resample included bitmap graphics at some
command line option dependent resolution, correct?  What happens with
hyperlinks?  Has anybody checked those?

At any rate, I've taken a look at the description of pdfsizeopt, and it
would appear that it is optimized for working on PDF files created by
PDFTeX.  That would imply that it would be
a) really a good idea to get along without using Ghostscript as an
intermediary.  That seems like it would require fixing pdfsizeopt.  Its
project page contains a link "Doesn't pdfsizeopt work with your PDF?
Report the issue."  Now there is a remarkable dearth of names on the web
pages, but from other projects and content under this account and the
account's name I should be surprised if this project is not owned by
Szabó Péter.  And I should be surprised if he does not manage to fix the
problem when reported or suggest a full quality workaround.
b) in a similar vein, I'd ask Péter for suggestions about the best
course for having the font compaction work without blowing up the
intermediate files all too much.  Of course I am speculating on him just
making pdfsizeopt do all the work, but even if not, he'll be likely to
come up with a good plan.

The downside to the choice of using pdfsizeopt here is that it does not
currently seem to be easily available preinstalled for Ubuntu (and it
has a number of dependencies making preinstallation desirable).  Maybe
that will change in future.

With regard to a PDF file example for pdfsizeopt, maybe reporting the
Notation manual is a bit unwieldy.  The Learning manual should likely
have the same kind of problems, right?

https://codereview.appspot.com/194090043/


Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 10:45, d...@gnu.org wrote:


Reliable?  If I remember correctly, the tool used for combining the
fonts (pdfsizeopt.py)


Ghostscript does the font merging.


fails on the PDF files from PDFLaTeX, so there
must be an additional iteration through GhostScript.  This additional
iteration will reencode and resample included bitmap graphics at some
command line option dependent resolution, correct? 


pdfsizeopt.py does some optimization of the remaining fonts, it tries to
find better compression for images, etc.


What happens with hyperlinks?  Has anybody checked those?


BTW: All this has been documented in the commit message of the git-formatted
patch sent to lilypond-devel:

Internal hyperlinks are fully preserved with current ghostscript git master.

External hyperlinks (GoToR) _to_ a file processed this way are broken.
Fixing this would require major changes to ghostscript.

External hyperlinks _from_ a file processed this way to other pdfs are
preserved if the reader program isn't broken (acroread is not broken
in this respect, evince is).

For more details see Ghostscript bug #695747 
http://bugs.ghostscript.com/show_bug.cgi?id=695747#c22


At any rate, I've taken a look at the description of pdfsizeopt, and it
would appear that it is optimized for working on PDF files created by
PDFTeX.  That would imply that it would be
a) really a good idea to get along without using Ghostscript as an
intermediary.  That seems like it would require fixing pdfsizeopt.  Its
project page contains a link "Doesn't pdfsizeopt work with your PDF?
Report the issue."  Now there is a remarkable dearth of names on the web
pages, but from other projects and content under this account and the
account's name I should be surprised if this project is not owned by
Szabó Péter.  And I should be surprised if he does not manage to fix the
problem when reported or suggest a full quality workaround.
b) in a similar vein, I'd ask Péter for suggestions about the best
course for having the font compaction work without blowing up the
intermediate files all too much.  Of course I am speculating on him just
making pdfsizeopt do all the work, but even if not, he'll be likely to
come up with a good plan.


The pdfsizeopt.py problems we run into (at least issues 2 and 18) have been
reported since 2009, and a fix is still missing.  No, I won't rely on Peter
to enhance and fix his tool fast.

ghostscript is the tool that does the main work, pdfsizeopt.py is an option.


cu,
 Knut


Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 12:49, lemzw...@googlemail.com wrote:


The hyperlink issue is not related to the --bigpdf option (since it is a
bug in ghostscript), so I don't think that this is a showstopper.


Well, it means that the code currently cannot be used to build lilypond's
own documentation.

cu,
 Knut



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:

On 15.01.2015 10:45, d...@gnu.org wrote:

 Reliable?  If I remember correctly, the tool used for combining the
 fonts (pdfsizeopt.py)



Ghostscript does the font merging.


Ok.


BTW: All this has been documented in the commit message of the git-formatted
patch sent to lilypond-devel:

Internal hyperlinks are fully preserved with current ghostscript git master.


External hyperlinks (GoToR) _to_ a file processed this way are broken.
Fixing this would require major changes to ghostscript.



External hyperlinks _from_ a file processed this way to other pdfs are
preserved if the reader program isn't broken (acroread is not broken
in this respect, evince is).



For more details see Ghostscript bug #695747
http://bugs.ghostscript.com/show_bug.cgi?id=695747#c22


If external hyperlinks from our documentation PDF to other files stop
working, we cannot make this the default way of building our
documentation.  The version of Ghostscript that is pertinent here is not
the development master but mainly the version used in GUB, and
secondarily the version of Ghostscript we expect to be current in
GNU/Linux or other distributions that build LilyPond natively.  There is
some settling-down time as they are unlikely to use any 2.19 version
(it's a development version, after all), but basically there needs to be
a reasonable chance of the Ghostscript versions being fine by the time
we release version 2.20.

https://codereview.appspot.com/194090043/



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 13:15, d...@gnu.org wrote:

On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:


Ghostscript does the font merging.


Any idea whether something could be done to make PDFTeX do the font
merging instead when including all the PDF files?


No, not really. That would require a lot of work.

cu,
 Knut


https://codereview.appspot.com/194090043/






Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread lemzwerg

Well, we get a large size reduction even if we don't use pdfsizeopt!
Using this program is an extra bonus but not mandatory.  And you are
right, I hope that Péter fixes the reported issues, provided someone is
going to add them to the bug tracker (which hasn't happened yet, looking
at https://code.google.com/p/pdfsizeopt/issues/list).

The hyperlink issue is not related to the --bigpdf option (since it is a
bug in ghostscript), so I don't think that this is a showstopper.

Regarding your b) issue: I fully agree.  Contacting Péter might be very
helpful.  Nevertheless, this takes time.  Given that it should be
straightforward to make --bigpdf a no-op in case it is no longer useful,
I still vote for incorporating the patch.


https://codereview.appspot.com/194090043/


Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:

On 15.01.2015 10:45, d...@gnu.org wrote:

 Reliable?  If I remember correctly, the tool used for combining the
 fonts (pdfsizeopt.py)



Ghostscript does the font merging.


Any idea whether something could be done to make PDFTeX do the font
merging instead when including all the PDF files?

https://codereview.appspot.com/194090043/



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread James

On 15/01/15 13:18, Knut Petersen wrote:

On 15.01.2015 13:15, d...@gnu.org wrote:

On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:


Ghostscript does the font merging.


Any idea whether something could be done to make PDFTeX do the font
merging instead when including all the PDF files?


No, not really. That would require a lot of work.

cu,
 Knut


https://codereview.appspot.com/194090043/




So do I go ahead and continue helping Knut with this patch?

James



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread dak

On 2015/01/15 13:18:46, Knut_Petersen_t-online.de wrote:

On 15.01.2015 13:15, d...@gnu.org wrote:
 On 2015/01/15 12:01:55, Knut_Petersen_t-online.de wrote:

 Ghostscript does the font merging.

 Any idea whether something could be done to make PDFTeX do the font
 merging instead when including all the PDF files?



No, not really. That would require a lot of work.


Judging from the documentation, that should be the default (namely, when
\pdfinclusioncopyfonts is at its default value of 0 and we are talking
about Type1 fonts).  Cf
URL:http://tex.stackexchange.com/questions/136574/merging-duplicate-embedded-fonts#138726
for example.  So the question is what is keeping this from happening.
Maybe we need to call ps2pdf (when converting the fragments for
inclusion) with some particular options to keep the fonts in a mergeable
state?

https://codereview.appspot.com/194090043/



Re: Reduce size of PDF files when inc. in *TeX docs (issue 194090043 by pkx1...@gmail.com)

2015-01-15 Thread Knut Petersen

On 15.01.2015 13:12, d...@gnu.org wrote:


If external hyperlinks from our documentation PDF to other files stop
working, we cannot make this the default way of building our
documentation. 


Indeed. Building lilypond with --bigpdfs enabled by default is a good
test for that code, nothing more, nothing less. It passes that test.

If you use pdftex, xetex, luatex or other TeX dialects that are able to
directly include pdfs produced by lilypond and if you use that feature
a lot, the --bigpdfs code will help you to reduce the file size of your
final document significantly.

Typical use would be a dissertation, a song book, etc.

cu,
 Knut
