[bug #55107] PDFPIC: .psbb: support extraction of MediaBox from pdf files

Keith Marshall Tue, 12 Oct 2021 11:39:29 -0700

Follow-up Comment #1, bug #55107 (project groff):

In this mailing-list message
<https://lists.nongnu.org/archive/html/groff/2021-09/msg00064.html>[1], Deri
<https://savannah.gnu.org/users/deri> offered two PDF files, namely
Picture.pdf
<https://lists.nongnu.org/archive/html/groff/2021-09/pdf7tyGN4NLTE.pdf>[2] and
croptest.pdf
<https://lists.nongnu.org/archive/html/groff/2021-09/pdfBjudbNbwI2.pdf>[3],
from which the original prototype code
<https://osdn.net/users/keith/pf/groff-psbb/scm/tree/e25e11c6770a3d7a2e98cbcfce66dbffd7d8b5a0/>[4],
as referenced on this ticket, is unable to extract any valid MediaBox
specification.


In this follow-up message
<https://lists.nongnu.org/archive/html/groff/2021-10/msg00043.html>[5], I
explained that the failure to extract the MediaBox from Picture.pdf was caused
by an omission from the groff-psbb lexer's pattern matching rules for the PDF
dictionary scanning state, resulting in mishandling of nested dictionaries;
this is readily resolved by the attached patch (file #52093)[6].
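
To illustrate why nesting matters here, the following is a minimal Python sketch; it is not the groff-psbb lexer, and the name top_level_mediabox is hypothetical. It assumes a simplified input, ignoring PDF strings, comments, and streams. A scanner that leaves the dictionary state at the first ">>" never reaches a /MediaBox which follows a nested dictionary, whereas counting "<<" and ">>" keeps the scan inside the outer dictionary until it really ends.

```python
import re

# Tokens of interest: dictionary delimiters, and a /MediaBox key followed by
# its numeric array.  Everything else in the dictionary is skipped.
TOKEN = re.compile(rb"<<|>>|/MediaBox\s*\[\s*([-+\d.\s]+?)\s*\]")

def top_level_mediabox(dict_bytes):
    """Return the /MediaBox at depth 1 of the first dictionary, or None."""
    depth = 0
    for m in TOKEN.finditer(dict_bytes):
        tok = m.group(0)
        if tok == b"<<":
            depth += 1              # entering a (possibly nested) dictionary
        elif tok == b">>":
            depth -= 1              # leaving one level of nesting
            if depth == 0:
                return None         # outer dictionary ended without a match
        elif depth == 1:
            # /MediaBox belonging to the outer dictionary itself
            return [float(x) for x in m.group(1).split()]
    return None

page = (b"<< /Type /Page /Resources << /Font << /F1 5 0 R >> >> "
        b"/MediaBox [0 0 612 792] >>")
print(top_level_mediabox(page))
```

With the depth counter, the /MediaBox which follows the nested /Resources dictionary is still found; a /MediaBox which appears only inside a nested dictionary is deliberately ignored.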

OTOH, croptest.pdf uses newer PDF (post PDF-1.5) features, and lacks both a
trailer dictionary and a free-standing cross-reference table, both of which
are _required_ by the current groff-psbb prototype implementation; supporting
these newer PDF features will require substantial additions to the current
implementation.
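
As a rough way to tell the two file layouts apart, the sketch below (Python again; the function name xref_kind is hypothetical, and this is not part of groff-psbb) inspects what the final "startxref" offset points at. In a classic file that offset lands on the "xref" keyword, which is followed by a "trailer" dictionary; in a file using cross-reference streams, presumably the case for croptest.pdf, it lands on an indirect object ("N G obj") whose stream dictionary carries the trailer entries instead. The sketch assumes the whole file is in memory, and does not handle hybrid-reference files or damaged offsets.

```python
import re

def xref_kind(pdf_bytes):
    """Classify the cross-reference section that the last 'startxref' names."""
    m = None
    for m in re.finditer(rb"startxref\s+(\d+)", pdf_bytes):
        pass                        # keep only the last occurrence
    if m is None:
        return "missing"
    offset = int(m.group(1))
    head = pdf_bytes[offset:offset + 32]
    if head.startswith(b"xref"):
        return "table"              # classic table, followed by 'trailer'
    if re.match(rb"\d+\s+\d+\s+obj\b", head):
        return "stream"             # cross-reference stream object
    return "unknown"

header = b"%PDF-1.4\n"
where = str(len(header)).encode()   # offset of whatever follows the header
classic = (header + b"xref\n0 1\n0000000000 65535 f \n"
           b"trailer\n<< /Size 1 >>\nstartxref\n" + where + b"\n%%EOF\n")
modern = (header + b"7 0 obj\n<< /Type /XRef >>\nstream\nendstream\nendobj\n"
          b"startxref\n" + where + b"\n%%EOF\n")
print(xref_kind(classic), xref_kind(modern))
```

A file reported as "stream" has no "trailer" keyword to scan for, so a reader that depends on one has nowhere to start.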

[1]: https://lists.nongnu.org/archive/html/groff/2021-09/msg00064.html
[2]: https://lists.nongnu.org/archive/html/groff/2021-09/pdf7tyGN4NLTE.pdf
[3]: https://lists.nongnu.org/archive/html/groff/2021-09/pdfBjudbNbwI2.pdf
[4]: https://osdn.net/users/keith/pf/groff-psbb/scm/tree/e25e11c6770a3d7a2e98cbcfce66dbffd7d8b5a0/
[5]: https://lists.nongnu.org/archive/html/groff/2021-10/msg00043.html
[6]: file #52093 (attached patch)

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?55107>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
