Hi,

Now that the accessibility code has been re-architected and no longer has feature limitations, we would like to work on reducing the size of accessible PDF files.

The structure tree of a document is stored as PDF objects. Unlike the content presentation (the drawing commands stored in the page streams), it is not compressed and quickly causes the file size to blow up. On a typical 20-page document the structure tree can account for up to 70% of the file size.

We would like to implement PDF object streams as defined in the PDF 1.5 Reference. In short, the structure tree would be stored inside a stream so that it can be compressed in the same way as the page content. This has implications for how those objects are referred to, which can no longer be done using the classic cross-reference table. For that we need to implement the new cross-reference stream defined in the 1.5 spec.

The compression of the structure tree would be optional and enabled only when 1.5 has been selected as the minimum PDF version.

As with the accessibility re-architecting, we will work on a branch and launch a vote once we feel it’s ready to be merged to Trunk.

Any comments welcome.

Thanks,
Vincent
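
P.S. For anyone who has not looked at object streams yet, here is a rough, self-contained Java sketch of the idea. It is only an illustration and not the planned implementation: the class name ObjStmSketch, its methods and the sample structure elements are all made up for this mail. As I read the 1.5 Reference, the stream body starts with N pairs of integers (object number, byte offset relative to the first object), followed by the objects themselves, and the whole body is then run through the Flate filter. Each compressed object is later located through an entry in the cross-reference stream that gives the number of the containing object stream and the object's index within it; that part is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.Deflater;

/** Hypothetical illustration only; none of these names exist in the code base. */
final class ObjStmSketch {

    /**
     * Builds the uncompressed stream body: "objNum offset" pairs first,
     * then the objects themselves. firstOut[0] receives the value for /First.
     */
    static byte[] buildBody(Map<Integer, String> objects, int[] firstOut) {
        StringBuilder header = new StringBuilder();
        StringBuilder data = new StringBuilder();
        for (Map.Entry<Integer, String> e : objects.entrySet()) {
            // Offsets are relative to the first object, not to the stream start.
            header.append(e.getKey()).append(' ').append(data.length()).append(' ');
            data.append(e.getValue()).append('\n');
        }
        firstOut[0] = header.length();
        return header.append(data).toString().getBytes(StandardCharsets.US_ASCII);
    }

    /** Compresses with zlib/deflate, the format expected by /FlateDecode. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            bos.write(buf, 0, n);
        }
        deflater.end();
        return bos.toByteArray();
    }

    /** Serializes one object stream holding all the given (non-stream) objects. */
    static byte[] writeObjectStream(int objNum, Map<Integer, String> objects) {
        int[] first = new int[1];
        byte[] compressed = deflate(buildBody(objects, first));
        byte[] head = (objNum + " 0 obj\n<< /Type /ObjStm /N " + objects.size()
                + " /First " + first[0] + " /Filter /FlateDecode /Length "
                + compressed.length + " >>\nstream\n").getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "\nendstream\nendobj\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        bos.write(head, 0, head.length);
        bos.write(compressed, 0, compressed.length);
        bos.write(tail, 0, tail.length);
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        // 200 made-up structure elements, numbered from 10.
        Map<Integer, String> elems = new LinkedHashMap<>();
        int plainSize = 0;
        for (int i = 0; i < 200; i++) {
            int num = 10 + i;
            String dict = "<< /Type /StructElem /S /P /P 5 0 R /Pg 6 0 R /K " + i + " >>";
            elems.put(num, dict);
            plainSize += (num + " 0 obj\n" + dict + "\nendobj\n").length();
        }
        byte[] objStm = writeObjectStream(210, elems);
        System.out.println("As individual objects: " + plainSize + " bytes");
        System.out.println("As one object stream:  " + objStm.length + " bytes");
    }
}
```

Running main prints the size of the same 200 dictionaries written as individual objects and as a single compressed object stream. Real structure trees are less repetitive than this toy data, so the actual gain would have to be measured on real documents.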