On Thursday, 17 September 2015, at 11:49:27, Jason Crain wrote:
> On 2015-09-17 08:57, Leonard Rosenthol wrote:
> > While it is unclear in ISO 32000-1 whether such a PDF is invalid, we
> > made it clear in 32000-2 that you can only have one copy of each page
> > in the Pages tree. So personally, I wouldn’t waste much time on this
> > particular file.
> >
> > Leonard
>
> OK, if it's not allowed by the spec, I have no real objection to the
> object count check.

Pushed.

Cheers,
  Albert

> > On 9/17/15, 1:04 AM, "poppler on behalf of Jason Crain"
> > <[email protected] on behalf of
> > [email protected]> wrote:
> >> On Wed, Sep 16, 2015 at 09:05:58PM -0400, William Bader wrote:
> >>> > > I don't know of a good way to validate the page count. Even
> >>> > > going through the page tree might be hard to do right without
> >>> > > leading to an infinite loop, in addition to being slow.
> >>> >
> >>> > Catalog::cachePageTree goes over the tree, but i agree doing that
> >>> > to calculate the num of pages can be meh.
> >>>
> >>> If the number of pages is huge, the PDF might be intentionally
> >>> corrupted to provoke a bug in a particular PDF viewer, and other
> >>> data structures could be subtly corrupted as well. Any scan would
> >>> have to proceed very cautiously.
> >>>
> >>> If there is a minimum number of objects required for a page, and if
> >>> the total number of objects is easy to find, could poppler
> >>> immediately reject files with (total num objects) / (min objects per
> >>> page) < page count?
> >>
> >> The document at
> >> https://drive.google.com/open?id=0ByTyiZeyQ4p9cTVBUllNRmI3bmM is what
> >> I'm thinking of. It has 5 objects and a single page that is listed in
> >> the /Kids array 10 times. Duplicating the page just means adding it
> >> to the array again and incrementing /Count. If we want this document
> >> to work then there's really no minimum number of objects required for
> >> a page. Otherwise, each page would require at least a /Page object.
> >>
> >> FWIW Adobe Reader shows an error on the document after the first
> >> duplicated page. Other viewers show it just fine.

_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
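
For readers following the thread, the object count check proposed above can be sketched roughly as follows. This is only an illustration of the inequality William Bader describes, not the change that was pushed to poppler: the function name `pageCountIsPlausible` and its parameters are hypothetical, and the default of one object per page assumes Jason Crain's observation that each distinct page needs at least its own /Page object.

```cpp
#include <cstdint>

// Hypothetical helper, not part of poppler's API.
//   declaredPageCount: the /Count value read from the root of the page tree.
//   totalObjects:      the number of objects in the file.
//   minObjectsPerPage: assumed lower bound on objects per page; 1 means
//                      every page needs at least its own /Page object.
// Returns false when the file cannot possibly contain that many pages.
bool pageCountIsPlausible(std::int64_t declaredPageCount,
                          std::int64_t totalObjects,
                          std::int64_t minObjectsPerPage = 1)
{
    // Negative or nonsensical inputs are treated as implausible.
    if (declaredPageCount < 0 || totalObjects < 0 || minObjectsPerPage < 1) {
        return false;
    }
    // Integer division gives the largest number of pages the objects
    // in the file could account for.
    return declaredPageCount <= totalObjects / minObjectsPerPage;
}
```

With these assumptions, the sample document from the thread (5 objects, /Count of 10) is rejected, while a file with 5 objects and a /Count of 1 passes. The check only compares two numbers, so it avoids walking a possibly corrupted page tree.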
