adriano,

adriano wrote
> is there a way to retrieve/list all indirect objects found in a PDF,
> regardless of the information contained in the xref table? I mean, without
> relying on the Xref Table to be correct.
>
> The final goal is to detect duplicate indirect objects (that is, objects
> with non-unique IDs) that would invalidate the compliance of a document
> structure to the PDF specs.

Unfortunately, if you ignore the cross reference tables or streams, the interpretation of where objects start in a PDF can be ambiguous. For example, see the problems described in "PDF, A broken Spec!" <http://feliam.wordpress.com/2010/08/14/pdf-a-broken-spec/>

There the author describes that, without cross references, the extent of a stream may be unclear, so that some end fragment of it can be read either as a PDF object in its own right or as just a part of the stream. (The author goes a bit too far in his conclusions, though, as the cross references are the prime source on where each object starts. Leonard also added a remark to that effect to the article.) The possible existence of object streams may even aggravate this problem somewhat.

That being said, you can of course try to parse a PDF file from front to back without taking cross references into account; this actually is what 'repairing' a broken PDF mostly is about. But deriving any assertion about the specification conformance of a PDF from this technique is dubious at best.

Regards,
Michael

--
View this message in context: http://itext-general.2136553.n4.nabble.com/Duplicate-indirect-objects-tp4657759p4657770.html
Sent from the iText - General mailing list archive at Nabble.com.
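
The front-to-back scan mentioned above can be sketched roughly as follows. This is a minimal Python illustration, not iText code: the names OBJ_HEADER, scan_indirect_objects, duplicate_candidates and SAMPLE are made up for this example, and the regular expression is only a heuristic for "number, generation, obj" headers. It deliberately shows the ambiguity: the sample stream contains bytes that look like an object header, so the scan reports object 1 0 twice even though the file defines it only once. Such a scan also cannot see objects stored inside compressed object streams, and a file with incremental updates may legitimately repeat an object number, so the result is at most a list of candidates to inspect.

```python
import re
from collections import defaultdict

# Heuristic pattern for "<object number> <generation> obj" at a token boundary.
# Assumption for this sketch only: it knows nothing about streams or strings,
# so bytes inside stream data can match as well.
OBJ_HEADER = re.compile(
    rb"(?<![0-9A-Za-z])(\d+)[ \t\r\n\x0c\x00]+(\d+)[ \t\r\n\x0c\x00]+obj(?![0-9A-Za-z])"
)


def scan_indirect_objects(data: bytes) -> dict:
    """Map (object number, generation) to the byte offsets of candidate headers."""
    found = defaultdict(list)
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        found[key].append(match.start())
    return dict(found)


def duplicate_candidates(data: bytes) -> dict:
    """Return only the IDs whose header pattern occurs more than once."""
    return {
        key: offsets
        for key, offsets in scan_indirect_objects(data).items()
        if len(offsets) > 1
    }


# Tiny hand-written fragment (not a complete, valid PDF). The stream of
# object 2 contains the bytes "1 0 obj", which the scan cannot tell apart
# from a real object header.
SAMPLE = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"2 0 obj\n<< /Length 8 >>\nstream\n1 0 obj \nendstream\nendobj\n"
    b"%%EOF\n"
)

if __name__ == "__main__":
    for (num, gen), offsets in duplicate_candidates(SAMPLE).items():
        print(f"object {num} {gen} appears at byte offsets {offsets}")
```

To try this on a real file, read it in binary mode (for example `open(path, "rb").read()`, where `path` is whatever file you want to inspect) and pass the bytes to duplicate_candidates. Every reported offset still has to be checked by hand or against the cross references before drawing any conclusion.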