On 9/08/2012 11:09, Raja Narayanan wrote:
> This is a correct method or wrong to finish my tasks.
You're mixing up several different concepts, and that is why your approach is wrong.

When you convert tagged PDF to XML, you convert content that is inside a content stream to XML. However, you are assuming that fields such as check boxes and text fields are part of the content stream. That assumption is wrong!

Fields are referred to from the Catalog (or root) object of the PDF. They are listed in the /Fields array of the /AcroForm entry. They are abstract: a field as such doesn't have any 'shape' or 'appearance' unless it is visualized using widget annotations. The objects you see on the page are not fields; they are widget annotations representing those fields.

Widget annotations are a special type of annotation, and as you should know, annotations aren't part of the content stream. You can't find them in the /Contents of a page; you have to look for them in the /Annots entry of the page dictionary. Once you have the annotation dictionary, you can find the coordinates of each widget (in the form of a /Rect).

I'm not sure how you're going to match the XML obtained from the content stream (an XML file that doesn't contain any coordinates) with the coordinates of the rectangles you obtain from the widget annotations. But that's probably what your employer pays you to find out. I'm very interested to hear how you'll achieve that.
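To make the lookup path concrete (page dictionary, then /Annots, then widget, then /Rect), here is a minimal sketch in plain Java. It does not use iText and does not read a real PDF: the classes `Field`, `Annotation` and `Page`, the sample coordinates and the sample content stream are all hypothetical stand-ins invented for illustration. The only point it demonstrates is that the rectangles come from the annotations of a page and never from its content stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WidgetLookupSketch {

    /** Hypothetical stand-in for an abstract field from /AcroForm /Fields. */
    static final class Field {
        final String name; // the field name (/T)

        Field(String name) {
            this.name = name;
        }
    }

    /** Hypothetical stand-in for one entry of a page's /Annots array. */
    static final class Annotation {
        final String subtype; // /Subtype, e.g. "Widget" or "Link"
        final Field field;    // field a widget represents; null for other annotations
        final float[] rect;   // /Rect as [llx, lly, urx, ury]

        Annotation(String subtype, Field field, float... rect) {
            this.subtype = subtype;
            this.field = field;
            this.rect = rect;
        }
    }

    /** Hypothetical stand-in for a page dictionary. */
    static final class Page {
        final String contents;         // /Contents: the content stream
        final List<Annotation> annots; // /Annots: kept outside the content stream

        Page(String contents, List<Annotation> annots) {
            this.contents = contents;
            this.annots = annots;
        }
    }

    /** Collects "fieldName page N [rect]" for every widget annotation. */
    static List<String> widgetRects(List<Page> pages) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            // Only /Annots is inspected; page.contents is never touched.
            for (Annotation a : pages.get(i).annots) {
                if (!"Widget".equals(a.subtype)) {
                    continue; // links, comments, ... are not field widgets
                }
                out.add(a.field.name + " page " + (i + 1) + " " + Arrays.toString(a.rect));
            }
        }
        return out;
    }

    /** One page with a text field widget and an unrelated link annotation. */
    static List<Page> samplePages() {
        Field name = new Field("name");
        List<Annotation> annots = new ArrayList<>();
        annots.add(new Annotation("Widget", name, 36f, 780f, 236f, 806f));
        annots.add(new Annotation("Link", null, 36f, 700f, 136f, 716f));
        List<Page> pages = new ArrayList<>();
        pages.add(new Page("BT /F1 12 Tf (Name:) Tj ET", annots));
        return pages;
    }

    public static void main(String[] args) {
        for (String line : widgetRects(samplePages())) {
            System.out.println(line);
        }
    }
}
```

With the sample data, the only line printed is the one for the widget of the field "name"; the link annotation is skipped and the text in the content stream plays no role. With a real document you would not build this model yourself. In iText 5, `AcroFields.getFieldPositions(name)` should give you the page number and rectangle of each widget of a field, but check that against the version you are using.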