On 9/08/2012 11:09, Raja Narayanan wrote:
> his is a correct method or wrong to finish my tasks.

You're mixing up several different concepts, which is why your approach 
is wrong.

When you convert tagged PDF to XML, you convert content that is inside a 
content stream to XML. However, you are assuming that fields such as check 
boxes and text fields are part of the content stream. That assumption 
is wrong!

Fields are referred to from the Catalog (or root) object of the PDF. 
They are listed in the /Fields array of the /AcroForm entry. They are 
abstract. A field as such doesn't have any 'shape' or 'appearance', 
unless the field is visualized using widget annotations.

The objects you see on the page are not fields; they are widget 
annotations representing those fields. Widget annotations are a special 
type of annotation, and as you should know, annotations aren't part of 
the content stream. You can't find them in the /Contents of a page; you 
have to look for them in the /Annots entry of the page dictionary.

Once you have the annotation dictionary, you can find the coordinates of 
each widget, and therefore of the field it represents (in the form of a 
/Rect entry).
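
To make that lookup path concrete, here is a minimal sketch in plain Java. 
It is a toy model, not iText code: PDF dictionaries are represented as 
ordinary Maps, and the names WidgetLookupSketch, dict, samplePage and 
widgetRects are all hypothetical. It also assumes the common case in which 
the field dictionary and its single widget annotation are merged into one 
dictionary, so that /T (the field name) and /Rect sit side by side.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: PDF dictionaries are plain Maps keyed by name.
// Nothing here is iText API; all names are made up for illustration.
class WidgetLookupSketch {

    // Builds a dictionary from alternating key/value arguments.
    static Map<String, Object> dict(Object... kv) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            d.put((String) kv[i], kv[i + 1]);
        }
        return d;
    }

    // Returns "field name -> /Rect" for every widget annotation listed
    // in the page's /Annots entry. /Contents is never consulted.
    static Map<String, double[]> widgetRects(Map<String, Object> page) {
        Map<String, double[]> rects = new LinkedHashMap<>();
        Object annots = page.get("Annots");
        if (!(annots instanceof List)) {
            return rects; // the page has no annotations at all
        }
        for (Object o : (List<?>) annots) {
            Map<?, ?> annot = (Map<?, ?>) o;
            // Skip annotations that are not widgets (links, comments, ...).
            if ("Widget".equals(annot.get("Subtype"))) {
                rects.put((String) annot.get("T"), (double[]) annot.get("Rect"));
            }
        }
        return rects;
    }

    // A hypothetical page: a content stream that only draws a label,
    // plus two widgets and one link annotation in /Annots.
    static Map<String, Object> samplePage() {
        Map<String, Object> name = dict("FT", "Tx", "T", "name",
                "Subtype", "Widget", "Rect", new double[] {36, 700, 236, 720});
        Map<String, Object> agree = dict("FT", "Btn", "T", "agree",
                "Subtype", "Widget", "Rect", new double[] {36, 660, 52, 676});
        Map<String, Object> link = dict(
                "Subtype", "Link", "Rect", new double[] {36, 600, 136, 612});
        return dict("Contents", "BT /F1 12 Tf (Name:) Tj ET",
                "Annots", Arrays.asList(name, link, agree));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, double[]> e : widgetRects(samplePage()).entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

Note that the sample content stream holds only the text of the label: 
nothing in it says where the "name" widget sits, which is exactly why 
XML derived from the content stream carries no field coordinates.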

I'm not sure how you're going to match the XML obtained from the content 
stream (an XML file that doesn't contain any coordinates) with the 
coordinates of the rectangles you obtain from the widget annotations.

But that's probably what your employer pays you to find out. I'm very 
interested to hear how you'll achieve that.
