On May 11, 2007, at 3:21 PM, dayvidpow wrote:

> Is it REALLY that difficult to add Tagging to an existing PDF?

        Was my email not clear enough about the description of what tagging  
is?   Did it not give you a clear indication of what is required?  If  
there was something in my description that wasn't clear to you,  
please let me know what it is and I will be happy to go into more  
detail.


> Let me explain again what I am attempting to do - I have an existing
> image-based PDF document into which I insert another page, hidden OCR
> text and bookmarks using the PdfStamper class, however I also want the
> final output to be a "Tagged" PDF file.
>

        Since you have all the text that is being inserted, that makes it  
easier - though still not trivial.

        First, you will need to analyze the text in order to determine the  
semantic nature of each piece of text.  For example, where does each  
paragraph start & end?   Is a given string a Header or Footer?  Is  
any text contained inside of a table - and if so, what part of the  
table?  Once you have a semantic map for your data, you are ready to  
proceed to step two.   (This part is, of course, outside iText)
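
        As a rough illustration of what such a semantic map could look like, here is a minimal Java sketch. Everything in it is an assumption made for the example: the class and method names are hypothetical (not iText), and the heuristic (blank lines separate blocks; a short single-line block without a trailing period is a heading) is far cruder than what real OCR output would need. It only shows the shape of the result: an ordered list of blocks, each with a role and its text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; none of these names come from iText.
final class SemanticMapSketch {

    // Two standard PDF structure types used as roles: H1 (heading), P (paragraph).
    enum Role { H1, P }

    // One semantic block: its role plus the text it covers.
    static final class Block {
        final Role role;
        final String text;
        Block(Role role, String text) { this.role = role; this.text = text; }
    }

    // Assumed heuristic: blank lines separate blocks; a short single-line block
    // with no trailing period is treated as a heading, anything else as a paragraph.
    static List<Block> analyze(String ocrText) {
        List<Block> blocks = new ArrayList<>();
        for (String chunk : ocrText.split("\\n\\s*\\n")) {
            String trimmed = chunk.trim();
            if (trimmed.isEmpty()) continue;
            // Collapse internal line breaks and runs of spaces into single spaces.
            String text = trimmed.replaceAll("\\s+", " ");
            boolean singleLine = !trimmed.contains("\n");
            boolean heading = singleLine && text.length() <= 60 && !text.endsWith(".");
            blocks.add(new Block(heading ? Role.H1 : Role.P, text));
        }
        return blocks;
    }

    public static void main(String[] args) {
        String ocr = "Annual Report\n\nRevenue grew this year.\nCosts were flat.\n\n"
                   + "Outlook\n\nWe expect growth.";
        for (Block b : analyze(ocr)) {
            System.out.println(b.role + ": " + b.text);
        }
    }
}
```

        Tables, headers and footers would need additional roles and, most likely, layout information (positions, font sizes) rather than text alone.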

        Second, as you add the text to the PDF, you will need to do the
following for each "semantic block":
                * Create an associated "Structure Element" inside of the
                  "Structure Tree".
                * Surround each set of operators with appropriate "marked
                  content" operators that point to the Structure Elements
                  you created.
        I believe that iText APIs exist for some, but perhaps not all of the  
above - in which case, you may need to extend iText itself.
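
        To make the two per-block steps concrete, here is a library-independent Java sketch that emits the raw PDF syntax involved. The class and method names are hypothetical and are not iText API. It is deliberately incomplete: the parent (/P) and page (/Pg) references on each structure element, and the structure tree's ParentTree, are omitted. The point it illustrates is that the link between page content and a structure element is the marked-content ID (MCID) that appears in both places.

```java
import java.util.ArrayList;
import java.util.List;

// Library-independent sketch; class and method names are hypothetical, not iText API.
final class TaggingSketch {

    // Minimal stand-in for a structure element: a role (e.g. "P", "H1")
    // and the marked-content ID (MCID) of the page content it owns.
    static final class StructElem {
        final String role;
        final int mcid;
        StructElem(String role, int mcid) { this.role = role; this.mcid = mcid; }

        // PDF syntax for the element; parent (/P) and page (/Pg) references omitted.
        String toPdf() {
            return "<< /Type /StructElem /S /" + role + " /K " + mcid + " >>";
        }
    }

    // Structure elements created so far for one page, in creation order.
    private final List<StructElem> tree = new ArrayList<>();

    // Step 1: create the structure element. Step 2: wrap the operators in
    // BDC ... EMC carrying the same MCID, which ties the content to the element.
    String tag(String role, String operators) {
        StructElem elem = new StructElem(role, tree.size()); // MCIDs must be unique per page
        tree.add(elem);
        return "/" + role + " <</MCID " + elem.mcid + ">> BDC\n" + operators + "\nEMC\n";
    }

    List<StructElem> elements() { return tree; }

    public static void main(String[] args) {
        TaggingSketch page = new TaggingSketch();
        // "3 Tr" selects invisible text rendering, as commonly used for hidden OCR text.
        String content =
            page.tag("H1", "BT /F1 18 Tf 3 Tr 72 720 Td (Annual Report) Tj ET")
          + page.tag("P",  "BT /F1 11 Tf 3 Tr 72 690 Td (Revenue grew this year.) Tj ET");
        System.out.print(content);
        for (StructElem e : page.elements()) {
            System.out.println(e.toPdf());
        }
    }
}
```

        Each marked-content sequence here encloses a whole BT ... ET text object; a sequence may also sit entirely inside one, but it must not straddle the boundary.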

        Once you've done all of that, you can set up the document-level
MarkInfo dictionary.  Again, I am not sure whether an API exists in
iText for this.
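
        For reference, the MarkInfo dictionary lives in the document catalog, and its Marked entry set to true is what declares the file to be Tagged PDF; the catalog also points at the structure tree through StructTreeRoot. The small Java sketch below only renders those two catalog entries as PDF syntax. The helper name is hypothetical (not iText API) and the object number used in the example is made up.

```java
// Hypothetical helper, not iText API: renders the two catalog entries as PDF syntax.
final class CatalogSketch {

    // structTreeRootObj is the object number of the structure tree root;
    // the value passed in main() below is an arbitrary example.
    static String taggedCatalogEntries(int structTreeRootObj) {
        return "/MarkInfo << /Marked true >>\n"
             + "/StructTreeRoot " + structTreeRootObj + " 0 R";
    }

    public static void main(String[] args) {
        System.out.println(taggedCatalogEntries(12));
    }
}
```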


Leonard


_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions