This depends on the PDF.

Is the PDF tagged? If so, you may be able to find the title and headings in its structure tree. If it is not tagged, good luck guessing the title and headings from the text found in the document.
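
For the untagged case, one common guess is "the text in the largest font on the first page is the title". Here is a minimal sketch of that heuristic, assuming you have already extracted the text runs of page 1 together with their font size and vertical position. The TextChunk type and the guess_title function are hypothetical names made up for this illustration, not part of iTextSharp, and the sample data is invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextChunk:
    """Hypothetical record for one run of text extracted from page 1."""
    text: str
    font_size: float  # in points
    y: float          # distance from the bottom of the page, in points


def guess_title(chunks: List[TextChunk]) -> Optional[str]:
    """Guess the title as the text set in the largest font on the page."""
    visible = [c for c in chunks if c.text.strip()]
    if not visible:
        return None
    largest = max(c.font_size for c in visible)
    # Keep every chunk at the largest size, so a title that wraps over
    # several lines is joined back together, top line first.
    title_parts = sorted(
        (c for c in visible if abs(c.font_size - largest) < 0.01),
        key=lambda c: -c.y,
    )
    return " ".join(c.text.strip() for c in title_parts)


# Invented sample data standing in for the runs found on a first page.
page_one = [
    TextChunk("Annual Report", 24.0, 700.0),
    TextChunk("2011", 24.0, 670.0),
    TextChunk("Prepared by the finance team", 11.0, 640.0),
]
print(guess_title(page_one))
```

This is only a guess: it will pick the wrong text on pages where a logo, a pull quote, or a page banner uses a larger font than the real title.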

On 24/06/2011 14:10, modie wrote:
Hi,

Sorry, I am new to iTextSharp and cannot find documentation for it anywhere,
other than this forum. I am looking to extract content from a PDF document,
but I need to be able to understand the structure / markup in the document.

I want to extract the heading / title for the document, which would generally
be found on the first page. Any ideas how I would do this? In HTML I would look
for the h1 or h2 tag.

PS - no, I don't want the title property of the document


--
View this message in context: 
http://itext-general.2136553.n4.nabble.com/How-to-extract-title-heading-from-document-contents-tp3622357p3622357.html
Sent from the iText - General mailing list archive at Nabble.com.

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense..
http://p.sf.net/sfu/splunk-d2d-c1
_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference 
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: 
http://itextpdf.com/themes/keywords.php

--
@redlabbe <http://twitter.com/redlabbe>
redlab-log <http://www.redlab.be/blog>
