Re: [iText-questions] Extracting text from PDF

2013-04-04 Thread Kevin Day
Nope. PDF isn't a structured format - extracting structured text is a very, very difficult challenge. If the files all follow a similar format, you may be able to use that knowledge to derive an algorithm that can do it (see the LocationTextExtractionStrategy). There have been other posts a
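
To illustrate the idea behind a location-based strategy, here is a minimal, library-free sketch in Java. It is not iText's implementation: the `Chunk` class, the `toText` method, and the rounding of baselines into "lines" are all assumptions made for illustration. In iText 5 itself, the usual entry point is `PdfTextExtractor.getTextFromPage(reader, pageNumber, new LocationTextExtractionStrategy())`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class LocationOrderSketch {

    /** Hypothetical stand-in for a positioned piece of text reported by a PDF parser. */
    public static final class Chunk {
        final String text;
        final float x; // left edge of the chunk
        final float y; // baseline; PDF y coordinates grow upward

        public Chunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Rebuilds a reading order from unordered chunks: top to bottom by
     * baseline, then left to right. Chunks whose rounded baselines match
     * are treated as one line (a crude assumption; real pages need tolerances).
     */
    public static String toText(List<Chunk> chunks) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingInt((Chunk c) -> -Math.round(c.y))
                .thenComparingDouble(c -> c.x));

        StringBuilder out = new StringBuilder();
        Chunk prev = null;
        for (Chunk c : sorted) {
            if (prev != null) {
                // Same baseline: separate with a space; otherwise start a new line.
                out.append(Math.round(prev.y) == Math.round(c.y) ? ' ' : '\n');
            }
            out.append(c.text);
            prev = c;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Chunks arrive in content-stream order, which need not be reading order.
        List<Chunk> chunks = new ArrayList<>();
        chunks.add(new Chunk("42", 120f, 680f));
        chunks.add(new Chunk("Name:", 50f, 700f));
        chunks.add(new Chunk("Total:", 50f, 680f));
        chunks.add(new Chunk("Alice", 120f, 700f));
        System.out.println(toText(chunks));
    }
}
```

If the files share a layout, knowledge of that layout (for example, known column positions) would go into the comparison and grouping steps; the sketch only restores line order.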

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-10 Thread Thakur, Pramila
[mailto:lrose...@adobe.com] Sent: Tuesday, January 10, 2012 12:15 PM To: Post all your questions about iText here Subject: Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios) The iText book of course - http://itextpdf.com/book/. How can you be working

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-10 Thread Leonard Rosenthol
bout iText here' Subject: Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios) Hi Rosenthol, Which chapter in the book talks about it? I do not own a book as of now. But can I look at it online though? I am using iText2.1.3 jar in my program.

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-10 Thread Thakur, Pramila
, January 10, 2012 11:30 AM To: Post all your questions about iText here Subject: Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios) I am saying that you can use iText to extract the embedded PDFs. Examples in the book. Not sure what that has

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-10 Thread Leonard Rosenthol
onard Rosenthol [mailto:lrose...@adobe.com] Sent: Monday, January 09, 2012 6:28 PM To: Post here Subject: Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios) Extract each PDF from the Portfolio and then run your exi

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-10 Thread Thakur, Pramila
-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios) Extract each PDF from the Portfolio and then run your existing routine on it. From: "Thakur, Pramila" <pramila_tha...@ontla.ola.org> Reply-To: Post here <itext-questions@lists.s

Re: [iText-questions] Extracting text from pdf that has multiple pdf's embedded in it (Portfolios)

2012-01-09 Thread Leonard Rosenthol
Extract each PDF from the Portfolio and then run your existing routine on it. From: "Thakur, Pramila" <pramila_tha...@ontla.ola.org> Reply-To: Post here <itext-questions@lists.sourceforge.net> Date: Mon, 9 Jan 2012 14:10:47 -0800 To: Post here <itext-questions@lists.sourcefo
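
The advice in this thread ("extract each PDF from the Portfolio and then run your existing routine on it") amounts to a simple two-step loop. The sketch below shows only that control flow, in plain Java. It makes no iText calls: the map of attachment names to bytes stands in for whatever the portfolio extraction produces, and `existingRoutine` is a hypothetical placeholder for the text extraction code you already have for a single PDF.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class PortfolioFlowSketch {

    /**
     * Applies the same per-PDF routine to every embedded file.
     *
     * @param embeddedPdfs    attachment name -> raw bytes of the embedded PDF
     *                        (assumed to have been pulled out of the portfolio already)
     * @param existingRoutine hypothetical stand-in for your single-PDF text extractor
     * @return attachment name -> extracted text, in the original attachment order
     */
    public static Map<String, String> extractAll(Map<String, byte[]> embeddedPdfs,
                                                 Function<byte[], String> existingRoutine) {
        Map<String, String> textByName = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : embeddedPdfs.entrySet()) {
            textByName.put(entry.getKey(), existingRoutine.apply(entry.getValue()));
        }
        return textByName;
    }

    public static void main(String[] args) {
        // Placeholder data: real entries would hold the bytes of each embedded PDF.
        Map<String, byte[]> embedded = new LinkedHashMap<>();
        embedded.put("first.pdf", "first".getBytes(StandardCharsets.US_ASCII));
        embedded.put("second.pdf", "second".getBytes(StandardCharsets.US_ASCII));

        // Placeholder routine: a real one would parse the bytes as a PDF.
        Function<byte[], String> routine = bytes -> new String(bytes, StandardCharsets.US_ASCII);

        extractAll(embedded, routine)
                .forEach((name, text) -> System.out.println(name + ": " + text));
    }
}
```

Keeping the per-PDF routine as a separate function means the code that already works on ordinary PDFs does not need to know whether its input came from a portfolio.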