Hi Mark,
I have searched the iTextSharp source for
SimpleTextExtractingPdfContentStreamProcessor but couldn't find it. Can you
tell me where I can find it?
Thanks a lot for your time.
Regards,
Zain
Date: Wed, 30 Jun 2010 14:43:31 -0700
From: [email protected]
To: [email protected]
Subject: Re: [iText-questions] Data mining with iTextSharp
No, but you can extract Actual Text with its coordinates and figure out
what to discard.
Note that there are two things that look like text but are not
characters in the content stream:
1) Images with text. You need OCR.
2) Paths (lines) in the shape of characters. You need OCR.
In general, you need OCR. Your specific case may be able to use the
PdfContentStreamProcessor et al. Check out the source for
SimpleTextExtractingPdfContentStreamProcessor, and get coordinates as well.
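
The "extract text with its coordinates and figure out what to discard" step can be sketched roughly as below. This is only a library-free illustration, not iText code: TextChunk, Region and textInRegion are hypothetical names, and it assumes you have already collected each piece of text together with its start position (for example from your own content stream processor). Coordinates are assumed to be in default PDF user space, with the origin at the lower left of the page.

```java
import java.util.ArrayList;
import java.util.List;

public class RegionTextFilter {

    /** Hypothetical holder: one piece of extracted text and where it starts on the page. */
    public static final class TextChunk {
        final String text;
        final float x;
        final float y;

        public TextChunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical axis-aligned rectangle: lower-left and upper-right corners. */
    public static final class Region {
        final float llx, lly, urx, ury;

        public Region(float llx, float lly, float urx, float ury) {
            this.llx = llx;
            this.lly = lly;
            this.urx = urx;
            this.ury = ury;
        }

        boolean contains(float x, float y) {
            return x >= llx && x <= urx && y >= lly && y <= ury;
        }
    }

    /** Keeps only chunks that start inside the region and joins them in reading order. */
    public static String textInRegion(List<TextChunk> chunks, Region region) {
        List<TextChunk> kept = new ArrayList<>();
        for (TextChunk c : chunks) {
            if (region.contains(c.x, c.y)) {
                kept.add(c);
            }
        }
        // Reading order: top to bottom (larger y first), then left to right.
        kept.sort((a, b) -> a.y != b.y ? Float.compare(b.y, a.y) : Float.compare(a.x, b.x));

        StringBuilder sb = new StringBuilder();
        for (TextChunk c : kept) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<TextChunk> chunks = new ArrayList<>();
        chunks.add(new TextChunk("Footer", 100f, 50f));
        chunks.add(new TextChunk("42.00", 200f, 500f));
        chunks.add(new TextChunk("Total:", 100f, 500f));

        Region box = new Region(90f, 490f, 300f, 510f);
        System.out.println(textInRegion(chunks, box)); // prints "Total: 42.00"
    }
}
```

The sketch tests only the start point of each chunk. A chunk that straddles the rectangle's edge would need its full bounding box, and as noted above, text drawn as images or paths never shows up as chunks at all.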
--Mark Storer
Senior Software Engineer
Cardiff.com
import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;
From: Zain ul Abideen [mailto:[email protected]]
Sent: Wednesday, June 30, 2010 11:11 AM
To: [email protected]
Subject: [iText-questions] Data mining with iTextSharp
Hello all,
Is it possible to extract text from a specific region of a PDF? What I mean
is, can we define the coordinates of a rectangle, or use some other way, and
then extract text from that specific region?
Regards,
Zain
_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions
Buy the iText book: http://www.itextpdf.com/book/
Check the site with examples before you ask questions:
http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/