Hi Mark,

I have searched iTextSharp for
SimpleTextExtractingPdfContentStreamProcessor but couldn't find it. Can you
tell me where I can find it?

Thanks a lot for your time

Regards
Zain

Date: Wed, 30 Jun 2010 14:43:31 -0700
From: [email protected]
To: [email protected]
Subject: Re: [iText-questions] Data mining with iTextSharp

No, but you can extract Actual Text with its coordinates and figure out what
to discard.

Note that there are two things that look like text but are not
characters-in-the-content-stream:

1) Images with text.  You need OCR.

2) Paths (lines) in the shape of characters.  You need OCR.

In general, you need OCR.  Your specific case may be able to use the
PdfContentStreamProcessor et al.  Check out the source for
SimpleTextExtractingPdfContentStreamProcessor, and get coordinates as well.
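
As a rough illustration of the "extract text with its coordinates and figure
out what to discard" idea, here is a minimal Java sketch. It does not call
iText or iTextSharp at all: TextChunk, Region, textInRegion and sampleChunks
are hypothetical names made up for this example, and the sample coordinates
are invented. It assumes you have already collected each piece of text
together with an (x, y) position in PDF user space (origin at the bottom-left
of the page), for example from a content stream processor, and only shows the
filtering step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RegionTextFilter {

    /** Hypothetical holder: one piece of text plus the position where it starts. */
    static final class TextChunk {
        final String text;
        final float x;
        final float y;

        TextChunk(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical rectangle: lower-left (llx, lly) to upper-right (urx, ury). */
    static final class Region {
        final float llx, lly, urx, ury;

        Region(float llx, float lly, float urx, float ury) {
            this.llx = llx;
            this.lly = lly;
            this.urx = urx;
            this.ury = ury;
        }

        boolean contains(float x, float y) {
            return x >= llx && x <= urx && y >= lly && y <= ury;
        }
    }

    /** Keeps only chunks whose start point lies in the region, in reading order. */
    static String textInRegion(List<TextChunk> chunks, Region region) {
        List<TextChunk> kept = new ArrayList<>();
        for (TextChunk c : chunks) {
            if (region.contains(c.x, c.y)) {
                kept.add(c);
            }
        }
        // Top to bottom (larger y first), then left to right (smaller x first).
        kept.sort(Comparator.comparingDouble((TextChunk c) -> -c.y)
                .thenComparingDouble(c -> c.x));

        StringBuilder sb = new StringBuilder();
        for (TextChunk c : kept) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.text);
        }
        return sb.toString();
    }

    /** Invented sample data standing in for what an extractor would report. */
    static List<TextChunk> sampleChunks() {
        return Arrays.asList(
                new TextChunk("Invoice", 72f, 720f),
                new TextChunk("42.00", 150f, 100f),
                new TextChunk("Total:", 72f, 100f),
                new TextChunk("Page 1", 500f, 30f));
    }

    public static void main(String[] args) {
        Region totals = new Region(50f, 90f, 200f, 110f);
        System.out.println(textInRegion(sampleChunks(), totals)); // prints "Total: 42.00"
    }
}
```

The sketch tests only the start point of each chunk, so text that begins
outside the rectangle and runs into it is discarded; a real implementation
would probably want to compare bounding boxes instead. The same logic should
carry over to C# for iTextSharp with little change.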

--Mark Storer
  Senior Software Engineer
  Cardiff.com

import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;

From: Zain ul Abideen [mailto:[email protected]]
Sent: Wednesday, June 30, 2010 11:11 AM
To: [email protected]
Subject: [iText-questions] Data mining with iTextSharp

Hello all,

Is it possible to extract text from a specific region of a PDF? That is, can
we define the coordinates of a rectangle, or use some other method, and then
extract only the text inside that region?

Regards,
Zain

_______________________________________________
iText-questions mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.itextpdf.com/book/
Check the site with examples before you ask questions: 
http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/