[iText-questions] [SPAM] Re: Not able to read text from ItextShap

mkl Thu, 31 Jan 2013 04:40:49 -0800

Kiran Ghadge,

Kiran Ghadge wrote
> I am using itextsharp for reading text from PDF file.
> I have attached sample project.
> Below is code snippet. But the I am not able to get text from page.

The code snippet in your message neither collected nor printed any text.
Thus, I assume that code did not produce the output.

The code in the attached project, on the other hand, collects and outputs
text from the accompanying PDF. It first does a funny conversion, though:

(Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default,
Encoding.UTF8, Encoding.Default.GetBytes(text))))

Such a conversion should not be necessary if the text from the PDF can be
properly read.
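
To illustrate why, here is a minimal sketch of that same conversion chain
without any PDF involved. The class and method names are made up for this
example, and ISO-8859-1 with an explicit "?" replacement fallback merely
stands in for Encoding.Default, which depends on the machine. Under those
assumptions the round trip can only return the input unchanged or replace
characters by "?"; it cannot repair anything.

```csharp
using System;
using System.Text;

static class RoundTripDemo
{
    // Same chain as in the attached project, with the ANSI encoding passed in
    // as a parameter instead of using Encoding.Default.
    public static string RoundTrip(string text, Encoding ansi)
    {
        // Lossy step: characters the ANSI encoding cannot represent are
        // replaced according to its encoder fallback.
        byte[] ansiBytes = ansi.GetBytes(text);
        byte[] utf8Bytes = Encoding.Convert(ansi, Encoding.UTF8, ansiBytes);
        return Encoding.UTF8.GetString(utf8Bytes);
    }

    static void Main()
    {
        // Assumption: ISO-8859-1 stands in for Encoding.Default here.
        Encoding latin1 = Encoding.GetEncoding(
            "iso-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));

        // Representable text comes back unchanged.
        Console.WriteLine(RoundTrip("Gr\u00FC\u00DFe", latin1) == "Gr\u00FC\u00DFe");

        // Omega and the euro sign are not in ISO-8859-1 and are lost.
        Console.WriteLine(RoundTrip("\u03A9 = 10 \u20AC", latin1) == "? = 10 ?");
    }
}
```

Both lines print True with the encoding chosen above. With a different
default code page the set of lost characters changes, but the round trip
still cannot add information that was not in the string to begin with.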

Here is your actual problem, though: the PDF does not seem to contain the
correct information for text extraction at all. Just try to extract the text
using Adobe Acrobat (which is quite good at text extraction); for me it
returns assorted symbols only.

Therefore, I'm afraid that for PDFs like the one given, you either have to
resort to a custom extraction routine with a very special byte-to-text
conversion, or you have to use OCR.
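
As a rough idea of what such a special conversion might look like, here is a
minimal sketch that post-processes an already extracted string with a
hand-made substitution table. Everything in it is hypothetical: the class
name, the method, and above all the table entries, which you would have to
work out yourself by comparing the extracted symbols with what the page
actually shows. It also assumes that one table fits the whole text, which
may not hold if the PDF uses several fonts with different encodings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class GlyphRemapper
{
    // Replaces every character that has an entry in the table and leaves
    // all other characters untouched.
    public static string Remap(string extracted, IDictionary<char, char> table)
    {
        StringBuilder result = new StringBuilder(extracted.Length);
        foreach (char c in extracted)
        {
            char mapped;
            result.Append(table.TryGetValue(c, out mapped) ? mapped : c);
        }
        return result.ToString();
    }

    static void Main()
    {
        // Hypothetical table, NOT taken from the PDF in question:
        // extracted symbol -> character that is really shown on the page.
        Dictionary<char, char> table = new Dictionary<char, char>
        {
            { '\u0001', 'T' },
            { '\u0002', 'e' },
            { '\u0003', 'x' },
            { '\u0004', 't' }
        };

        string extracted = "\u0001\u0002\u0003\u0004 1";
        Console.WriteLine(Remap(extracted, table));
    }
}
```

With the sample table above, the four control characters become "Text" and
the trailing " 1" passes through unchanged. Building the real table is the
laborious part, which is why OCR may be the more practical route.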

Regards, Michael

--
View this message in context:
http://itext-general.2136553.n4.nabble.com/Not-able-to-read-text-from-ItextShap-tp4657491p4657496.html
Sent from the iText - General mailing list archive at Nabble.com.
