Not that I know enough about PDF to contribute anything significant, but out of 
curiosity (which, lo and behold, killed the cat) I tried copy and paste from 
PDF-XChange Viewer into Notepad, and the text was extracted beautifully (while 
Acrobat X gives scrambled garbage). The first few lines from Ambulo Report.pdf 
are below, so the information must be in the file somewhere.

Kind regards,

/Gerold

Patient Information
Name Demo - Hypertensive  Last Primary physician
Patient ID ID1
Date of birth
Height, 
Weight
Tuesday, October 10, 1972
170 cm, 78 kg
Interpreting 
physician
Statistical Overview
Start Time Tuesday, January 29, 2008, 17:40
Stop Time Wednesday, January 30, 2008, 17:10
Duration 23 Hours
Measurements 37 Total: 37 Included, 0 Excluded, 0 Events, 0 Errors
Complete (37 Included, 100%) Mean Difference between Awake and Asleep
Min Mean Max StdDev ∆ mmHg % drop
Systolic
118 164.4 195 28.0
Systolic
41.8 23 %
Diastolic
81 107.3 125 12.8
Diastolic
22.2 19 %
Pulse 72 82.3 92 5.3 Pulse 0.9 1 %
MAP 95 124.2 146 15.1 MAP 25.4 19 %
Systolic > 
140
70.3 %
Diastolic >

-----Original Message-----
From: mkl [mailto:m...@wir-sind-cool.org] 
Sent: Thursday, 31 January 2013 13:41
To: itext-questions@lists.sourceforge.net
Subject: [iText-questions] [SPAM] Re: Not able to read text from ItextShap

Kiran Ghadge,

Kiran Ghadge wrote
> I am using itextsharp for reading text from PDF file.
> I have attached sample project.
> Below is code snippet. But the I am not able to get text from page.

The code snippet in your message collected no text and printed no text. Thus, 
I assume that code did not produce the output.

The code in the attached project, on the other hand, collects and outputs text 
from the accompanying PDF. It first does a funny conversion, though:

(Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default,
Encoding.UTF8, Encoding.Default.GetBytes(text))))

Such a conversion should not be necessary if the text from the PDF can be 
properly read.
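
To illustrate why that conversion adds nothing, here is a minimal sketch that 
mirrors the same three steps in Python rather than C#. It assumes that 
Encoding.Default is the Windows-1252 code page (it depends on the machine) and 
that unmappable characters are replaced by "?"; the function name round_trip 
is made up for this example.

```python
def round_trip(text: str, default_codec: str = "cp1252") -> str:
    """Python analog of the C# expression quoted above (assumed code page)."""
    # Encoding.Default.GetBytes(text): characters outside the code page
    # are replaced by '?' here.
    default_bytes = text.encode(default_codec, errors="replace")
    # Encoding.Convert(Default, UTF8, bytes): decode, then re-encode as UTF-8.
    utf8_bytes = default_bytes.decode(default_codec).encode("utf-8")
    # Encoding.UTF8.GetString(bytes)
    return utf8_bytes.decode("utf-8")


if __name__ == "__main__":
    print(round_trip("Systolic > 140"))  # unchanged
    print(round_trip("∆ mmHg"))          # '∆' is not in cp1252 and is lost
```

Under these assumptions the round trip either returns the string unchanged or 
loses characters; it cannot repair text that was extracted incorrectly.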

Here is your actual problem, though: the PDF does not seem to contain the 
correct information for text extraction at all. Just try extracting the text 
with Adobe Acrobat (which is quite good at text extraction); for me it returns 
only assorted symbols.

Therefore, I'm afraid that for PDFs like the one given you either have to 
resort to a custom extraction routine with a very special byte-to-text 
conversion, or you have to use OCR.

Regards,   Michael



--
View this message in context: 
http://itext-general.2136553.n4.nabble.com/Not-able-to-read-text-from-ItextShap-tp4657491p4657496.html
Sent from the iText - General mailing list archive at Nabble.com.

------------------------------------------------------------------------------
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics Download AppDynamics Lite for free 
today:
http://p.sf.net/sfu/appdyn_d2d_jan
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference 
to the iText book: http://www.itextpdf.com/book/ Please check the keywords list 
before you ask for examples: http://itextpdf.com/themes/keywords.php