Hi,

Try the Etymon parser to read a PDF file. It works fine for extracting
text from a PDF file, and you also get examples along with the jar files.

www.etymon.com/pj


Yours,
t.jayakumar
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, February 19, 2002 2:04 PM
To: sebastian
Cc: [EMAIL PROTECTED]
Subject: [iText-questions] Re: one question


sebastian writes:

> I'm sorry to write directly to you, but somehow I can't send mail to
> SourceForge or to the mailing list.
> My question is very simple: I just want to read the text inside a PDF
> file. Nothing but the text, no fonts or anything like that, just the text,
> and transform it into a Java String.
> Is it possible with iText? How can I do it? Whom should I ask?

When you generate a PDF file, text and graphics are
'painted' on a canvas. This 'painting' can happen in
an arbitrary order, not necessarily the reading order.

You can have the sentence: Quick brown fox jumps over the lazy dog,
painted in this order:     abcde efghi jkl mnooo opQr rst uuvw xyz
So whether or not you use iText, it can be very difficult to retrieve
text from a PDF file.
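
To illustrate the painting-order problem, here is a minimal Java sketch.
It is not iText code: the Fragment class, its coordinates and the sample
data are hypothetical, and it assumes each painted fragment carries the
position where it was drawn. Under that assumption, sorting by position
(top line first, then left to right) recovers the reading order in this
simple case. Real PDFs are harder, with rotated text, columns, kerning
and custom font encodings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PaintOrderSketch {

    /** Hypothetical stand-in for one painted piece of text; not an iText class. */
    public static final class Fragment {
        final String text;
        final double x; // horizontal position, grows to the right
        final double y; // vertical position, grows upward as in PDF user space

        public Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Concatenates the fragments in the order they were painted. */
    public static String inPaintOrder(List<Fragment> fragments) {
        StringBuilder sb = new StringBuilder();
        for (Fragment f : fragments) {
            sb.append(f.text);
        }
        return sb.toString();
    }

    /** Sorts a copy by position (higher y first, then smaller x) and concatenates. */
    public static String inReadingOrder(List<Fragment> fragments) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator
                .comparingDouble((Fragment f) -> -f.y)
                .thenComparingDouble(f -> f.x));
        return inPaintOrder(sorted);
    }

    /** Made-up sample: one sentence on two lines, painted out of order. */
    public static List<Fragment> sample() {
        return Arrays.asList(
                new Fragment("jumps over ", 50, 680),
                new Fragment("Quick ", 50, 700),
                new Fragment("the lazy dog", 120, 680),
                new Fragment("brown fox ", 90, 700));
    }

    public static void main(String[] args) {
        List<Fragment> painted = sample();
        System.out.println(inPaintOrder(painted));
        System.out.println(inReadingOrder(painted));
    }
}
```

Running main prints the scrambled paint order first
("jumps over Quick the lazy dogbrown fox ") and then the sentence in
reading order ("Quick brown fox jumps over the lazy dog").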

In iText there is a class, PdfReader, that reads PRStream
and PRString objects, and I once received a mail from somebody
who managed to use these objects to retrieve some text.
You could try this yourself, but I don't think any of the
iText developers will provide this functionality in a
standard way, because we consider PDF to be a read-only format.

kind regards,
Bruno Lowagie

_______________________________________________
iText-questions mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/itext-questions
