I have seen a program named pdftotext that can extract the text from .pdf
files, and that program can be called from a Perl program to extract the
text from many .pdf files in one run.
Search Google for it.
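
Here is a minimal sketch of how calling it from Perl could look. It assumes
the pdftotext command-line tool is installed and on your PATH. The helper
names (txt_name_for, pdftotext_command) are my own inventions for this
example, not part of any module, and -layout is just one option you may or
may not want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: derive the output name, e.g. report.pdf -> report.txt.
# If the name has no .pdf suffix, .txt is appended so the input is never
# overwritten.
sub txt_name_for {
    my ($pdf) = @_;
    my $txt = $pdf;
    $txt .= '.txt' unless $txt =~ s/\.pdf\z/.txt/i;
    return $txt;
}

# Hypothetical helper: build the argument list for one pdftotext run.
# -layout asks pdftotext to keep the original physical layout of the page.
sub pdftotext_command {
    my ($pdf) = @_;
    return ('pdftotext', '-layout', $pdf, txt_name_for($pdf));
}

# Convert every .pdf file named on the command line.
for my $pdf (@ARGV) {
    my @cmd = pdftotext_command($pdf);
    # The list form of system() bypasses the shell, so file names with
    # spaces or other special characters are passed through unchanged.
    if (system(@cmd) != 0) {
        warn "pdftotext failed for $pdf\n";
    }
}
```

Saved as, say, pdf2txt.pl, it would be run as: perl pdf2txt.pl one.pdf two.pdf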

I have seen that it can extract the text even from some pdf files that have
copy protection set, but the text is not always accurate. Sometimes spaces
are missing and sometimes special characters are badly converted.

BTW, I have seen that some versions of Adobe Acrobat Reader can export the
pdf file as text. I think that Perl might be able to communicate with some
.dll files of Acrobat Reader in order to extract the text, but I don't know
how.

Teddy

----- Original Message ----- 
From: "Jenny Chen" <[EMAIL PROTECTED]>
To: "Perl List" <[email protected]>
Sent: Friday, November 04, 2005 1:07 AM
Subject: help on converting pdf to text


>
> Hi Everyone,
>
> Does anyone know how to convert a pdf file to text
> file in Perl?  Thanks
>
> Jenny
>
> -- 
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
>
>

