I have seen a program named pdftotext that can extract the text from .pdf files, and it can be called from a Perl program to extract the text from several .pdf files. Search Google for it.
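As a rough sketch of calling the program from Perl: this assumes the tool meant here is the pdftotext command-line utility from the Xpdf or Poppler packages, and that it is installed and on your PATH. The helper name pdftotext_command and the choice of the -layout option are only my assumptions for illustration, not the one right way to do it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for one conversion.
# -layout asks pdftotext to keep the physical layout of the page, which
# may help with missing spaces; drop it if the output looks worse.
sub pdftotext_command {
    my ($pdf) = @_;
    my $txt = $pdf;
    $txt =~ s/\.pdf\z//i;    # strip a trailing .pdf, if there is one
    $txt .= '.txt';          # report.pdf -> report.txt
    return ('pdftotext', '-layout', $pdf, $txt);
}

# Convert every PDF named on the command line.
for my $pdf (@ARGV) {
    my @cmd = pdftotext_command($pdf);
    # The list form of system() bypasses the shell, so file names with
    # spaces or other odd characters are passed through unchanged.
    system(@cmd) == 0
        or warn "pdftotext failed for $pdf (status $?)\n";
}
```

Saved as, say, pdf2txt.pl, it could be run as "perl pdf2txt.pl *.pdf", writing one .txt file next to each input file.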
The program can extract the text even from some PDF files that have copy protection set, but the result is not always accurate. Sometimes spaces are missing, and sometimes special characters are converted badly.

By the way, some versions of Adobe Acrobat Reader can export a PDF file as text. I think Perl might be able to communicate with some of Acrobat Reader's .dll files in order to extract the text, but I don't know how.

Teddy

----- Original Message -----
From: "Jenny Chen" <[EMAIL PROTECTED]>
To: "Perl List" <[email protected]>
Sent: Friday, November 04, 2005 1:07 AM
Subject: help on converting pdf to text

> Hi Everyone,
>
> Does anyone know how to convert a pdf file to text
> file in Perl? Thanks
>
> Jenny
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> <http://learn.perl.org/> <http://learn.perl.org/first-response>
