Re: help on converting pdf to text

Stephen York Thu, 03 Nov 2005 16:19:28 -0800

First off, realise that a pdf isn't just a marked up text document.
It's a wrapper for images and text, movies and many other formats.

If you have a text pdf, then the text is a postscript object catalogued
somewhere within the pdf. I've never done this in perl, but there are many
commercial utilities around to do it, although it's usually an ocr process.
There may be some info on cpan about it, but if you can't find anything,
then look into first extracting the postscript content, and then there's
bound to be a postscript decoder available from somewhere. Not sure off
hand though.
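
As a rough illustration of the "extract the content, then decode it" idea, here is a minimal sketch in core Perl. It assumes you already have an uncompressed page content stream in a string, and it only looks at literal strings passed to the Tj and TJ text-showing operators. The helper names extract_text_from_stream and unescape_pdf_string are made up for this example. Most real pdfs compress their streams and use custom font encodings, so treat this as a toy and not a general converter. For real files an existing tool is the safer bet, such as the pdftotext utility or the CAM::PDF module on cpan.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the literal strings shown by the Tj and TJ
# operators in an *uncompressed* PDF content stream.
# Assumptions: no nested unescaped parentheses, no hex strings,
# and the font uses a plain ASCII-compatible encoding.
sub extract_text_from_stream {
    my ($stream) = @_;
    my @chunks;

    # Match either "(string) Tj" or "[ (str) num (str) ... ] TJ".
    while ($stream =~ /(\((?:\\.|[^\\()])*\))\s*Tj|\[((?:\\.|[^\]\\])*)\]\s*TJ/gs) {
        my $operand = defined $1 ? $1 : $2;

        # Pull out every parenthesised literal inside the operand
        # and glue the pieces together (kerning numbers are ignored).
        my @literals = $operand =~ /\(((?:\\.|[^\\()])*)\)/gs;
        push @chunks, join '', map { unescape_pdf_string($_) } @literals;
    }
    return join "\n", @chunks;
}

# Hypothetical helper: undo the backslash escapes used in PDF literal strings.
sub unescape_pdf_string {
    my ($s) = @_;
    my %esc = (n => "\n", r => "\r", t => "\t", b => "\b", f => "\f");
    $s =~ s{\\([0-7]{1,3})|\\(.)}{
        defined $1        ? chr(oct $1)   # octal character code
      : $2 eq "\n"        ? ''            # backslash-newline is a line continuation
      : exists $esc{$2}   ? $esc{$2}      # named escape such as \n or \t
      :                     $2            # \( \) \\ and anything else
    }gesx;
    return $s;
}

# Example input: a hand-written content stream, not taken from a real file.
my $sample = 'BT /F1 12 Tf 72 720 Td (Hello, world) Tj '
           . '[(PDF) -20 ( to ) 15 (text)] TJ ET';
print extract_text_from_stream($sample), "\n";
```

Each Tj or TJ operator becomes one line of output here, which is a simplification: a real converter would have to track the text positioning operators to decide where lines and spaces actually fall.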

Jenny Chen wrote:

Hi Everyone,

Does anyone know how to convert a pdf file to text
file in Perl?  Thanks

Jenny




