Charles Hart Enzer, M.D. wrote:
> What tools do you suggest for:
> 
>    Text only
> 
>    Text and images

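This may or may not help. On Linux, there are some command-line tools 
(shipped in the poppler-utils package, among others) such as:

pdftotext   (PDF to plain text)
pdftops     (PDF to PostScript)
pdftohtml   (PDF to HTML)
pdfimages   (extracts the images embedded in a PDF)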

They do the job, with about the quality you'd expect from a tool that 
has to work backwards from a finished document. pdftops (to PostScript) 
tends to work the best, since PostScript and PDF are closely related.

pdftotext just extracts the raw text. Its attempt at preserving the 
layout of text blocks tends to miss the mark by a lot, but the output 
is good enough for, e.g., indexing.
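
If you want to drive these tools from a script, here is a minimal 
sketch, assuming Python 3 and that the tools are on your PATH. The 
function names (build_commands, run_available) and the output file 
naming are made up for illustration. The only options used are 
pdftotext's -layout and pdfimages' -j; pdftohtml is left out to keep 
it short.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(pdf, out_dir):
    """Return one argv list per tool; nothing is executed here."""
    pdf = Path(pdf)
    out = Path(out_dir)
    stem = pdf.stem  # file name without directory or ".pdf"
    return {
        # Text only: -layout asks pdftotext to try to keep the page layout.
        "pdftotext": ["pdftotext", "-layout", str(pdf), str(out / (stem + ".txt"))],
        # PDF to PostScript.
        "pdftops": ["pdftops", str(pdf), str(out / (stem + ".ps"))],
        # Embedded images: -j saves JPEG-encoded images as .jpg files;
        # the last argument is a file name prefix, not a full file name.
        "pdfimages": ["pdfimages", "-j", str(pdf), str(out / stem)],
    }


def run_available(commands):
    """Run each command whose tool is installed.

    Returns {tool: exit code}, with None for tools not found on PATH.
    """
    results = {}
    for tool, argv in commands.items():
        if shutil.which(tool) is None:
            results[tool] = None  # tool not installed, skipped
            continue
        results[tool] = subprocess.run(argv).returncode
    return results
```

Calling run_available(build_commands("report.pdf", "out")) would try 
all three tools on a (hypothetical) report.pdf and write into an 
existing "out" directory. For text only, pdftotext alone is enough; 
for text and images, combine it with pdfimages.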

I'm sure there are GUI tools built on top of these CLI tools, and I'm 
sure there are other solutions available on Windows, but I thought I'd 
mention these anyway. I've used them for years with good results.

Paul


