RE: Plugin extracting text from docs (was: new spam using large images)

2009-07-01 Thread Rosenbaum, Larry M.
We can use antiword to render text from MSWord files, and unrtf to render text from RTF files. What is the best tool to render text from PDF files? (We are running Solaris 9)
L
-Original Message-
From: Jonas Eckerman [mailto:jonas_li...@frukt.org]
Sent: Wednesday, June 24, 2009
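The approach described in this message (one external converter per attachment type) can be sketched roughly as follows. This is only an illustration, not the plugin discussed in the thread: the names `EXTRACTORS`, `build_command`, and `extract_text` are hypothetical, it assumes `antiword` and `unrtf` are installed and on the PATH, and it deliberately leaves PDF unmapped because the message leaves that choice open.

```python
import os
import subprocess

# Hypothetical mapping from file extension to converter command line.
# Only the two tools named in the message are listed; PDF is left out
# on purpose because the question of which tool to use is still open.
EXTRACTORS = {
    ".doc": ["antiword"],        # antiword writes plain text to stdout
    ".rtf": ["unrtf", "--text"], # unrtf needs --text for plain-text output
}


def build_command(path):
    """Return the argv list that would convert `path` to text, or None."""
    ext = os.path.splitext(path)[1].lower()
    prefix = EXTRACTORS.get(ext)
    if prefix is None:
        return None
    return prefix + [path]


def extract_text(path, timeout=30):
    """Run the matching converter and return its output as a string.

    Returns None when no converter is registered for the file type.
    Raises subprocess.CalledProcessError if the converter fails.
    """
    cmd = build_command(path)
    if cmd is None:
        return None
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Keeping the command construction separate from the execution makes it easy to register another converter later (for example, one for PDF) without touching the code that runs it.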

RE: Plugin extracting text from docs (was: new spam using large images)

2009-07-01 Thread Giampaolo Tomassoni
> We can use antiword to render text from MSWord files, and unrtf to render text from RTF files. What is the best tool to render text from PDF files? (We are running Solaris 9)
FWIK, antiword is the best tradeoff between speed and conversion quality. The best converter I know of, even for

Re: Plugin extracting text from docs (was: new spam using large images)

2009-06-25 Thread Matus UHLAR - fantomas
Jason Haar wrote:
Speaking of image/rtf/word attachment spam; is there any work going on to standardize this so that the textual output of such attachments could be fed back into SA?
On 24.06.09 19:33, Jonas Eckerman wrote:
Just as a note: I'm currently working on a modular plugin for