Re: [htdig] PDF text search

Gilles Detillieux Thu, 06 Dec 2001 13:15:21 -0800

According to Brett Simpson:
> I'm currently using the RPM version of Htdig 3.2.0-1.b3.6 on Redhat 7.2
> with apache. What do I need to perform text searching of pdf files? If
> I copy a pdf file into /var/www/html and run "htdig -iv" it lists the
> pdf file as not Parsable. I am able to do a search with the default
> stuff that comes loaded in /var/www/html. Do I need to add some sort
> of external parser? Does anyone know of any? Thanks.


Yes, htdig needs an external parser (converter) to index PDF files.
I recommend doc2html.pl.  See http://www.htdig.org/FAQ.html#q4.9
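
For reference, a converter is hooked into htdig with the external_parsers
attribute in htdig.conf. The fragment below is only a sketch: the install
path /usr/local/bin/doc2html.pl and the max_doc_size value are assumptions,
so adjust them to your system. doc2html.pl itself also has to be edited to
point at the helper programs it calls (for PDF, typically pdftotext from
the xpdf package), as described in its own documentation.

```
# htdig.conf fragment (illustrative; the path and size are assumptions)
external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl

# PDF files are often larger than the default document size limit,
# so raise it, or large files will be truncated before conversion.
max_doc_size: 5000000
```

After changing the config, rerun "htdig -iv" and check whether the PDF
is still reported as not parsable.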

In any case, htdig 3.2.0b3 is pretty buggy, so I'd recommend getting
the latest update RPMs from Red Hat or a local mirror site, namely:

apache-1.3.22-2
htdig-3.2.0-1.b4.0.72
htdig-web-3.2.0-1.b4.0.72

Red Hat users should, as a rule, keep up to date on the latest errata
at http://www.redhat.com/apps/support/errata/

It helps if you subscribe to their redhat-watch mailing list, so you
get the advisories by e-mail when the updates come out.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
