ht://Dig is a full-text indexer. It follows links and indexes all the content it can understand. For PDF and Word (.doc) files, there are plug-ins (external parsers) that make indexing possible.
Others on the list will probably give you more details on these.
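As a rough sketch only, the plug-ins are hooked in through the external_parsers attribute in htdig.conf. The lines below assume the doc2html.pl converter script from the ht://Dig contrib area is installed under /usr/local/bin, and that the helper programs it calls (for example pdftotext and a Word-to-text converter) are present; the path and the size limit are assumptions for illustration, not values from your setup.

```
# htdig.conf fragment (illustrative; the paths are assumptions)
# Hand PDF and Word documents to an external converter that returns HTML.
external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl \
                  application/msword->text/html /usr/local/bin/doc2html.pl

# Binary documents are often larger than the default size limit, and a
# truncated PDF usually cannot be converted, so raise the limit (in bytes).
max_doc_size: 5000000
```

After changing the configuration, the index has to be rebuilt (for example by re-running rundig) before the attachments show up in search results.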
--Hi, I currently have a site built with Lotus Domino pages. I wish to implement a search engine and have been going through the relative merits of existing products. What I would like to know about your search engine software is whether it offers full-text indexing of not only the META tags, but also the content of the page and any attachments linked from the page. Full-text indexing of these attachments (mainly PDFs and Word documents) is quite important, as a large portion of the site's content will come from these files. Any information that you could provide on your product would be much appreciated.
Franck Horlaville,
Technical Director - Athena Online s.a.
Web site creation and hosting
--
<http://www.athena.online.co.ma/>
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html