From: "Peter Karman" <[EMAIL PROTECTED]>
Do you know whether it can index HTML documents without parsing them with
other tools, or possibly other types of files such as PDF or DOC?


Xapian is a library. The related Omega project has support for parsing docs of various formats.

Oh yes, Omega seems nice. Too bad it can index only static content, not auto-generated web pages.

Can you recommend a good Perl module that makes it easy to write a spider for indexing a web site?

Octavian

_______________________________________________
List: [email protected]
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/[EMAIL PROTECTED]/
Dev site: http://dev.catalyst.perl.org/
