How fond of Java are you?

http://lucene.apache.org/nutch/

is a full-blown full-text search engine. It does, however, require
significant resources, and it needs a memory-intensive process
running to respond to requests.
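
Whichever engine you end up with, the "index only new messages" and
"search a group of lists" requirements can be handled in the step that
feeds it. Below is a rough sketch of that step, not tied to Nutch or any
other engine. It uses Python's standard mailbox module to walk an mbox
file and hand back only messages whose Message-ID has not been seen
before, tagging each record with its list name so a search can later be
restricted by list. The names new_messages, body_text, list_name and
seen_ids are made up for illustration, and keeping seen_ids between runs
(in a file or a DB table) is left to the caller.

```python
import mailbox


def body_text(msg):
    """Collect the text/plain parts of a message into one string."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label in the mail: fall back to UTF-8.
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def new_messages(list_name, mbox_path, seen_ids):
    """Yield one record per message not already in seen_ids.

    list_name is a label chosen by the caller (hypothetical); storing it
    with each record is what allows per-list or multi-list searches.
    seen_ids is a set of Message-ID strings and is updated in place.
    """
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            msg_id = msg.get("Message-ID")
            # Skip messages without an ID and ones indexed on an earlier run.
            if msg_id is None or msg_id in seen_ids:
                continue
            seen_ids.add(msg_id)
            yield {
                "list": list_name,
                "id": msg_id,
                "subject": msg.get("Subject", ""),
                "body": body_text(msg),
            }
    finally:
        box.close()
```

Each record would then be passed to whatever indexer you choose. A
second run over the same mbox with the same seen_ids yields nothing, so
only newly arrived mail gets indexed.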



On 9/5/07, Garl Grigsby <[EMAIL PROTECTED]> wrote:
> I am currently the administrator for a mailing list archive/search site
> for a number of internal-only mailing lists (about 20). The current
> search engine is based on HTDig and works "ok". By ok I mean it seems to
> work most of the time. Occasionally it fails to reindex for no reason I can
> find, it is horrifically slow to index (~40hrs on a Dual Processor
> Opteron w/4GB of RAM), and the database that the search indices are
> stored in corrupts far too easily. To add to all of this, development of
> HTDig seems to have stalled or died completely (not sure which).
>
> Due to a hardware failure on a box that was not being backed up (this
> was not my box), and a few personnel changes, I am now forced to rebuild
> the archive/search system from the ground up. What I am being given is
> access to the mbox files for each mailing list, and pretty much nothing
> else. I have no access to the admin functions on the mailing list
> server, nor can I get any changes made to its configuration.
>
> I am leaning toward using Mhonarc to create the archive. What I need
> suggestions on is a search engine. I am looking for something that can
> handle a fairly large archive of messages, say on the order of 100-150k
> messages, that can easily index only new messages, and that can search
> groups of messages (i.e., I would like it so that you can search across a
> selected group of mailing lists, all lists, or only a single list). I'd
> also like something that uses a standard DB as the backend (MySQL,
> Postgres, or something similar).
>
> Due to the nature of the lists, I cannot use an external search engine.
> Everything must be kept in house. The server I have to host this on is
> running RHEL 5 and Apache. I have complete control of this server, so I
> can make changes as I see fit (other than changing the OS).
>
> So does anybody have any suggestions on a search engine that they have
> used that seems to work well? Did I leave anything out? I see a kitchen
> sink in the corner I didn't mention, but.....
>
> -G
> _______________________________________________
> EUGLUG mailing list
> [email protected]
> http://www.euglug.org/mailman/listinfo/euglug
>