> > wonder if it can handle mbox.
> >
> > IIRC, mailman stores the archives as plain old HTML.
>
> Then they could argue that you could use an arbitrary search engine,
> so no need to write a custom one. And they are right--I almost always
> use google rather than a particular search.
>
> So if we can publish the archives as static HTML, we are all set

Just to clarify, neither Google Custom Search nor Solr / Lucene
constitutes "writing a custom [search engine]". ;-)

Google Custom Search is how sites that are not Google Enterprise
customers arrange for Google to produce search results for their own
site. It's a filter on normal Google search results, and it is quite
easy to set up.

https://www.google.com/cse/docs/

The other option is the Apache trio:

  Lucene - document indexer and search query processor
  Solr   - web front end for Lucene
  Nutch  - web spider that can feed documents to Lucene

It works out of the box without too much hassle. There are several
Python connectors / APIs (pylucene, sunburnt -- my student chose the
latter for his project).

http://lucene.apache.org/
http://lucene.apache.org/solr/
http://nutch.apache.org/

-- Pat
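
For anyone curious what talking to Solr from Python might look like
without a connector library, here is a minimal sketch that uses only
the standard library and Solr's HTTP select handler. The host, port,
core name ("archives") and query string are assumptions for
illustration, not anything taken from an actual setup; pylucene and
sunburnt wrap this kind of request in a richer API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_select_url(base_url, core, query, rows=10):
    """Build the URL for a Solr select request that returns JSON.

    base_url and core are hypothetical placeholders; adjust them to
    wherever a Solr instance is actually running.
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def search(base_url, core, query, rows=10):
    """Run the query and return the list of matching documents."""
    with urlopen(build_select_url(base_url, core, query, rows)) as resp:
        payload = json.load(resp)
    # Solr's JSON response puts the hits under response -> docs.
    return payload["response"]["docs"]


# Example call (needs a running Solr instance, so it is left commented out):
# for doc in search("http://localhost:8983/solr", "archives", "mbox import"):
#     print(doc)
```

Keeping the URL construction separate from the network call makes it
easy to check the request by eye before pointing it at a real server.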