> > wonder if it can handle mbox.
>
> IIRC, Mailman stores the archives as plain old HTML.
>
> Then they could argue that you could use an arbitrary search engine,
> so no need to write a custom one. And they are right: I almost always
> use Google rather than a site-specific search.
>
> So if we can publish the archives as static HTML, we are all set.


Just to clarify, neither Google Custom Search nor Solr / Lucene constitutes
"writing a custom [search engine]".  ;-)

Google Custom Search is how sites that are not Google Enterprise
customers get Google to produce search results restricted to their own
site.  It is a filter on normal Google search results and is quite easy
to set up.

https://www.google.com/cse/docs/

The Apache option is a stack of three pieces: Lucene (document indexer
and search query processor), Solr (web front end for Lucene), and Nutch
(web spider that can feed documents to Lucene).  It works out of the box
without too much hassle.  There are several Python connectors / APIs
(PyLucene, sunburnt; my student chose the latter for his project).

http://lucene.apache.org/
http://lucene.apache.org/solr/
http://nutch.apache.org/
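
To give a feel for how little glue is involved, here is a rough sketch of
querying Solr from Python over its HTTP select handler, using only the
standard library rather than one of the connectors above.  The host, port,
core name ("archive"), and the idea that the archive has already been
indexed are all assumptions made for illustration, not a description of any
existing setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a local Solr instance with a hypothetical core named "archive".
SOLR_SELECT = "http://localhost:8983/solr/archive/select"


def build_query_url(text, rows=10, base=SOLR_SELECT):
    """Build a select URL using Solr's standard q, rows, and wt parameters."""
    params = {"q": text, "rows": rows, "wt": "json"}
    return base + "?" + urlencode(params)


def extract_docs(payload):
    """Return the matching documents from a decoded Solr JSON response."""
    return payload.get("response", {}).get("docs", [])


def search(text, rows=10):
    """Run a query against the (assumed) Solr core and return the documents."""
    with urlopen(build_query_url(text, rows)) as resp:
        return extract_docs(json.loads(resp.read().decode("utf-8")))
```

Calling search("mbox") would return a list of dicts, one per matching
document; which fields they carry depends entirely on the schema the
archive was indexed with.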

-- Pat
