Hi,

I have been messing with Solr and the Apache HTTPD documents over the past few months and may have finally produced something that might be of use. You can view the not-entirely-ripe fruits of my labor here:

http://www.tonybibbs.com/~arreyder/docsearch/

The work is done with a Perl script that runs the documents through an XSLT stylesheet and then pushes the transformed XML into Lucene via Solr. I have done a bit of tuning in Solr to get decent results, but much more work is needed to get things right. I ended up breaking each HTTPD document into many smaller documents. Each directive or section became its own Solr sub-document, linked back to the main document via the common portion of the URL.
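As a rough illustration of the splitting step, here is a minimal Python sketch. It is not the Perl/XSLT pipeline described above, only a stand-in for the same idea. The sample input, the Solr field names (id, parent_url, type, title, description, context), the base URL, and the use of a lowercased "#name" anchor as the unique part of each sub-document's URL are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed input in the style of an httpd module document.
SAMPLE = """
<modulesynopsis>
  <name>mod_example</name>
  <directivesynopsis>
    <name>ExampleOn</name>
    <description>Turns the example on</description>
    <contextlist><context>server config</context></contextlist>
  </directivesynopsis>
  <directivesynopsis>
    <name>ExampleLevel</name>
    <description>Sets the example level</description>
    <contextlist>
      <context>server config</context>
      <context>virtual host</context>
    </contextlist>
  </directivesynopsis>
</modulesynopsis>
"""


def split_into_solr_docs(source_xml, base_url):
    """Build a Solr <add> message with one <doc> per directive.

    Each sub-document shares base_url with its parent (assumed field
    "parent_url") and gets a unique id from an assumed "#name" anchor.
    """
    root = ET.fromstring(source_xml)
    add = ET.Element("add")
    for directive in root.iter("directivesynopsis"):
        name = directive.findtext("name", default="").strip()
        description = directive.findtext("description", default="").strip()
        fields = [
            ("id", f"{base_url}#{name.lower()}"),
            ("parent_url", base_url),
            ("type", "directive"),
            ("title", name),
            ("description", description),
        ]
        # A directive can be valid in several contexts: one field per context.
        fields += [
            ("context", ctx.text.strip())
            for ctx in directive.iter("context")
            if ctx.text
        ]
        doc = ET.SubElement(add, "doc")
        for field_name, value in fields:
            field = ET.SubElement(doc, "field", {"name": field_name})
            field.text = value
    return add


add = split_into_solr_docs(SAMPLE, "http://example.invalid/mod/mod_example.html")
print(ET.tostring(add, encoding="unicode"))
```

The printed XML is the kind of message that would be posted to Solr's update handler. Keeping the parent URL in its own field is one way to group sub-documents under their main document at query time.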

Right now the simple web search only returns the URL and a Description or Title, depending on the type of result. Much more could be returned, though, as I tried to match elements 1:1 from the httpd docs to the Solr-formatted documents. The potential is there to do things like a Context-only search, or to search only the Examples. It is also very easy to return whichever matching elements you wish (context, examples, usage, summary, notes, etc.) from the Solr documents that match your search.
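To make the field-scoped search idea concrete, here is a small Python sketch that builds a Solr select URL. The q, fl, and rows parameters are standard Solr query parameters; the endpoint address and the field names (context, example, and so on) are assumptions for illustration and would have to match the real schema.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust to wherever Solr is actually running.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_query_url(terms, field=None,
                    return_fields=("id", "title", "description"), rows=10):
    """Return a select URL, optionally restricting the search to one field."""
    if field:
        # Quote the terms as a phrase and escape characters that would
        # otherwise end the phrase early.
        escaped = terms.replace("\\", "\\\\").replace('"', '\\"')
        query = f'{field}:"{escaped}"'
    else:
        query = terms
    params = {
        "q": query,                      # what to search for
        "fl": ",".join(return_fields),   # which stored fields to return
        "rows": rows,                    # how many results to return
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


# A "Context only" search that also returns the matching examples.
url = build_query_url("virtual host", field="context",
                      return_fields=("id", "context", "example"))
print(url)
```

Changing the field argument to an assumed "example" field would give the Examples-only search, and the fl list controls which elements come back with each hit.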

The results are fed through another XSLT stylesheet, using Solr's built-in XSLT response writer, to generate the XHTML that makes up the results page.

If anyone is interested in the details, scripts, Solr schema, and config files I used to get this far, let me know and I will make them available somewhere. Just be nice when critiquing: I could barely spell XSLT when I started this project, and I still get it wrong now and then.

If you guys see any value in this, I will be happy to keep plugging away at it. It has been a great learning experience so far. I am at the point, though, where I need some direction, guidance, and testers to continue.

Thanks!

chris rhodes
[EMAIL PROTECTED]





