Hi,
I have been messing with Solr and the Apache HTTPD documents over the
past few months and may have finally produced something that might be
of use. You can view the not-entirely-ripe fruits-of-my-labor here:
http://www.tonybibbs.com/~arreyder/docsearch/
The work is done with a Perl script that runs the documents through an
XSLT stylesheet and then pushes the transformed XML into Lucene via
Solr. I have done a bit of tuning in Solr to get decent results, but
much more work is needed to get things right. I ended up breaking each
HTTPD document into many smaller documents. Each directive or section
became its own Solr sub-document, linked back to the main document via
the common portion of the URL.
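
A minimal sketch of that splitting step, in Python rather than the Perl
actually used. The field names (id, url, page_url, type, title), the
use of the URL fragment to tell sub-documents apart, and the sample
directives are all assumptions made for illustration; the real schema
may look quite different. Only the <add>/<doc>/<field> layout is Solr's
standard XML update format.

```python
import xml.etree.ElementTree as ET

def build_add_xml(page_url, sections):
    """Build a Solr <add> message with one <doc> per directive or section.

    The field names used here are hypothetical, not the actual schema.
    """
    add = ET.Element("add")
    for section in sections:
        doc = ET.SubElement(add, "doc")
        # Assumption: the fragment makes each sub-document unique, while
        # the shared page_url links it back to the main document.
        url = f"{page_url}#{section['anchor']}"
        fields = {
            "id": url,
            "url": url,
            "page_url": page_url,
            "type": section["type"],
            "title": section["title"],
        }
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")

xml_message = build_add_xml(
    "http://httpd.apache.org/docs/2.2/mod/core.html",
    [
        {"anchor": "documentroot", "type": "directive", "title": "DocumentRoot"},
        {"anchor": "errorlog", "type": "directive", "title": "ErrorLog"},
    ],
)
print(xml_message)
```

The resulting message would then be POSTed to Solr's update handler and
followed by a commit.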
Right now the simple web search only returns the URL and a Description
or Title, depending on the type of result. Much more could be returned,
though, as I tried to match elements 1:1 from the httpd docs to the
Solr-formatted documents. The potential is there to do things like a
Context-only search, or to search just the Examples. It is also very
easy to return whatever matching elements you wish (context, examples,
usage, summary, notes, etc.) from the Solr documents that match your
search.
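
To illustrate the Examples-only idea, here is a rough sketch of how
such a query URL could be built. The host, port, and handler path, the
field name "example", and the helper build_search_url are assumptions,
not the actual setup; q, fl, and rows are standard Solr query
parameters.

```python
from urllib.parse import urlencode

def build_search_url(base_url, terms, search_field=None,
                     return_fields=("url", "title")):
    """Build a Solr select URL, optionally limited to one field."""
    # field:(terms) is Lucene query syntax for a fielded search.
    # Assumption: terms contain no Lucene special characters.
    query = f"{search_field}:({terms})" if search_field else terms
    params = {
        "q": query,
        "fl": ",".join(return_fields),  # fields returned for each match
        "rows": 10,
    }
    return f"{base_url}?{urlencode(params)}"

# Hypothetical: search only the examples and return them with each hit.
examples_url = build_search_url(
    "http://localhost:8983/solr/select",
    "RewriteRule",
    search_field="example",
    return_fields=("url", "title", "example"),
)
print(examples_url)
```

Leaving search_field unset falls back to whatever default search field
the schema defines.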
The results are fed through another XSLT stylesheet, using Solr's
built-in response writer, to generate the XHTML that makes up the
results page.
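
For anyone who has not used it, a small sketch of how the XSLT response
writer is selected on a request. The stylesheet name results.xsl and
the helper xslt_query_string are hypothetical; wt and tr are the
parameters Solr uses for this.

```python
from urllib.parse import urlencode

def xslt_query_string(query, stylesheet):
    """Return query parameters that ask Solr to transform its response."""
    # wt=xslt selects the XSLT response writer; tr names the stylesheet,
    # which Solr looks for in the xslt/ directory under its conf directory.
    params = {"q": query, "wt": "xslt", "tr": stylesheet}
    return urlencode(params)

query_string = xslt_query_string("DocumentRoot", "results.xsl")
print(query_string)
```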
If anyone is interested in the details/scripts/Solr schema and config
files I used to get this far, let me know and I will make them available
somewhere. Just be nice when critiquing: I could barely spell XSLT when
I started this project, and I still get it wrong now and then.
If you guys see any value in this I will be happy to keep plugging away
at it. It has been a great learning experience so far. I am at the
point, though, where I need some direction/guidance/testers to continue.
Thanks!
chris rhodes
[EMAIL PROTECTED]