I've put together an experimental site that indexes the Javadoc of open-source Java projects (plus the JDK).
http://www.searchmorph.com/
This is meant to solve the problem where a Java developer knows something has been done before, but not where or in which project: SourceForge? Jakarta? Eclipse? JBoss?
There are at least 2 somewhat unusual things here. I use a custom analyzer ("JavadocAnalyzer"), which I recently mentioned on this list in another context. With it, a search for something like "thread pool" will match tokens like "SyncThreadPool" or "Sync_ThreadPool".
http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgId=1731360
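To illustrate the idea of matching "thread pool" against "SyncThreadPool", here is a minimal sketch of identifier splitting in plain Java. It is not the actual JavadocAnalyzer: the class name `IdentifierSplitter` and its splitting rules (break on non-alphanumeric characters and on lower-to-upper case transitions, then lowercase) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT the real JavadocAnalyzer: splits an identifier
// into lowercase sub-words so that "thread" and "pool" become searchable.
final class IdentifierSplitter {

    static List<String> split(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                // Separators such as '_' end the current sub-word.
                flush(current, parts);
                continue;
            }
            // A lower-to-upper transition ("cT" in "SyncThread") starts a new sub-word.
            if (current.length() > 0
                    && Character.isUpperCase(c)
                    && Character.isLowerCase(token.charAt(i - 1))) {
                flush(current, parts);
            }
            current.append(c);
        }
        flush(current, parts);
        return parts;
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString().toLowerCase(Locale.ROOT));
            current.setLength(0);
        }
    }
}
```

With these rules, both "SyncThreadPool" and "Sync_ThreadPool" yield the sub-words sync, thread, pool. Runs of capitals (e.g. "URLReader") are not split by this simple rule; a real analyzer would probably want to handle them, and might also index the original token alongside the parts.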
There's also an AIM (AOL Instant Messenger) bot running. You send it a query and it sends you back 5 URLs of matches: web search w/o a browser.
Internally, it also does query expansion so that query terms are checked against multiple fields (this may be similar to what Nutch does).
And I also use the MoreLikeThis query expansion code I wrote: from a results page you can find URLs similar to the hits you see. [BTW: this doesn't seem to have made it into the sandbox...]
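As a rough sketch of what checking each term against multiple fields can look like, the following plain-Java helper rewrites a user query into a Lucene-style query string. The class name `MultiFieldExpander`, the field names in the usage note, and the choice to require every term (AND) are all assumptions for illustration; the site's actual fields and operators are not described here.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper: expands each whitespace-separated term into an
// OR-group over the given fields, and ANDs the groups together.
final class MultiFieldExpander {

    static String expand(String userQuery, List<String> fields) {
        StringJoiner clauses = new StringJoiner(" AND ");
        for (String term : userQuery.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // blank input produces no clauses
            }
            StringJoiner perField = new StringJoiner(" OR ", "(", ")");
            for (String field : fields) {
                perField.add(field + ":" + term);
            }
            clauses.add(perField.toString());
        }
        return clauses.toString();
    }
}
```

For example, expanding "thread pool" over the made-up fields className and description gives `(className:thread OR description:thread) AND (className:pool OR description:pool)`. The sketch does not escape query-syntax characters in the user's input, which a real implementation would need to do.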
http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]&msgId=1353138
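For readers unfamiliar with "more like this" style expansion, the general idea is to pick the most distinctive terms of a hit and search for them. The sketch below shows one common way to do that with a tf * idf score in plain Java. It is not the MoreLikeThis code referenced above: the class `SimilarTerms`, the scoring formula, and the inputs (a document-frequency map and a total document count) are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: ranks a document's terms by tf * log(numDocs / df)
// and returns the highest-scoring ones as candidate query terms.
final class SimilarTerms {

    static List<String> topTerms(List<String> docTokens,
                                 Map<String, Integer> docFreq,
                                 int numDocs,
                                 int maxTerms) {
        // Term frequency within this one document.
        Map<String, Integer> tf = new HashMap<>();
        for (String token : docTokens) {
            tf.merge(token, 1, Integer::sum);
        }
        // Score each term; terms missing from docFreq are treated as df = 1.
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            int df = docFreq.getOrDefault(e.getKey(), 1);
            double idf = Math.log((double) numDocs / df);
            score.put(e.getKey(), e.getValue() * idf);
        }
        // Highest score first; ties broken alphabetically for stable output.
        List<String> terms = new ArrayList<>(score.keySet());
        terms.sort((a, b) -> {
            int byScore = Double.compare(score.get(b), score.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }
}
```

The selected terms would then be OR-ed into a new query against the index. Very common words score near zero because their document frequency approaches the collection size, so they drop out without needing a stop list.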
The about page is here: http://www.searchmorph.com/weblog/index.php?id=7
And the "technology inside" page elaborates a bit more: http://www.searchmorph.com/weblog/index.php?id=3
I'm interested in feedback. Does it find matches you expect, and what other packages should I index?
thx, Dave
PS
Surely this has been done before. What's the "competition"? Are there any other similar specialized search engines?