That is awesome and very inspirational! Carrot2 looks very interesting. I'm wondering whether anybody has a list of the academic research projects that use Lucene. The only other one I know of is Striver, which uses a support vector machine to learn the ranking function: http://www.cs.cornell.edu/People/tj/career/
Aneesha

> For my own amusement I've indexed the Wikipedia and put up pages that:
> - display search results
> - cluster the results using Carrot2 (my first use of this)
> - display similar pages using the entire text to re-query for similar
>   docs and
> - display similar pages using the "more like this" algorithm (TBD is get
>   this into the sandbox, sorry for delays..)
>
> You start off here to search:
>
> http://www.searchmorph.com/kat/wikipedia.jsp
>
> And the weblog entry goes into a bit more detail:
>
> http://www.searchmorph.com/weblog/index.php?id=37
>
> It's kinda fun to explore the Wikipedia by looking for pages similar to
> other ones.
>
> Hope people find this useful...
>
> - Dave
>
> PS
> I'm in the process of running the page rank algorithm (from
> jung.sf.net) on most of the entries in the Wikipedia. It has taken over
> 2 days so far....

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]