For my own amusement I've indexed the Wikipedia and put up pages that:
- display search results
- cluster the results using Carrot2 (my first time using it)
- display similar pages by re-querying with a page's entire text, and
- display similar pages using the "more like this" algorithm (still TBD: getting this into the sandbox, sorry for the delays)
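As a rough illustration of the "more like this" idea in the last bullet (this is not the code behind the pages, just a sketch): instead of re-querying with the entire text, pick the few terms of the source page that carry the most weight and query with only those. The tiny corpus, the function name `interesting_terms`, and the TF-IDF formula below are all assumptions made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace; a real analyzer does much more.
    return text.lower().split()


def interesting_terms(doc, corpus, max_terms=3):
    """Return the max_terms terms of doc with the highest TF-IDF weight."""
    n_docs = len(corpus)
    # Document frequency: how many corpus docs contain each term.
    df = Counter()
    for other in corpus:
        df.update(set(tokenize(other)))
    tf = Counter(tokenize(doc))
    scores = {}
    for term, freq in tf.items():
        # Smoothed IDF (an assumed formula, chosen for illustration).
        idf = math.log((n_docs + 1) / (df[term] + 1)) + 1.0
        scores[term] = freq * idf
    # Highest score first; ties broken alphabetically so output is stable.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:max_terms]


# Hypothetical three-document corpus.
corpus = [
    "lucene is a search library",
    "carrot2 is a clustering library",
    "lucene search search index",
]
query_terms = interesting_terms(corpus[2], corpus, max_terms=2)
print(query_terms)
```

The selected terms would then be OR'ed together into a normal query, which is usually much cheaper than a query built from every word on the page.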



You start off here to search:

        http://www.searchmorph.com/kat/wikipedia.jsp


And the weblog entry goes into a bit more detail:

        http://www.searchmorph.com/weblog/index.php?id=37



It's kinda fun to explore the Wikipedia by looking for pages similar to other ones.

Hope people find this useful...

- Dave

PS
I'm in the process of running the PageRank algorithm (from jung.sf.net) on most of the entries in the Wikipedia. It has taken over 2 days so far...
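For anyone curious what that computation looks like, here is a minimal power-iteration sketch of PageRank. It does not use the JUNG API; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the three-page link graph are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    # Start with rank spread evenly over all pages.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: A links to B and C, B to C, C back to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(links)
print(sorted(ranks.items(), key=lambda kv: -kv[1]))
```

Each iteration touches every link once, so the cost grows with the number of links times the number of iterations.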

