SenseClusters (http://senseclusters.sourceforge.net) participated in the sense induction task of Senseval-4/SemEval-1. Additional information about the task (including the data) can be found here: http://ixa2.si.ehu.es/semeval-senseinduction/
For this task I used the currently released version of SenseClusters (0.95) with fairly standard settings. In short, I used second order context vectors, bigram features selected using PMI with a window size of 12, no SVD, k-means clustering, and cluster stopping with the adapted gap statistic. I did not tune these settings on any data. In fact, the results I submitted to the task were based on the second or third run I did, where the only thing I varied was the frequency cutoff for feature identification (I think my initial cutoff was too high, resulting in few or no features being selected for some words).

More detailed discussion can be found here:

UMND2 : SenseClusters Applied to the Sense Induction Task of Senseval-4 (Pedersen) - Appears in the Proceedings of SemEval-2007: 4th International Workshop on Semantic Evaluations, June 23-24, 2007, Prague, Czech Republic.
http://www.d.umn.edu/~tpederse/Pubs/umnd2-sval4.pdf

This paper includes a description of the system as well as some discussion of unsupervised evaluation techniques for discrimination / disambiguation systems, which I think remains a very open and interesting area of work.

Enjoy,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
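
For readers who want a concrete picture of the pipeline summarized above (PMI-selected bigram features, second order context vectors, k-means), here is a minimal Python sketch. It is not SenseClusters code (SenseClusters is a Perl package) and does not reproduce the submitted system. All function names are hypothetical, and several choices are assumptions made for illustration: a window of N is read as "the second word occurs within N tokens of the first", PMI scores are used as the vector entries, k is fixed by hand instead of being chosen by the adapted gap statistic, and k-means uses a simple deterministic farthest-first initialization.

```python
import math
from collections import Counter


def pmi_features(contexts, window=12, min_freq=2, min_pmi=0.0):
    """Select ordered word pairs (bigrams with gaps allowed) by PMI.

    Assumption: w2 counts as following w1 if it is one of the next
    window - 1 tokens. min_freq plays the role of a frequency cutoff.
    """
    pairs, left, right = Counter(), Counter(), Counter()
    for tokens in contexts:
        for i, w1 in enumerate(tokens):
            for w2 in tokens[i + 1 : i + window]:
                pairs[(w1, w2)] += 1
                left[w1] += 1
                right[w2] += 1
    total = sum(pairs.values())
    selected = {}
    for (w1, w2), n in pairs.items():
        if n < min_freq:
            continue  # a cutoff that is too high leaves few or no features
        pmi = math.log2(n * total / (left[w1] * right[w2]))
        if pmi > min_pmi:
            selected[(w1, w2)] = pmi
    return selected


def second_order_vectors(contexts, features):
    """Represent each context as the average of its words' feature vectors."""
    dims = sorted({w2 for _, w2 in features})
    index = {w: j for j, w in enumerate(dims)}
    word_vecs = {}
    for (w1, w2), score in features.items():
        word_vecs.setdefault(w1, [0.0] * len(dims))[index[w2]] = score
    vectors = []
    for tokens in contexts:
        rows = [word_vecs[t] for t in tokens if t in word_vecs]
        if rows:
            vectors.append([sum(col) / len(rows) for col in zip(*rows)])
        else:
            vectors.append([0.0] * len(dims))  # no known words in context
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with farthest-first initialization (a simplification)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy contexts for the ambiguous word "line" (made-up data).
toy_contexts = [
    "the phone line was busy".split(),
    "the phone line was dead".split(),
    "cast the fishing line out".split(),
    "cast the fishing line far".split(),
]
toy_features = pmi_features(toy_contexts, window=12, min_freq=2)
toy_labels = kmeans(second_order_vectors(toy_contexts, toy_features), k=2)
print(toy_labels)
```

Raising min_freq in this sketch quickly empties the feature set on small data, which is the same kind of effect as the frequency cutoff issue described above. Choosing k automatically would need a cluster stopping measure, which this sketch leaves out.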
