Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-22 Thread Matt Shotwell
Martin, pardon the delayed reply. Bootstrap methods have been around for some time (late seventies?), but their popularity seems to have exploded in step with advances in computing technology. You should be able to find more information in most modern books on statistical inference, but here is a
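For readers who want something concrete, the following is a rough sketch of one common way to bootstrap an agreement statistic between two labelings. It is not necessarily the procedure described in this message. The labels are made up, the names `rand_index`, `labels_a` and `labels_b` are hypothetical, and resampling only the label pairs (rather than re-running the clustering on each resample) is a simplifying assumption.

```r
# Rand index: share of observation pairs on which two labelings agree
# (both put the pair together, or both keep it apart).
rand_index <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)            # count each unordered pair once
  mean(same_a[pairs] == same_b[pairs])
}

set.seed(1)
n <- 60
labels_a <- sample(1:3, n, replace = TRUE)             # made-up labels
labels_b <- labels_a
changed <- sample(n, 10)
labels_b[changed] <- sample(1:3, 10, replace = TRUE)   # reassign 10 at random

observed <- rand_index(labels_a, labels_b)

# Resample observations with replacement and recompute the index each time
boot <- replicate(500, {
  idx <- sample(n, replace = TRUE)
  rand_index(labels_a[idx], labels_b[idx])
})

# Percentile interval for the index
ci <- quantile(boot, c(0.025, 0.975))
print(observed)
print(ci)
```

A resample can contain the same observation more than once, and such duplicate pairs always count as agreement, so the interval from this simple version may be somewhat optimistic.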

[R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Martin Tomko
Dear all, I am having a hard time figuring out a suitable test for the match between two nominal classifications of the same set of data. I have used hierarchical clustering with multiple methods (ward, k-means,...) to classify my data into a set number of classes, and I would like to compare

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Matt Shotwell
There are several statistics used to compare nominal classifications, or _partitions_ of a data set. A partition isn't quite the same in this context because partitioned data are not restricted to a fixed number of classes. However, the statistics used to compare partitions should also work for
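As an illustration of this kind of statistic, here is a minimal base R sketch of the Rand index, which is named later in the thread. The function name `rand_index` and the example vectors are assumptions made for this sketch, not something taken from the message.

```r
# Rand index from the contingency table of two label vectors.
# It counts pairs of observations that both partitions treat the same way.
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  tab <- table(a, b)
  choose2 <- function(x) x * (x - 1) / 2
  total     <- choose2(sum(tab))            # all pairs of observations
  same_both <- sum(choose2(tab))            # together in both partitions
  same_a    <- sum(choose2(rowSums(tab)))   # together in partition a
  same_b    <- sum(choose2(colSums(tab)))   # together in partition b
  # agreements = together in both + apart in both
  (total + 2 * same_both - same_a - same_b) / total
}

# Class labels are arbitrary, so a pure relabeling counts as full agreement
a <- c(1, 1, 2, 2, 3, 3)
b <- c("x", "x", "y", "y", "z", "z")
print(rand_index(a, b))   # 1

# The two partitions may use different numbers of classes
print(rand_index(c(1, 1, 1, 2, 2, 2), c(1, 1, 2, 2, 3, 3)))
```

Because the index only looks at pairs, it does not need the two classifications to share label names or even the same number of classes.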

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Martin Tomko
Thanks Matt, I have in the meantime identified the Rand index, but not the others. I will also have a look at profdpm, which did not pop up in my searches. Indeed, the interpretation is going to be critical... Could you please elaborate on what you mean by the bootstrap process? Thanks a lot

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Marc Schwartz
On Nov 17, 2010, at 7:33 AM, Martin Tomko wrote: Dear all, I am having a hard time figuring out a suitable test for the match between two nominal classifications of the same set of data. I have used hierarchical clustering with multiple methods (ward, k-means,...) to classify my data into

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Mattia Prosperi
Another useful measure to compare partitions is the adjusted Rand index, which is implemented in the classAgreement function of the e1071 package. If you have the data partitions to be compared in matrix form (where each column is a different partition), the syntax is
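A hedged sketch of how such a call might look follows; it is not the exact syntax from this message. The `iris` data, the matrix name `parts`, and the helper `adjusted_rand` are assumptions for illustration. `classAgreement` takes a contingency table and returns the corrected (adjusted) Rand index as `crand`. The `"ward.D2"` method name applies to current R versions; at the time of the thread it was `"ward"`.

```r
# Base R version of the adjusted Rand index, used here as a cross-check
adjusted_rand <- function(tab) {
  choose2 <- function(x) x * (x - 1) / 2
  both <- sum(choose2(tab))
  rows <- sum(choose2(rowSums(tab)))
  cols <- sum(choose2(colSums(tab)))
  expected <- rows * cols / choose2(sum(tab))   # value expected by chance
  (both - expected) / ((rows + cols) / 2 - expected)
}

# Illustrative data: two clusterings of the same observations
set.seed(42)
x  <- scale(iris[, 1:4])
hc <- cutree(hclust(dist(x), method = "ward.D2"), k = 3)
km <- kmeans(x, centers = 3, nstart = 20)$cluster

# One column per partition, as described above
parts <- cbind(ward = hc, kmeans = km)

# Cross-tabulate the two columns to be compared
tab <- table(parts[, "ward"], parts[, "kmeans"])
ari_base <- adjusted_rand(tab)
print(ari_base)

# Same quantity from e1071, if the package is installed
if (requireNamespace("e1071", quietly = TRUE)) {
  ari_e1071 <- e1071::classAgreement(tab)$crand
  print(c(base = ari_base, e1071 = ari_e1071))
}
```

With more than two columns in `parts`, the same cross-tabulation can be repeated for each pair of columns.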

Re: [R] statistical test for comparison of two classifications (nominal)

2010-11-17 Thread Martin Tomko
Thank you Mattia for the great suggestion, I will try the additional tests. I have just been experimenting with the e1071 package and the adjusted Rand index. It works perfectly. The only outstanding question is interpretation - is there any rule of thumb for the level of agreement that needs to be