Re: [R] Very Slow Gower Similarity Function

Tyler Smith Mon, 18 Apr 2005 13:25:45 -0700

Quoting Martin Maechler <[EMAIL PROTECTED]>:

> I don't know what exactly you want.


The Gower coefficient I am referring to comes from his 1971 article in
Biometrics (27(4):857-871). It differs from most commonly used measures (but
not, apparently, daisy!) by allowing quantitative and qualitative (binary or
unordered multistate) variables to be combined in a single coefficient, and
also by providing a mechanism for dropping missing values from similarity
calculations. This is also covered in Legendre and Legendre.
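
To make the description above concrete, here is a minimal Python sketch of a
Gower-style similarity between two records. It is an illustration under stated
assumptions, not the code under discussion in this thread: the function name,
the "num"/"cat" labels and the use of None for a missing value are all
hypothetical, and binary characters are treated as ordinary two-state
categories, which skips Gower's special handling of joint absences.

```python
def gower_similarity(x, y, kinds, ranges):
    """Gower-style similarity between two records x and y.

    kinds[k]  is "num" (quantitative) or "cat" (unordered multistate).
    ranges[k] is the observed range of variable k (ignored for "cat").
    None marks a missing value; any variable missing in either record
    is dropped from both the numerator and the denominator.
    """
    total = 0.0
    used = 0
    for xk, yk, kind, rk in zip(x, y, kinds, ranges):
        if xk is None or yk is None:
            continue  # missing value: this variable is not compared
        if kind == "num":
            # "Ranging": absolute difference divided by the variable's range.
            # A zero range means all values are equal, so score a full match.
            score = 1.0 - abs(xk - yk) / rk if rk else 1.0
        else:
            # Qualitative variable: 1 for a match, 0 for a mismatch.
            score = 1.0 if xk == yk else 0.0
        total += score
        used += 1
    return total / used if used else float("nan")


# Hypothetical records: one quantitative, one qualitative, one quantitative.
kinds = ("num", "cat", "num")
ranges = (4.0, None, 10.0)
a = (1.0, "red", None)   # third value is missing
b = (3.0, "red", 5.0)
s = gower_similarity(a, b, kinds, ranges)
```

Only the first two variables are compared here, because the third is missing
in one record; the similarity is the mean of their two scores.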

>
> The function  daisy() in the recommended package "cluster"
> has always worked with missing values and IIRC, the book
> "Kaufman & Rousseeuw" {which I have not at hand here at home},
> clearly mentions Gower's origin of their distance measure
> definition.

I was unaware of the daisy function. Looking over it now, it differs from the
Gower coefficient primarily in the method of standardization. Gower
standardized each variable by dividing it by its range ("ranging"), whereas
daisy does a more conventional standardization (subtracting the mean and
dividing by the SD). As I understand it, there isn't much to recommend
standardizing over ranging (or vice versa), so daisy may provide a useful
alternative for my project. I'll have to look into it!
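
The two scalings compared above can be sketched side by side in Python. This
is only an illustration of the description in this message: the function
names and sample values are hypothetical, and the mean/SD form is taken from
the wording here, not checked against what daisy actually computes.

```python
from statistics import mean, stdev


def ranged(values):
    """'Ranging': divide each value by the range of the variable."""
    span = max(values) - min(values)
    return [v / span for v in values]


def standardized(values):
    """Conventional standardization: subtract the mean, divide by the SD."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation
    return [(v - m) / sd for v in values]


values = [2.0, 4.0, 6.0, 10.0]  # hypothetical measurements
r = ranged(values)
z = standardized(values)

# After ranging, the largest difference between any two values is 1,
# so each variable contributes at most 1 to a dissimilarity.
largest_ranged_difference = max(r) - min(r)
```

After ranging, differences between values are bounded by 1. After
standardizing, the values have mean 0 and SD 1, but differences between them
are not bounded in the same way.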

Thanks,

Tyler

>
> Martin Maechler, maintainer of cluster package,
> ETH Zurich
>

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
