Categorical Features for K-Means Clustering

2014-07-11 Thread Wen Phan
Hi Folks, Does anyone have experience or recommendations on incorporating categorical features (attributes) into k-means clustering in Spark? In other words, I want to cluster on a set of attributes that include categorical variables. I know I could probably implement some custom code to

Re: Categorical Features for K-Means Clustering

2014-07-11 Thread Wen Phan
, depending on your use case. So a dimension that takes on 3 categorical values becomes 3 dimensions, all of which are 0 except one that has value 1. On Fri, Jul 11, 2014 at 3:07 PM, Wen Phan wen.p...@mac.com wrote: Hi Folks, Does any one have experience or recommendations on incorporating
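
For illustration, the encoding described in the reply (one categorical dimension with 3 values becoming 3 dimensions, exactly one of which is 1) might look roughly like the sketch below. It is plain Python rather than anything Spark-specific, and the helper name `one_hot_encode`, the sample rows, and the column layout are all made up for the example, not taken from the thread.

```python
def one_hot_encode(rows, categorical_index):
    """Replace one categorical column with 0/1 indicator columns.

    Hypothetical helper for illustration: `rows` is a list of tuples whose
    other columns are numeric, and `categorical_index` is the position of
    the categorical column.
    """
    # Fix a stable ordering of the distinct category values.
    categories = sorted({row[categorical_index] for row in rows})
    position = {cat: i for i, cat in enumerate(categories)}

    encoded = []
    for row in rows:
        # One indicator per category: all 0 except the matching one.
        indicators = [0.0] * len(categories)
        indicators[position[row[categorical_index]]] = 1.0
        # Keep the remaining (numeric) columns in their original order.
        numeric = [float(v) for i, v in enumerate(row) if i != categorical_index]
        encoded.append(numeric + indicators)
    return categories, encoded


# Made-up sample data: one numeric feature and one 3-valued categorical feature.
rows = [(1.0, "red"), (2.5, "green"), (0.3, "blue"), (4.0, "red")]
categories, vectors = one_hot_encode(rows, categorical_index=1)
```

The resulting all-numeric vectors could then be handed to whichever k-means implementation is in use. Whether the indicator columns need rescaling relative to the numeric ones depends on the use case.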