OK, so UserId is not a good field for clustering. But if I want to cluster users, which combination of fields should I use? (Just for my understanding.)
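
To make the question concrete, here is a minimal sketch of how I understand the advice so far: UserId is kept only as a label, and just location, age and gender go into the feature vector. The records, field names and the `build_features` helper are made up for illustration, and the one-hot plus min-max encoding is my assumption, not something suggested in this thread.

```python
# Hypothetical profile records; the values are invented for illustration only.
users = [
    {"user_id": "U000000001", "location": "Delhi", "age": 25, "gender": "M"},
    {"user_id": "U000000002", "location": "Mumbai", "age": 40, "gender": "F"},
    {"user_id": "U000000003", "location": "Delhi", "age": 31, "gender": "F"},
]


def build_features(users):
    """Return (ids, rows): ids label the rows, rows are the clustering input.

    Assumed encoding: one-hot for location and gender, min-max scaling for age.
    The user id is deliberately left out of the feature vector.
    """
    locations = sorted({u["location"] for u in users})
    genders = sorted({u["gender"] for u in users})
    ages = [u["age"] for u in users]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1  # avoid dividing by zero when all ages are equal

    ids, rows = [], []
    for u in users:
        row = [1.0 if u["location"] == loc else 0.0 for loc in locations]
        row += [1.0 if u["gender"] == g else 0.0 for g in genders]
        row.append((u["age"] - lo) / span)  # age scaled into [0, 1]
        ids.append(u["user_id"])
        rows.append(row)
    return ids, rows


ids, rows = build_features(users)
```

With this layout every feature lies in [0, 1], so no single field dominates a distance measure just because of its raw magnitude, and the ids are only used to map cluster assignments back to users.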

On Tue, Feb 18, 2014 at 1:44 AM, Ted Dunning <[email protected]> wrote:

> On Mon, Feb 17, 2014 at 9:00 AM, Bikash Gupta <[email protected]> wrote:
>
>> Let's say I am clustering users, and I am providing their profile data to
>> discover similarity between two users.
>>
>> So my input would be [UserId, Location, Age, Gender, Time Created]
>>
>> Now if my UserId length is a minimum of 10 characters, which is a
>> comparatively very large number next to the other categorical data.
>>
>
> User id is not a good field for clustering.
>
> Location is fine if you want geographical clustering.
>
> Location + age + gender is fine for geo-demographical clustering.
>
> Adding time created might give a tiny bit of insight.
>
> But these fields are not going to lead to great insights.

--
Thanks & Regards
Bikash Kumar Gupta
