That really depends on what you want to do. What is it that you want?
On Mon, Feb 17, 2014 at 12:25 PM, Bikash Gupta <[email protected]> wrote:

> Ok...so UserId is not a good field for this combination, but if I want
> User Clustering, what should the combination be (just for
> understanding)?
>
> On Tue, Feb 18, 2014 at 1:44 AM, Ted Dunning <[email protected]>
> wrote:
> > On Mon, Feb 17, 2014 at 9:00 AM, Bikash Gupta <[email protected]>
> > wrote:
> >
> >> Let's say I am clustering users, and I am providing their profile
> >> data to discover similarity between two users.
> >>
> >> So my input would be [UserId, Location, Age, Gender, Time Created]
> >>
> >> Now, my UserId has a minimum length of 10 characters, which is a
> >> comparatively very large number next to the other categorical data.
> >>
> >
> > User id is not a good field for clustering.
> >
> > Location is fine if you want geographical clustering.
> >
> > Location + age + gender is fine for geo-demographical clustering.
> >
> > Adding time created might give a tiny bit of insight.
> >
> > But these fields are not going to lead to great insights.
> >
>
> --
> Thanks & Regards
> Bikash Kumar Gupta
>
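For illustration only, here is a minimal sketch of how the fields discussed in the thread might be turned into vectors for clustering. Everything in it is an assumption, not something from the thread: the record layout, the sample values, treating location as a plain category, and the helper names `build_vocab`, `one_hot`, and `to_features`. The point it shows is that the user id is kept only as a key and never enters the vector, while location and gender are one-hot encoded and age is scaled to [0, 1] so that no single field dominates a distance measure.

```python
# Hypothetical profile records; field names and values are illustrative only.
users = [
    {"user_id": "U000000001", "location": "Pune", "age": 24, "gender": "M"},
    {"user_id": "U000000002", "location": "Pune", "age": 31, "gender": "F"},
    {"user_id": "U000000003", "location": "Delhi", "age": 45, "gender": "F"},
]


def build_vocab(records, field):
    """Return the sorted distinct values of a categorical field."""
    return sorted({r[field] for r in records})


def one_hot(value, vocab):
    """Encode a categorical value as a 0/1 vector over the vocabulary."""
    return [1.0 if value == v else 0.0 for v in vocab]


def to_features(records):
    """Map user_id -> feature vector; the id itself is never a feature."""
    locations = build_vocab(records, "location")
    genders = build_vocab(records, "gender")
    ages = [r["age"] for r in records]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1  # avoid division by zero when all ages are equal

    rows = {}
    for r in records:
        scaled_age = (r["age"] - lo) / span  # min-max scale age to [0, 1]
        rows[r["user_id"]] = (
            one_hot(r["location"], locations)
            + one_hot(r["gender"], genders)
            + [scaled_age]
        )
    return rows


features = to_features(users)
for user_id, vector in features.items():
    print(user_id, vector)
```

The resulting vectors could then be handed to whatever clustering implementation you use; which distance measure and how to weight the fields against each other still depends on what you want the clusters to mean.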
