Thank you! I understand a lot better now.
For the clustering, I should write my own distance measure class rather than try to assign numerical values to colors.

2011/8/18 Ted Dunning <[email protected]>

> Just the opposite. Frequent itemset would discover groups of TV channels
> and colors that occur together. That might be slightly interesting, but
> probably not so useful.
>
> For that you want clustering, but you will have to decide how similar
> colors are. You might just say that if they are the same, the distance
> is 0, while different means a distance of 1.
>
> Or you could do SVD first and then cluster (that is spectral clustering,
> ish).
>
> 2011/8/18 Clément Notin <[email protected]>
>
> > Ok, thanks!
> >
> > So if I want to discover groups of customers based on, for example, their
> > favorite color, their favorite TV channel and the brand of their cellular
> > phone (it's an example...), should I use frequent itemset mining instead
> > of clustering?
> >
> > 2011/8/17 Ted Dunning <[email protected]>
> >
> > > Both clustering and frequent itemset algorithms are unsupervised
> > > learning methods.
> > >
> > > Clustering uses your definition of near and far to find (hopefully)
> > > clumps of data.
> > >
> > > Frequent item-set analysis looks for cases where items cooccur. The
> > > origin is in what is called market-basket analysis, where the goal was
> > > to find items that are commonly purchased together.
> > >
> > > For most purposes, I recommend simple cooccurrence analysis.
> > >
> > > I think that your confusion stems from you telling the frequent itemset
> > > code to find item characteristics that often occur together on the same
> > > item. That probably isn't what you want.
> > >
> > > 2011/8/17 Clément Notin <[email protected]>
> > >
> > > > Hello Mahout!
> > > >
> > > > I'm unable to find the answer (trust me, I tried!) to a simple
> > > > question: what is the difference between clustering and frequent
> > > > itemset mining?
> > > >
> > > > I think that frequent itemset mining could help me to cluster things
> > > > based on colors or other non-numerical characteristics. I thought
> > > > about converting these values to numbers, but it doesn't always make
> > > > sense (what order should I use? Blue is near purple, OK, so blue = 1
> > > > and purple = 2, but is this car, for example, near that one?).
> > > >
> > > > Thanks for reading.
> > > > Regards,
> > > >
> > > > --
> > > > Clément Notin
> >
> > --
> > Clément Notin

--
Clément Notin
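The 0/1 distance suggested in the thread (same value means distance 0, different value means distance 1) can be sketched roughly as follows. This is only an illustration in plain Python, not Mahout's own distance measure interface. The function name mismatch_distance and the sample customers are made up for the example. It assumes each customer is a fixed-length tuple of categorical values and averages the per-attribute 0/1 distances, which is one possible way to combine them.

```python
def mismatch_distance(a, b):
    """Fraction of attributes on which two records differ.

    Each attribute contributes 0 if the two values are equal and 1 otherwise,
    so no artificial ordering (blue = 1, purple = 2, ...) is needed.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    if not a:
        return 0.0
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


# Hypothetical customers: (favorite color, favorite TV channel, phone brand)
alice = ("blue", "news", "BrandA")
bob = ("blue", "sports", "BrandA")
carol = ("red", "sports", "BrandB")

print(mismatch_distance(alice, bob))    # differ on one of three attributes
print(mismatch_distance(alice, carol))  # differ on all three attributes
```

A clustering algorithm that only needs pairwise distances could use a function like this directly. Algorithms that compute centroids, such as k-means, would need extra care, since the mean of a set of colors is not defined.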
