Thank you!

I understand a lot better now.

For the clustering, I should write my own distance measure class rather than
try to assign numerical values to colors.
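
To check that I have understood the "same = 0, different = 1" idea, here is a
minimal sketch in plain Python. It is not Mahout's API: the class name
MismatchDistance, the sample customers, and the choice to average the
per-attribute mismatches so that the result stays between 0 and 1 are all my
own assumptions for illustration.

```python
from typing import Sequence


class MismatchDistance:
    """Distance between two records of categorical attributes.

    Each attribute contributes 0 if the two values are equal and 1 otherwise,
    so no numeric order is ever imposed on colors, channels or brands.
    """

    def distance(self, a: Sequence[str], b: Sequence[str]) -> float:
        if len(a) != len(b):
            raise ValueError("records must have the same number of attributes")
        if not a:
            return 0.0
        mismatches = sum(1 for x, y in zip(a, b) if x != y)
        # Assumption: divide by the attribute count so the result is in [0, 1].
        return mismatches / len(a)


# Hypothetical customers: (favorite color, favorite TV channel, phone brand)
alice = ("blue", "channel-1", "brand-a")
bob = ("purple", "channel-1", "brand-a")
carol = ("red", "channel-2", "brand-b")

measure = MismatchDistance()
d_ab = measure.distance(alice, bob)    # differ on one attribute out of three
d_ac = measure.distance(alice, carol)  # differ on every attribute
```

With this measure, blue is exactly as far from purple as from red, which is
the point: similarity between colors only enters if I decide to encode it.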

2011/8/18 Ted Dunning <[email protected]>

> Just the opposite.  Frequent itemset would discover groups of tv channels
> and colors that occur together.  That might be slightly interesting, but
> probably not so useful.
>
> For that you want clustering, but you will have to decide how similar
> colors are.  You might just say that if they are the same, distance is 0
> while different means distance 1.
>
> Or you could do SVD first and then cluster (that is spectral clustering,
> ish).
>
> 2011/8/18 Clément Notin <[email protected]>
>
> > OK, thanks!
> >
> > So if I want to discover groups of customers based on, for example, their
> > favorite color, their favorite TV channel and the brand of their cellular
> > phone (it's an example...), should I use frequent itemset mining instead
> > of clustering?
> >
> > 2011/8/17 Ted Dunning <[email protected]>
> >
> > > Both clustering and frequent itemset algorithms are unsupervised
> > > learning methods.
> > >
> > > Clustering uses your definition of near and far to find (hopefully)
> > > clumps of data.
> > >
> > > Frequent item-set analysis looks for cases where items cooccur.  The
> > > origin is in what is called market-basket analysis where the goal was to
> > > find items that are commonly purchased together.
> > >
> > > For most purposes, I recommend simple cooccurrence analysis.
> > >
> > > I think that your confusion stems from you telling the frequent itemset
> > > code to find item characteristics that often occur together on the same
> > > item.  That probably isn't what you want.
> > >
> > > 2011/8/17 Clément Notin <[email protected]>
> > >
> > > > Hello Mahout!
> > > >
> > > > I'm unable to find the answer (trust me, I tried!) to a simple question:
> > > > what is the difference between clustering and frequent itemset mining?
> > > >
> > > > I think that frequent itemset mining could help me cluster things based
> > > > on colors or other non-numerical characteristics. I thought about
> > > > converting these values to numbers, but it doesn't always make sense
> > > > (what order should I use? Blue is near purple, OK, so blue = 1 and
> > > > purple = 2, but is this car, for example, near that one?).
> > > >
> > > > Thanks for reading.
> > > > Regards,
> > > >
> > > > --
> > > > *Clément **Notin*
> > > >
> > >
> >
> >
> >
> > --
> > *Clément **Notin*
> >
>



-- 
*Clément **Notin*
