On 02/03/2012 05:33 PM, Ben Clay wrote:
Hi-
I am using Mean Shift clustering with good results. Mean Shift was
chosen because I don't know the number of clusters ahead of time, and
the number of samples is very small (<100) so performance is a non-issue.
Now I need to enforce an aging scheme, so that older samples influence
the clustering less than newer samples. My knowledge of clustering is
limited, but I'm looking for a way to weight the newer samples higher,
such that the algorithm tries harder to minimize their distance from
the cluster centers as compared to older samples.
From looking through scikit-learn, I don't see a way to weight input
samples with Mean Shift or any other clustering algorithm. Google
yielded several papers on the subject but they quickly went over my head.
Does anyone know of a way to do this, either with a scikit-learn
clustering class or otherwise? Since performance is not a concern,
I'd be open to any hacky solutions, such as multiple rounds of
clustering or filtering.
Thanks!
Hi Ben.
A simple hack that comes to mind for weighting samples is replication: if
you want a sample to have twice the weight, just put it in the training
set twice. For your aging scheme, that means replicating newer samples
more often than older ones.
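The replication hack could look something like this; the recency weights
here are made up for illustration, and you would pick your own aging
scheme (e.g. integer weights decaying with sample age):

```python
# Sketch: integer "recency" weights via sample replication, then
# plain MeanShift on the replicated data. Weights are hypothetical.
import numpy as np
from sklearn.cluster import MeanShift

rng = np.random.RandomState(0)
X = rng.randn(20, 2)  # 20 samples, 2 features

# Hypothetical weights: newer samples (higher index) count more.
weights = np.linspace(1, 5, num=len(X)).astype(int)

# Replicate each row according to its weight.
X_weighted = np.repeat(X, weights, axis=0)

ms = MeanShift()
ms.fit(X_weighted)
print(ms.cluster_centers_.shape)
```

Since your sample count is tiny (<100), the blow-up from replication
should not matter performance-wise.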
I'm not sure what you mean by "newer samples".
Are you in an online setting where you get one sample at a time?
And what are you interested in? The evolution of clusters?
Afaik, the only clustering algorithm in sklearn that supports iterative
refinement is minibatch K-Means.
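If you are in an online setting, minibatch K-Means can be updated one
batch at a time via partial_fit. A minimal sketch (note that, unlike
Mean Shift, you have to fix the number of clusters in advance, which may
rule it out for your use case):

```python
# Sketch of online updates with MiniBatchKMeans.partial_fit.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
mbk = MiniBatchKMeans(n_clusters=3, random_state=0)

# Feed samples in small batches as they arrive.
for _ in range(10):
    batch = rng.randn(5, 2)
    mbk.partial_fit(batch)

print(mbk.cluster_centers_.shape)
```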
Cheers,
Andy
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general