There are literally hundreds of algorithms for k-means alone, and that isn't even counting clustering methods that don't optimize the k-means figure of merit.

On Tue, Dec 4, 2012 at 5:05 PM, Dan Filimon <[email protected]> wrote:

> On Tue, Dec 4, 2012 at 10:00 AM, Ted Dunning <[email protected]> wrote:
> > I didn't know about BFR at the time and I always tend to choose
> > simplicity in any case.
> >
> > The theoretical bounds for streaming k-means are also persuasive. The
> > other strong-ish candidate is k-means++, but it doesn't have the
> > required sketch architecture in the form that they have analyzed.
> >
> > BFR is a reasonable candidate for follow-on work, but we should drive
> > to conclusion with the current algorithm first.
>
> We should definitely focus on this algorithm for now. I was just
> surprised to find another one I hadn't known about. :)
>
> > On Mon, Dec 3, 2012 at 6:47 PM, Dan Filimon <[email protected]> wrote:
> >>
> >> My question is... why did we pick streaming k-means in particular as
> >> opposed to this algorithm. BFR seems like a decent candidate for the
> >> mapper clustering and while it looks more complex (algorithmically) I
> >> wonder how the clustering quality compares to streaming k-means?
> >>
> >> What are your thoughts on this?
