Hi Ryan I wanted to clear that the problem we are solving is essentially k-center one, right ? and not k-means. We already have an implementation of dual tree k-means. >From my understanding, the difference is that in k-centers, the center points or centroids are the points that necessarily belong to the dataset, and for k-means, it can be any point in the space. However, in the project description, it's written that we need to find k-points which may not be in the set. But that is essentially k-means problem. This got me a little confused, and I wanted to clarify what we are solving here.
Modifying the k-means problem to perform k-centers isn't much of a problem. What do you think ? Regards Abhinav On Tue, Mar 1, 2016 at 8:11 PM, Ryan Curtin <[email protected]> wrote: > On Tue, Mar 01, 2016 at 03:08:28PM +0530, Abhinav Agarwalla wrote: > > Hello > > > > I worked really hard on understanding dual tree algorithms, and mlpack's > > implementation of the same last year, but unfortunately we were not > > selected. I still wanted to do the project, but got busy. I am really > glad > > that this time we are indeed selected. > > I have gone through the literature, particularly N-body problems by A. > > Gray, and cover trees one. I am also going through the list, that was > > shared in an earlier post. > > I was hoping if either Bill or you could share the idea, and the work > that > > needs to be done. > > > > Considering that organisation slots are fixed and numerous projects, what > > is your priority for this particular project ? > > I also couldn't find any issues surrounding this. Is there any > > implementation that I can start with ? > > Hi Abhinav, > > I remember our discussions from last year; probably the best > documentation on this project at the moment are these emails, which you > may have already seen: > > https://mailman.cc.gatech.edu/pipermail/mlpack/2015-February/000610.html > https://mailman.cc.gatech.edu/pipermail/mlpack/2014-March/000330.html > > A good thing to do, if you are familiar with dual-tree algorithms now, > would be to study the k-centers problem and think about possibilities > for an appropriate BaseCase() and Score() function. This is a research > project, so there is not yet a known dual-tree algorithm to solve the > k-centers problem; there will be no reference implementation to consult. > > Another idea might be to go read about the Gonzalez algorithm for > k-centers and then implement it to see how it works. > > About the priority of the projects: all of the projects on the ideas > list are important---otherwise they wouldn't be there. :) It's not > possible to say at this point which projects will have slots allocated > to them and which will not; that depends on the applicants, the mentors, > and how many slots Google gives us this year. > > I hope this is helpful. Please let me know if I can clarify anything. > > Thanks! > > Ryan > > -- > Ryan Curtin | "Open the pig!" > [email protected] | - Frank Moses >
_______________________________________________ mlpack mailing list [email protected] https://mailman.cc.gatech.edu/mailman/listinfo/mlpack
