Hey Taylor.
Currently I'd say +1, and add it to the neural networks folder as a reference implementation. Not sure what the others think.
For the visualization, there are some examples of 2d visualization where I guess the SOM would be a natural comparison:
http://scikit-learn.org/dev/auto_examples/index.html#manifold-learning

In particular the sphere example is pretty close to the one in Elements of Statistical Learning:
http://scikit-learn.org/dev/auto_examples/manifold/plot_manifold_sphere.html

I also quite like this one:
http://scikit-learn.org/dev/auto_examples/manifold/plot_lle_digits.html

Isn't the toroidal topology a bit detrimental for visualization? I guess we will only find out by comparing.

Please be aware that we have pretty high coding and documentation standards, and would like to have a paragraph for the narrative documentation.
If you have any example where the SOM gave some interesting insight into the dataset (apart from 2d visualization of the grid), preferably with data included in sklearn or synthetic data, that would also be nice.

Cheers,
Andy

On 10/21/2013 11:44 PM, Taylor Sather wrote:
> Hey Andy,
>
> Haha, I really appreciate the quick and candid response! In all honesty, I
> liked them less and less as I applied them to problems in the wild :)
>
> I was intrigued by the idea of extremely large grids (> 2k neurons) and the
> "emergent" behavior that the SOM would exhibit under these circumstances.
> Unfortunately, figuring out if and when it converges and choosing the
> clusters in the resulting grid isn't trivial.
>
> My implementation was mostly vanilla as you say. I haven't tried the batch
> version because it seemed to throw out a lot of the cool emergent behavior.
>
> Currently it has the option of cosine or euclidean distance metrics. The
> grid is 2d and toroidal so the edges loop around.
>
> I'm not sure at this point about performance relative to the methods in
> sklearn (need to test this), but the visualization aspects are appealing.
> The U-matrix and component planes are a neat way to help investigate data.
>
> Thanks,
> Taylor
>
> PS - I'm not too excited about them either at this point, but I figured since
> I already wrote a bunch of code... what the heck :)
>
>
>
> On Oct 21, 2013, at 11:07 PM, Andreas Mueller <amuel...@ais.uni-bonn.de>
> wrote:
>
>> Hi Taylor.
>> Thanks for wanting to contribute.
>> I am a bit ambivalent wrt adding SOMs.
>>
>> I have not seen or heard of an application where SOMs work better than
>> any of the clustering or manifold-learning algorithms in sklearn.
>> On the other hand, it is a classical algorithm and having a reference
>> implementation would be kind of nice. Also they are in ESL.
>> Do you have any application where they work better than other algorithms
>> in sklearn?
>>
>> There was a PR of a SOM somewhere. It looks like it got closed because
>> it was inactive for too long (~2 years?).
>>
>> What variant did you implement? I think we should stay pretty vanilla
>> and I guess a 2d grid for the neurons would suffice,
>> as this is by far the most common variation.
>>
>> Cheers,
>> Andy
>>
>> PS: I really really dislike SOMs. Don't let that discourage you ;)
>>
>>
>> On 10/21/2013 10:41 PM, Taylor Sather wrote:
>>> Hello,
>>>
>>> I was wondering if there was any effort to implement a Kohonen map in
>>> scikit-learn? I'm thinking of getting my implementation up to snuff for a
>>> pull request, but I wanted to ask the mailing list before I invested too
>>> much effort.
>>>
>>>
>>> Thanks,
>>> Taylor Sather
>>> ------------------------------------------------------------------------------
>>> October Webinars: Code for Performance
>>> Free Intel webinars can help you accelerate application performance.
>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
>>> the latest Intel processors and coprocessors.
>>> See abstracts and register >
>>> http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general