Thank you Matthieu and Olivier for your help.
Based on what Olivier said, it sounds like a good approach would be for me
to implement the algorithms in a way that ensures compatibility with
scikit-learn and then submit them for consideration for inclusion once
they are complete.
Matthieu, my original reasoning behind implementing Sammon mapping was that
it seemed to be relatively intuitive to understand to the point where
people using the software could modify it and devise their own metrics to
suit the needs of their research. Having gone back today and given it a
second look, it does seem rather crude as you mentioned, so maybe it would
be best if I hold off on that.
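To make the "intuitive" point concrete, here is a minimal sketch of the Sammon stress, the cost that Sammon mapping minimizes. The function name `sammon_stress` and the `metric` parameter are illustrative assumptions, not code from Fovea or scikit-learn, and only the cost is shown, not the optimizer.

```python
import math
from itertools import combinations


def sammon_stress(X, Y, metric=math.dist):
    """Sammon stress between points X and their low-dimensional embedding Y.

    `metric` (a hypothetical hook) measures distances in the original space
    and can be swapped for any function of two points. Distances in the
    embedding are always Euclidean.
    """
    total = 0.0  # sum of original-space distances, used as the normalizer
    error = 0.0  # squared distance mismatch, weighted by 1 / original distance
    for i, j in combinations(range(len(X)), 2):
        d_orig = metric(X[i], X[j])
        if d_orig == 0.0:
            continue  # coincident points: the term is undefined, so skip it
        d_emb = math.dist(Y[i], Y[j])
        total += d_orig
        error += (d_orig - d_emb) ** 2 / d_orig
    return error / total if total > 0.0 else 0.0


# Three collinear points: an embedding that preserves every pairwise
# distance has zero stress, while a squashed one does not.
X = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
Y_exact = [(0.0,), (5.0,), (10.0,)]
Y_squashed = [(0.0,), (1.0,), (2.0,)]
stress_exact = sammon_stress(X, Y_exact)
stress_squashed = sammon_stress(X, Y_squashed)
```

Dividing each term by the original distance is what makes small distances count more, and passing a different `metric` is the kind of modification I had in mind.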
As for Principal Curve Analysis, I liked that it is well established (I
believe it was one of the earlier manifold learning algorithms introduced)
and that it intuitively generalizes Principal Component Analysis to account
for nonlinearity. Since Principal Component Analysis is widely used and
implemented, Principal Curve Analysis seemed like a natural algorithm to
include for nonlinear cases.
I was looking at some of the other manifold learning algorithms currently
in use, and it appears that Topologically Constrained Isometric Embedding
(TCIE) offers improvements over many of the algorithms currently in
scikit-learn, such as Isomap, LLE, and Laplacian Eigenmaps. In particular,
it seems to be more robust to noisy and non-convex data. This paper offers
a nice comparison between TCIE and the existing algorithms:
http://people.csail.mit.edu/rosman/tcie_ijcv.pdf
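For reference, the existing scikit-learn algorithms named above can be run side by side on a noisy swiss roll roughly as follows. This is only a sketch of the baseline: TCIE itself is not in scikit-learn and is not shown, and the sample count, noise level, and neighbor count are arbitrary assumptions.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

# Noisy 3-D swiss roll; sample count and noise level are arbitrary choices.
X, _ = make_swiss_roll(n_samples=300, noise=0.5, random_state=0)

# SpectralEmbedding is scikit-learn's Laplacian Eigenmaps implementation.
models = {
    "Isomap": Isomap(n_neighbors=10, n_components=2),
    "LLE": LocallyLinearEmbedding(
        n_neighbors=10, n_components=2, random_state=0
    ),
    "Laplacian Eigenmaps": SpectralEmbedding(
        n_neighbors=10, n_components=2, random_state=0
    ),
}

# Each estimator exposes fit_transform, returning one 2-D point per sample.
embeddings = {name: model.fit_transform(X) for name, model in models.items()}
for name, Y in embeddings.items():
    print(name, Y.shape)
```

Raising `noise` gives a rough feel for how each method degrades on noisy data.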
I would certainly be interested in implementing TCIE if there's any
interest in having it included in scikit-learn.
Cheers,
Daniel
On Mon, Apr 25, 2016 at 9:11 AM, Matthieu Brucher <
matthieu.bruc...@gmail.com> wrote:
> Hi Daniel,
>
> I think in the original scikit pull request on my PhD thesis almost 10
> years ago, there may have been some Sammon mapping code. IIRC, the mapping
> is really crude and not robust. I think there are other cost functions for
> dimensionality reduction that are far more efficient and do not have the
> same drawbacks as Sammon mapping.
> I don't remember my position on PCA, I know that I had a look at it but
> never implemented it. What is the purpose of implementing this one in
> particular?
>
> Cheers,
>
> Matthieu
>
> 2016-04-25 7:59 GMT+02:00 Daniel McNeela <daniel.mcne...@gmail.com>:
>
>> Hi All,
>>
>> My name is Daniel McNeela, and I am a student at UC Berkeley
>> participating in Google Summer of Code 2016. I am working on the Fovea
>> project under the umbrella of the International Neuroinformatics
>> Coordinating Facility. The abstract for my project can be found here:
>> https://summerofcode.withgoogle.com/projects/#5940697098092544
>>
>> To be brief, Fovea is a Python tool for visualizing dynamical systems and
>> associated data, and an integral part of the back end for the software
>> involves performing both linear and nonlinear dimensionality reduction on
>> data sets. My project mentor would like to add scikit-learn as a dependency
>> since it already has a number of manifold learning algorithms implemented.
>> However, I am planning on using two additional algorithms that are not
>> currently implemented in scikit-learn, namely Sammon Mapping and Principal
>> Curve Analysis, and I was wondering whether the developer team would be
>> interested in incorporating these two algorithms into scikit-learn's
>> existing Manifold Learning package.
>>
>> Please let me know your thoughts. Information regarding these two
>> algorithms can be found at the following links:
>>
>>
>> http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/AV0910/henderson.pdf
>>
>> https://web.stanford.edu/~hastie/Papers/Principal_Curves.pdf
>>
>> Thanks for your time, and looking forward to hearing from you!
>>
>>
>> - Daniel
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Find and fix application performance issues faster with Applications
>> Manager
>> Applications Manager provides deep performance insights into multiple
>> tiers of
>> your business applications. It resolves application problems quickly and
>> reduces your MTTR. Get your free trial!
>> https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>
> --
> Information System Engineer, Ph.D.
> Blog: http://matt.eifelle.com
> LinkedIn: http://www.linkedin.com/in/matthieubrucher
>
>