Hi everyone, I thought I would start a discussion about this on the mailing
list: I would like to get online, or incremental, Principal Component
Analysis (IPCA) into scikit-learn. Incremental methods are advantageous
because they can update a previously learned model as new data arrives,
and because they can handle data sets too large to process in one batch.

I've been working with IPCA for my master's, and I formatted some of the
code for submission to scikit-learn, but there needs to be a discussion about
which exact algorithm is best for inclusion in scikit-learn.

In this pull request, https://github.com/scikit-learn/scikit-learn/pull/1885,
I submitted the algorithm from "Incremental Eigenanalysis for
Classification", P. Hall, D. Marshall and R. Martin, 1998, and the algorithm
from "Candid Covariance-Free Incremental Principal Component Analysis"
(CCIPCA), J. Weng, Y. Zhang and W. Hwang, 2003. I chose these two because
they were the ones referenced in the computer vision papers I'm working
with, although admittedly this may not have been the best way to go about it.
Some other papers were mentioned in the comments on the pull request, which
I've only been able to skim. It looks to me as though the algorithm by
Hall et al. is more or less the baseline IPCA method, although Hall
et al. may not be the original citation. I've read at least a few other
papers that use the same update technique as Hall. CCIPCA, however, is
perhaps not a good candidate for inclusion, as there are probably newer and
better techniques.
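
For readers who have not seen CCIPCA, here is a minimal sketch of its
per-sample update for the first principal component only, as I understand
it from Weng et al. This is not the code in the pull request. It assumes
the samples are already centred, leaves out the paper's amnesic weighting
and the deflation step used for further components, and the function name
is made up for illustration.

```python
import math


def ccipca_first_component(samples):
    """Estimate the leading principal direction one sample at a time.

    Simplified sketch: assumes zero-mean samples, no amnesic parameter,
    and only the first component. No covariance matrix is ever formed.
    """
    v = None
    n = 0
    for u in samples:
        n += 1
        norm = 0.0 if v is None else math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            # No usable estimate yet: start from the current sample.
            v = [float(x) for x in u]
            continue
        # Projection of the new sample onto the current unit direction.
        proj = sum(a * b for a, b in zip(u, v)) / norm
        # Running average of u * (u . v / ||v||), weighted by sample count.
        v = [((n - 1) / n) * vi + (1.0 / n) * proj * ui
             for vi, ui in zip(v, u)]
    if v is None:
        raise ValueError("need at least one sample")
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("all samples were zero")
    return [x / norm for x in v]
```

The point of the sketch is that each sample is seen once and then
discarded, which is what makes the method usable on streams or on data
that does not fit in memory.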

Which incremental PCA technique(s) do people think should be in
scikit-learn?

Cheers
Kevin
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
