GoDec might not have the citations (yet) to be added to scikit-learn.
But I think a basic ALM-based RPCA would be a great addition, along
with a cool demo. Smart background subtraction would be my vote, but
it might be too heavyweight. I could see a cool example of something
like colored bouncing balls overlaid on the china picture that is
built into sklearn.
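
To make "basic ALM based RPCA" concrete, here is a minimal sketch of the
inexact augmented Lagrangian iteration for splitting M into low rank L plus
sparse S. It is not the code from any repo mentioned in this thread. The
function names (rpca_ialm, soft_threshold, singular_value_threshold) are
made up for illustration, the defaults (lam = 1/sqrt(max(m, n)), rho = 1.5,
mu = 1.25/||M||_2) are the commonly used ones and are assumptions here, and
it uses a full numpy SVD rather than arpack, propack or randomized SVD, so
it will be slow on large inputs.

```python
import numpy as np


def soft_threshold(X, tau):
    """Entrywise shrinkage: proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Shrink the singular values of X by tau (prox of tau * nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt


def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500, rho=1.5):
    """Hypothetical helper: split a nonzero matrix M into low rank L + sparse S."""
    M = np.asarray(M, dtype=float)
    if lam is None:
        lam = 1.0 / np.sqrt(max(M.shape))  # assumed default weight on ||S||_1
    norm_fro = np.linalg.norm(M, "fro")
    norm_two = np.linalg.norm(M, 2)
    # Assumed initialisation of the dual variable Y and penalty mu.
    Y = M / max(norm_two, np.abs(M).max() / lam)
    mu = 1.25 / norm_two
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Alternate the two proximal steps, then update the multiplier.
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        mu *= rho
        if np.linalg.norm(residual, "fro") <= tol * norm_fro:
            break
    return L, S


# Small synthetic check: rank-2 matrix plus 5% large sparse corruptions.
rng = np.random.RandomState(0)
n, rank = 100, 2
L0 = rng.randn(n, rank) @ rng.randn(rank, n)
S0 = np.zeros((n, n))
mask = rng.rand(n, n) < 0.05
S0[mask] = rng.uniform(-10, 10, size=mask.sum())
L_hat, S_hat = rpca_ialm(L0 + S0)
print("relative error on L:", np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))
```

For a demo, the same call could be run on a matrix whose columns are
flattened video frames, with L playing the role of the static background and
S the moving objects. Swapping the full SVD for a top-k routine would only
touch singular_value_threshold.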


On Thu, Apr 16, 2015 at 1:18 PM, Alex Papanicolaou
<alex.papa...@gmail.com> wrote:
> How about something like this:
> 1.  Basic implementation of ALM uses arpack (not ideal but it means sklearn
> can have RPCA available)
> 2.  Option to use randomized SVD if desired
> 3.  Option to use propack if desired and it's available (or if/when scipy
> begins to use it)
> 4.  GoDec implementation for low rank + sparse + noise
>
>
>
>
> On Wed, Apr 15, 2015 at 4:06 PM,
> <scikit-learn-general-requ...@lists.sourceforge.net> wrote:
>>
>> Send Scikit-learn-general mailing list submissions to
>>         scikit-learn-general@lists.sourceforge.net
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>> or, via email, send a message with subject or body 'help' to
>>         scikit-learn-general-requ...@lists.sourceforge.net
>>
>> You can reach the person managing the list at
>>         scikit-learn-general-ow...@lists.sourceforge.net
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Scikit-learn-general digest..."
>>
>>
>> Today's Topics:
>>
>>    1. Re: Scikit-learn-general Digest, Vol 63,  Issue 34
>>       (Alex Papanicolaou)
>>    2. Re: Robust PCA (Olivier Grisel)
>>    3. Re: Robust PCA (Kyle Kastner)
>>    4. Re: Robust PCA (Yogesh Karpate)
>>    5. Re: Performance of LSHForest (Joel Nothman)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Wed, 15 Apr 2015 11:22:17 -0700
>> From: Alex Papanicolaou <alex.papa...@gmail.com>
>> Subject: Re: [Scikit-learn-general] Scikit-learn-general Digest, Vol
>>         63,     Issue 34
>> To: scikit-learn-general@lists.sourceforge.net
>> Message-ID:
>>
>> <CAGNPn4qTmTXOgpLX=ziqapuv5b29iecvfrfwpo96rnectww...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Kyle & Andreas,
>>
>> Here is my github repo:
>> https://github.com/apapanico/RPCA
>>
>> Responses:
>> 1. I didn't make the GSoC suggestion a few years ago (I'm also not a student
>> anymore :-(, just using RPCA for work); I just came across it in a Google
>> search when trying to find Python implementations.
>> 2. As for GoDec, I have not poked around with it but I would like to.  I
>> had intended to use this as a starting point:
>> https://sites.google.com/site/godecomposition/home
>> But yeah, it sounds like it can go much bigger.  If I'm not mistaken, though,
>> it's technically a different problem (low rank + sparse + noise).
>> 3. Regarding PROPACK, the main routine needed is lansvd which implements
>> Lanczos bidiagonalization with partial reorthogonalization.  I do not know
>> what else that depends on.  I also do not know if there's an implementation
>> in C, which would be preferred, obviously.  A routine for computing only
>> top-k singular triplets is pretty key for making Candes' ALM method as
>> efficient as possible.  Along these lines, I started out using the
>> randomized SVD from sklearn but I was failing my tests generated with the
>> original Matlab code so I switched to numpy svd and then finally svdp in
>> pypropack.
>>
>> Cheers,
>> Alex
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Wed, 15 Apr 2015 15:40:33 -0400
>> From: Olivier Grisel <olivier.gri...@ensta.org>
>> Subject: Re: [Scikit-learn-general] Robust PCA
>> To: scikit-learn-general <scikit-learn-general@lists.sourceforge.net>
>> Message-ID:
>>
>> <CAFvE7K60pn7-YP7rfreFU932on8omr7=q8-Vxsf0a+=v_nt...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> We could use PyPROPACK if it was contributed upstream in scipy ;)
>>
>> I know that some scipy maintainers don't appreciate arpack much and
>> would like to see it replaced (or at least complemented with propack).
>>
>> --
>> Olivier
>>
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 15 Apr 2015 15:51:01 -0400
>> From: Kyle Kastner <kastnerk...@gmail.com>
>> Subject: Re: [Scikit-learn-general] Robust PCA
>> To: scikit-learn-general@lists.sourceforge.net
>> Message-ID:
>>
>> <CAGNZ19AqUxUV3So_pQ2vn=hDQzMkD4Wgodm6uwTUWAZbomx=_...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> If it were in scipy, would it be backported to the older versions? How
>> would we handle that?
>>
>> On Wed, Apr 15, 2015 at 3:40 PM, Olivier Grisel
>> <olivier.gri...@ensta.org> wrote:
>> > We could use PyPROPACK if it was contributed upstream in scipy ;)
>> >
>> > I know that some scipy maintainers don't appreciate arpack much and
>> > would like to see it replaced (or at least completed with propack).
>> >
>> > --
>> > Olivier
>> >
>> >
>> > ------------------------------------------------------------------------------
>> > BPM Camp - Free Virtual Workshop May 6th at 10am PDT/1PM EDT
>> > Develop your own process in accordance with the BPMN 2 standard
>> > Learn Process modeling best practices with Bonita BPM through live
>> > exercises
>> > http://www.bonitasoft.com/be-part-of-it/events/bpm-camp-virtual-
>> > event?utm_
>> > source=Sourceforge_BPM_Camp_5_6_15&utm_medium=email&utm_campaign=VA_SF
>> > _______________________________________________
>> > Scikit-learn-general mailing list
>> > Scikit-learn-general@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Wed, 15 Apr 2015 22:02:40 +0200
>> From: Yogesh Karpate <yogeshkarp...@gmail.com>
>> Subject: Re: [Scikit-learn-general] Robust PCA
>> To: scikit-learn-general@lists.sourceforge.net
>> Message-ID:
>>
>> <CAG7mFDvXJF9gKF3LBuAk=unzibj5sxpyksiz+iueusdrkg0...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>>
>> A couple of months back, I tried to use the following:
>> https://github.com/shriphani/robust_pcp/blob/master/robust_pcp.py
>> But I could not install pypropack, developed by Jake Vanderplas,
>> so I used randomized_svd from scikit-learn instead of svdp in the code
>> mentioned above.
>> It worked "OK" for me.
>>
>>
>> On Wed, Apr 15, 2015 at 9:51 PM, Kyle Kastner <kastnerk...@gmail.com>
>> wrote:
>>
>> > IF it was in scipy would it be backported to the older versions? How
>> > would we handle that?
>> >
>> > On Wed, Apr 15, 2015 at 3:40 PM, Olivier Grisel
>> > <olivier.gri...@ensta.org> wrote:
>> > > We could use PyPROPACK if it was contributed upstream in scipy ;)
>> > >
>> > > I know that some scipy maintainers don't appreciate arpack much and
>> > > would like to see it replaced (or at least completed with propack).
>> > >
>> > > --
>> > > Olivier
>> > >
>>
>>
>>
>> --
>>     Warm Regards
>>     Yogesh Karpate
>>
>> ------------------------------
>>
>> Message: 5
>> Date: Thu, 16 Apr 2015 09:06:51 +1000
>> From: Joel Nothman <joel.noth...@gmail.com>
>> Subject: Re: [Scikit-learn-general] Performance of LSHForest
>> To: scikit-learn-general <scikit-learn-general@lists.sourceforge.net>
>> Message-ID:
>>
>> <caakaflvyw6ol2ebm0dsh6f3o-mdb80kbnmeurnt+5seftz7...@mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> I agree this is disappointing, and we need to work on making LSHForest
>> faster. Portions should probably be coded in Cython, for instance, as the
>> current implementation is a bit circuitous in order to work in numpy. PRs
>> are welcome.
>>
>> LSHForest could use parallelism to be faster, but so can (and will) the
>> exact neighbors methods. In theory in LSHForest, each "tree" could be
>> stored on entirely different machines, providing memory benefits, but
>> scikit-learn can't really take advantage of this.
>>
>> Having said that, I would also try with higher n_features and n_queries.
>> We have to limit the scale of our examples in order to limit the overall
>> document compilation time.
>>
>> On 16 April 2015 at 01:12, Miroslav Batchkarov <mbatchka...@gmail.com>
>> wrote:
>>
>> > Hi everyone,
>> >
>> > I was really impressed by the speedups provided by LSHForest compared to
>> > brute-force search. Out of curiosity, I compared LSHForest to the existing
>> > ball tree implementation. The approximate algorithm is consistently slower
>> > (see below). Is this normal and should it be mentioned in the
>> > documentation? Does approximate search offer any benefits in terms of
>> > memory usage?
>> >
>> >
>> > I ran the same example
>> >
>> > <http://scikit-learn.org/stable/auto_examples/neighbors/plot_approximate_nearest_neighbors_scalability.html#example-neighbors-plot-approximate-nearest-neighbors-scalability-py>
>> > with algorithm='ball_tree'. I also had to set metric='euclidean' (this may
>> > affect results). The output is:
>> >
>> > Index size: 1000, exact: 0.000s, LSHF: 0.007s, speedup: 0.0, accuracy: 1.00 +/-0.00
>> > Index size: 2511, exact: 0.001s, LSHF: 0.007s, speedup: 0.1, accuracy: 0.94 +/-0.05
>> > Index size: 6309, exact: 0.001s, LSHF: 0.008s, speedup: 0.2, accuracy: 0.92 +/-0.07
>> > Index size: 15848, exact: 0.002s, LSHF: 0.008s, speedup: 0.3, accuracy: 0.92 +/-0.07
>> > Index size: 39810, exact: 0.005s, LSHF: 0.010s, speedup: 0.5, accuracy: 0.84 +/-0.10
>> > Index size: 100000, exact: 0.008s, LSHF: 0.016s, speedup: 0.5, accuracy: 0.80 +/-0.06
>> >
>> > With n_candidates=100, the output is
>> >
>> > Index size: 1000, exact: 0.000s, LSHF: 0.006s, speedup: 0.0, accuracy: 1.00 +/-0.00
>> > Index size: 2511, exact: 0.001s, LSHF: 0.006s, speedup: 0.1, accuracy: 0.94 +/-0.05
>> > Index size: 6309, exact: 0.001s, LSHF: 0.005s, speedup: 0.2, accuracy: 0.92 +/-0.07
>> > Index size: 15848, exact: 0.002s, LSHF: 0.007s, speedup: 0.4, accuracy: 0.90 +/-0.11
>> > Index size: 39810, exact: 0.005s, LSHF: 0.008s, speedup: 0.7, accuracy: 0.82 +/-0.13
>> > Index size: 100000, exact: 0.007s, LSHF: 0.013s, speedup: 0.6, accuracy: 0.78 +/-0.04
>> >
>> >
>> >
>> > ---
>> > Miroslav Batchkarov
>> > PhD Student,
>> > Text Analysis Group,
>> > Department of Informatics,
>> > University of Sussex
>> >
>>
>> ------------------------------
>>
>> End of Scikit-learn-general Digest, Vol 63, Issue 35
>> ****************************************************
>
