Sorry, I was wrong. MiniBatchKMeans converges after 20 minutes.
So for one iteration of the CV, I get something like this:
Classification performed
[[21  2  0]
 [ 0 20  0]
 [ 0  0 23]]
It took 1253.23589396 seconds.
It is probably not desirable to run a cross-validation at that cost. I don't
know whether you consider 20 minutes reasonable?
On 8 March 2016 at 22:09, Andreas Mueller <t3k...@gmail.com> wrote:
> Hey Guillaume.
> If it is a couple of hours, I'm not sure it is worth adding.
> You can probably subsample aggressively or just do fewer iterations (for
> example, one pass over the data).
> How do you run MiniBatchKMeans?
>
> Cheers,
> Andy
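
The suggestion above (subsample aggressively, or make a single pass over the data) could look roughly like the sketch below. It is not the code from the pull request: the random `patches` array, the 10% sampling rate, the 50 clusters and the batch size of 500 are all assumptions chosen for illustration. Each mini-batch is fed to `MiniBatchKMeans.partial_fit` exactly once, so the codebook is learned in one pass over the subsample.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)

# Stand-in for the real patch descriptors (hypothetical shape: 20000 x 32).
patches = rng.rand(20000, 32)

# Aggressive subsampling: keep a random 10% of the patches.
subset_idx = rng.choice(len(patches), size=len(patches) // 10, replace=False)
subset = patches[subset_idx]

# Single pass: every mini-batch is seen exactly once through partial_fit.
n_clusters = 50   # assumed codebook size
batch_size = 500  # assumed mini-batch size (must be >= n_clusters for the first batch)
kmeans = MiniBatchKMeans(n_clusters=n_clusters, random_state=0)
for start in range(0, len(subset), batch_size):
    kmeans.partial_fit(subset[start:start + batch_size])

codebook = kmeans.cluster_centers_
```

Whether a codebook learned this way is good enough for the classification step would have to be checked on the real data.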
>
>
> On 03/08/2016 03:21 PM, Guillaume Lemaître wrote:
>
> Hi,
>
> I made a pull-request with the draft:
> https://github.com/scikit-learn/scikit-learn/pull/6509
> Extracting the features takes a reasonable amount of time (around 30 s).
> The codebook generation through MiniBatchKMeans is more problematic. I am
> still running it, but it could take a couple of hours.
>
> Let me know what you think about it.
>
> Cheers,
>
> On 24 February 2016 at 00:41, Andy <t3k...@gmail.com> wrote:
>
>> On 02/23/2016 04:32 PM, Guillaume Lemaitre wrote:
>>
>> Since I was working on a cluster, I did not realize it, but loading all
>> the images in memory will be problematic on a laptop or desktop configuration.
>>
>> Or we can learn the PCA projection on a subset and apply the dimensionality
>> reduction right after the patch extraction. However, I am not sure that all
>> the data will fit in memory.
>>
>> We have out-of-core versions of PCA and KMeans.
>>
>> I think the way I'd do it is to go over all images, extract only a couple
>> of patches from each image, and store them.
>> After we have some patches from all images, I'd learn the PCA model.
>> Then we can go over the data again, transforming the patches. If they
>> don't fit into memory after dimensionality reduction, we can
>> use minibatch k-means to do the clustering without loading all the data.
>> Then we need to go over the data one more time to assign the patches to
>> the cluster centers and compute the BoW (which will fit in memory).
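
The three passes described above can be sketched as follows. This is only an illustration under stated assumptions, not the pull-request code: `load_images` is a hypothetical loader that yields synthetic 32x32 images, and the patch size, number of PCA components and codebook size are arbitrary. It uses `PCA` on the stored patch sample, `MiniBatchKMeans.partial_fit` for the clustering pass, and a histogram of cluster assignments per image as the BoW.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.image import extract_patches_2d

PATCH_SIZE = (8, 8)  # assumed patch size
N_COMPONENTS = 10    # assumed PCA dimensionality
N_WORDS = 16         # assumed codebook size


def load_images():
    """Hypothetical loader: yields one image at a time (synthetic data here)."""
    local = np.random.RandomState(42)
    for _ in range(20):
        yield local.rand(32, 32)


def patches_of(image, max_patches=None):
    """Extract patches from one image and flatten each patch to a row vector."""
    p = extract_patches_2d(image, PATCH_SIZE, max_patches=max_patches,
                           random_state=0)
    return p.reshape(len(p), -1)


# Pass 1: keep only a few patches per image, then learn the PCA model on them.
sample = np.vstack([patches_of(img, max_patches=30) for img in load_images()])
pca = PCA(n_components=N_COMPONENTS).fit(sample)

# Pass 2: transform all patches image by image and update k-means incrementally,
# so the full set of patches never has to be in memory at once.
kmeans = MiniBatchKMeans(n_clusters=N_WORDS, random_state=0)
for img in load_images():
    kmeans.partial_fit(pca.transform(patches_of(img)))

# Pass 3: assign each patch to its nearest center and count words per image.
bow = np.array([
    np.bincount(kmeans.predict(pca.transform(patches_of(img))),
                minlength=N_WORDS)
    for img in load_images()
])
```

The resulting `bow` array has one row per image and one column per visual word, which is small enough to keep in memory and pass to a classifier.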
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>
> --
>
>
>
>
> *LEMAÎTRE Guillaume PhD Candidate MSc Erasmus Mundus ViBOT
> (Vision-roBOTic) MSc Business Innovation and Technology Management *
> g.lemaitr...@gmail.com
>
> *ViCOROB - Computer Vision and Robotic Team*
> Universitat de Girona, Campus Montilivi, Edifici P-IV 17071 Girona
> Tel. +34 972 41 98 12 - Fax. +34 972 41 82 59
> http://vicorob.udg.es/
>
> *LE2I - Le Creusot*
> IUT Le Creusot, Laboratoire LE2I, 12 rue de la Fonderie, 71200 Le Creusot
> Tel. +33 3 85 73 10 90 - Fax. +33 3 85 73 10 97
> http://le2i.cnrs.fr
>
> https://sites.google.com/site/glemaitre58/
> Vice - Chairman of A.S.C. Fours UFOLEP
> Chairman of A.S.C. Fours FFC
> Webmaster of http://ascfours.free.fr
>
>
>
>
--
*LEMAÎTRE Guillaume, PhD Candidate, MSc Erasmus Mundus ViBOT
(Vision-roBOTic), MSc Business Innovation and Technology Management*
g.lemaitr...@gmail.com
*ViCOROB - Computer Vision and Robotic Team*
Universitat de Girona, Campus Montilivi, Edifici P-IV 17071 Girona
Tel. +34 972 41 98 12 - Fax. +34 972 41 82 59
http://vicorob.udg.es/
*LE2I - Le Creusot*
IUT Le Creusot, Laboratoire LE2I, 12 rue de la Fonderie, 71200 Le Creusot
Tel. +33 3 85 73 10 90 - Fax. +33 3 85 73 10 97
http://le2i.cnrs.fr
https://sites.google.com/site/glemaitre58/
Vice - Chairman of A.S.C. Fours UFOLEP
Chairman of A.S.C. Fours FFC
Webmaster of http://ascfours.free.fr