Regarding MiniBatchKMeans, I use the following parameters:
MiniBatchKMeans(n_clusters=nb_words, verbose=1, init='random',
batch_size=10 * nb_words, compute_labels=False, reassignment_ratio=0.0,
random_state=1, n_init=3)
with 1000 words. I am not sure about the batch size or the
initialisation. Should 'k-means++' improve convergence with the
mini-batch version?
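
One way to check the initialisation question empirically is to fit the same configuration twice and change only `init`. The sketch below is a minimal illustration, not a benchmark: it assumes synthetic blobs instead of the real patch descriptors, uses a much smaller `nb_words` so it runs in seconds, and `fit_codebook` is a hypothetical helper name.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs

nb_words = 20  # the thread uses 1000; kept small so the sketch runs quickly

# Synthetic stand-in for the patch descriptors (assumption: not the real data).
X, _ = make_blobs(n_samples=4000, n_features=10, centers=nb_words,
                  random_state=0)


def fit_codebook(init):
    """Fit MiniBatchKMeans with the quoted settings, varying only `init`."""
    km = MiniBatchKMeans(
        n_clusters=nb_words,
        init=init,
        batch_size=10 * nb_words,
        compute_labels=False,
        reassignment_ratio=0.0,
        random_state=1,
        n_init=3,
    )
    km.fit(X)
    # compute_labels=False skips the final full-data assignment, so evaluate
    # the objective explicitly: score() returns the negative inertia on X.
    return km, -km.score(X)


results = {init: fit_codebook(init) for init in ("random", "k-means++")}
for init, (km, inertia) in results.items():
    print(f"init={init!r}: inertia={inertia:.1f}")
```

A lower inertia for one of the two runs only says something about this toy data; the comparison would need to be repeated on the real descriptors and with several seeds before drawing a conclusion.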
On 8 March 2016 at 23:23, Guillaume Lemaître <g.lemaitr...@gmail.com> wrote:
> Sorry, I was wrong. MiniBatchKMeans converges after 20 minutes.
> So for one iteration of the CV, I get something like this:
>
> Classification performed
> [[21 2 0]
> [ 0 20 0]
> [ 0 0 23]]
> It took 1253.23589396 seconds.
>
> It is probably not desirable to run a cross-validation at this cost.
> Would you consider 20 minutes reasonable?
>
> On 8 March 2016 at 22:09, Andreas Mueller <t3k...@gmail.com> wrote:
>
>> Hey Guillaume.
>> If it is a couple of hours, I'm not sure it is worth adding.
>> You can probably aggressively subsample or just do fewer iterations
>> (like, one pass over the data)
>> How do you run MiniBatchKMeans?
>>
>> Cheers,
>> Andy
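
The two suggestions above (aggressive subsampling and a single pass over the data) can be combined with `partial_fit`. This is only a sketch under stated assumptions: the data is random, and the 25% subsampling rate, `nb_words = 50`, and the feature dimension are illustrative values that do not come from the thread.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
# Synthetic stand-in for the descriptors (assumption: sizes are illustrative).
X = rng.rand(20000, 16)

nb_words = 50
batch_size = 10 * nb_words

# Aggressive subsampling: keep a random 25% of the rows.
keep = rng.choice(X.shape[0], size=X.shape[0] // 4, replace=False)
X_sub = X[keep]

km = MiniBatchKMeans(n_clusters=nb_words, init="random", n_init=1,
                     batch_size=batch_size, compute_labels=False,
                     random_state=1)

# One pass: every row of the subsample is seen exactly once.
n_batches = 0
for start in range(0, X_sub.shape[0], batch_size):
    km.partial_fit(X_sub[start:start + batch_size])
    n_batches += 1

print(n_batches, km.cluster_centers_.shape)
```

Driving the loop by hand makes the cost explicit: the number of `partial_fit` calls is the subsample size divided by the batch size, independent of any convergence criterion.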
>>
>>
>> On 03/08/2016 03:21 PM, Guillaume Lemaître wrote:
>>
>> Hi,
>>
>> I made a pull-request with the draft:
>> https://github.com/scikit-learn/scikit-learn/pull/6509
>> Extracting the features takes a reasonable amount of time (around 30 sec.).
>> The codebook generation through MiniBatchKMeans is more problematic. I am
>> still running it, but it could take a couple of hours.
>>
>> Let me know what you think about it.
>>
>> Cheers,
>>
>> On 24 February 2016 at 00:41, Andy <t3k...@gmail.com> wrote:
>>
>>> On 02/23/2016 04:32 PM, Guillaume Lemaitre wrote:
>>>
>>> Since I was working on a cluster, I did not realize it, but loading all
>>> the images in memory will be problematic on a laptop or desktop
>>> configuration.
>>>
>>> Alternatively, we can learn the PCA projection on a subset and apply the
>>> dimension reduction right after the patch extraction. However, I am not
>>> sure that all the data will fit in memory.
>>>
>>> We have out-of-core versions of PCA and KMeans.
>>>
>>> I think the way I'd do it is to go over all images, extract only a
>>> couple of patches from each image, and store them.
>>> After we have some patches from all images, I'd learn the PCA model.
>>> Then we can go over the data again, transforming the patches. If they
>>> don't fit into memory after dimensionality reduction, we can
>>> use minibatch k-means to do the clustering without loading all the data.
>>> Then we need to go over the data one more time to get the cluster
>>> centers and compute the BoW (which will fit in memory).
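
The three passes described above could look roughly like the following. This is a hedged sketch, not the code from the pull request: `load_images` and `patches_of` are hypothetical helpers, the images are random arrays, and the patch size, number of components, and number of words are arbitrary small values chosen so the example runs quickly.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.RandomState(0)
patch_size = (8, 8)
nb_words = 10
n_components = 5


def load_images():
    """Hypothetical loader: yields one image at a time (random arrays here)."""
    img_rng = np.random.RandomState(42)
    for _ in range(12):
        yield img_rng.rand(32, 32)


def patches_of(image, max_patches=None, random_state=None):
    """Extract patches and flatten each one to a row vector."""
    p = extract_patches_2d(image, patch_size, max_patches=max_patches,
                           random_state=random_state)
    return p.reshape(len(p), -1)


# Pass 1: keep only a few patches per image and learn PCA on them.
sample = np.vstack([patches_of(img, max_patches=50, random_state=rng)
                    for img in load_images()])
pca = PCA(n_components=n_components, random_state=0).fit(sample)

# Pass 2: transform each image's patches and update the codebook per image,
# so the full patch set is never held in memory at once.
km = MiniBatchKMeans(n_clusters=nb_words, n_init=1, random_state=1)
for img in load_images():
    km.partial_fit(pca.transform(patches_of(img)))

# Pass 3: assign patches to the nearest centre, one histogram per image.
bow = np.array([
    np.bincount(km.predict(pca.transform(patches_of(img))),
                minlength=nb_words)
    for img in load_images()
])
print(bow.shape)
```

Only the sampled patches, the PCA model, the codebook, and the final histograms are kept in memory; the per-image patches are recomputed in each pass, which trades extra extraction time for a bounded memory footprint.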
>>>
>>>
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>>
>>
>>
>> --
>>
>>
>>
>>
>> *LEMAÎTRE Guillaume PhD Candidate MSc Erasmus Mundus ViBOT
>> (Vision-roBOTic) MSc Business Innovation and Technology Management *
>> g.lemaitr...@gmail.com
>>
>> *ViCOROB - Computer Vision and Robotic Team*
>> Universitat de Girona, Campus Montilivi, Edifici P-IV 17071 Girona
>> Tel. +34 972 41 98 12 - Fax. +34 972 41 82 59
>> http://vicorob.udg.es/
>>
>> *LE2I - Le Creusot *IUT Le Creusot, Laboratoire LE2I, 12 rue de la
>> Fonderie, 71200 Le Creusot
>> Tel. +33 3 85 73 10 90 - Fax. +33 3 85 73 10 97
>> http://le2i.cnrs.fr
>>
>> https://sites.google.com/site/glemaitre58/
>> Vice - Chairman of A.S.C. Fours UFOLEP
>> Chairman of A.S.C. Fours FFC
>> Webmaster of http://ascfours.free.fr
>>
>>