Re: [Scikit-learn-general] parallel GMM

2014-07-02 Thread Sturla Molden
On 03/07/14 04:33, Sturla Molden wrote:
> I would also like to say (while we're at it) that parallelizing this
> outside BLAS and LAPACK, whether with threads or processes, will require
> a memory overhead roughly equal to the size of the data array per thread
> or process. That is because computa
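
For a sense of scale on the memory point above, here is a back-of-the-envelope sketch using the matrix size from the original question in this thread (390,000 x 400). It assumes dense float64 storage (8 bytes per value) and takes "roughly equal to the size of the data array per thread or process" literally. The worker counts and the function names are hypothetical, chosen only for illustration.

```python
def data_array_gb(n_samples, n_features, itemsize=8):
    """Size of a dense array in GB (10**9 bytes). itemsize=8 assumes float64."""
    return n_samples * n_features * itemsize / 1e9


def parallel_overhead_gb(n_samples, n_features, n_workers, itemsize=8):
    """Rough extra memory if every worker needs scratch space as large as the data."""
    return n_workers * data_array_gb(n_samples, n_features, itemsize)


# Matrix size taken from the original question; worker counts are made up.
for n_workers in (1, 2, 4, 8):
    extra = parallel_overhead_gb(390_000, 400, n_workers)
    print(f"{n_workers} worker(s): about {extra:.2f} GB of extra memory")
```

Under these assumptions the data array alone is about 1.25 GB, so the estimated overhead grows by roughly that much for each additional thread or process.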

Re: [Scikit-learn-general] parallel GMM

2014-07-02 Thread Sturla Molden
On 03/07/14 04:01, Sturla Molden wrote:
> On 02/07/14 06:02, Valerio Maggio wrote:
>
>> You were right when you said that under the hood the main `for` loop
>> iterates over the number of components, but in scikit this is
>> not done *explicitly* via Python loops.
>
> Look at lines 596 and 692.
>

Re: [Scikit-learn-general] parallel GMM

2014-07-02 Thread Sturla Molden
On 02/07/14 06:02, Valerio Maggio wrote:
> You were right when you said that under the hood the main `for` loop iterates
> over the number of components, but in scikit this is
> not done *explicitly* via Python loops.

Look at lines 596 and 692.
https://github.com/scikit-learn/scikit-learn/blob/

Re: [Scikit-learn-general] parallel GMM

2014-07-01 Thread Valerio Maggio
Hi Sturla and Yuan.

Yesterday I looked into this and I would like to share with you my two cents.

Yuan Luo wrote:
> Hi,
> Does anyone know how I can make GMM parallel the fitting of some moderately
> big matrix (say, 390,000 x 400) with 200 components?

Actually, with scikit you can't do this ou

Re: [Scikit-learn-general] parallel GMM

2014-07-01 Thread Sturla Molden
Yuan Luo wrote:
> Hi,
> Does anyone know how I can make GMM parallel the fitting of some moderately
> big matrix (say, 390,000 x 400) with 200 components?

I am not sure about GMM code in scikit-learn, but the EM-algorithm for GMMs is very easy to vectorize. There are several ways to do this:

1.
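
The list of approaches is cut off in this archive view, so the following is not one of the options Sturla enumerates. It is only a minimal sketch of what "vectorizing" the E-step can look like in NumPy, assuming diagonal covariances. The function name `e_step_diag` and the toy data are hypothetical. The squared distances to all components are expanded into matrix products, so there is no Python loop over components and the heavy lifting goes to whatever BLAS NumPy is linked against.

```python
import numpy as np


def e_step_diag(X, weights, means, variances):
    """Responsibilities of a diagonal-covariance GMM, no Python loop over components.

    X: (N, D) data, weights: (K,), means: (K, D), variances: (K, D).
    Returns an (N, K) array whose rows sum to 1.
    """
    n_features = X.shape[1]
    inv_var = 1.0 / variances                                  # (K, D)
    # sum_d (x_d - mu_kd)^2 / var_kd, expanded into matrix products
    maha = ((X ** 2) @ inv_var.T
            - 2.0 * X @ (means * inv_var).T
            + np.sum(means ** 2 * inv_var, axis=1))            # (N, K)
    log_det = np.sum(np.log(variances), axis=1)                # (K,)
    log_prob = -0.5 * (n_features * np.log(2.0 * np.pi) + log_det + maha)
    weighted = log_prob + np.log(weights)                      # (N, K)
    # Normalize in log space (log-sum-exp) for numerical stability
    top = weighted.max(axis=1, keepdims=True)
    log_norm = top + np.log(np.exp(weighted - top).sum(axis=1, keepdims=True))
    return np.exp(weighted - log_norm)


# Toy data, far smaller than the 390,000 x 400 matrix in the question
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
n_components = 3
weights = np.full(n_components, 1.0 / n_components)
means = rng.normal(size=(n_components, 5))
variances = rng.uniform(0.5, 2.0, size=(n_components, 5))

resp = e_step_diag(X, weights, means, variances)
print(resp.shape, np.allclose(resp.sum(axis=1), 1.0))
```

Note that `X ** 2` allocates a temporary array the same size as the data, which is the kind of memory cost a vectorized formulation trades for speed.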

[Scikit-learn-general] parallel GMM

2014-07-01 Thread Yuan Luo
Hi,
Does anyone know how I can make GMM parallel the fitting of some moderately big matrix (say, 390,000 x 400) with 200 components?

Best,
Yuan