Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-31 Thread Artem Barger
On Tue, May 31, 2016 at 4:45 PM, Gilles wrote:
> Sorry, hit wrong key...
>
> On Tue, 31 May 2016 15:41:21 +0200, Gilles wrote:
>> On Tue, 31 May 2016 15:28:54 +0300, Artem Barger wrote:
>>> Hi,
>>>
>>> Just finished the update of the sources to address all

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-31 Thread Gilles
Sorry, hit wrong key...
On Tue, 31 May 2016 15:41:21 +0200, Gilles wrote:
> On Tue, 31 May 2016 15:28:54 +0300, Artem Barger wrote:
>> Hi,
>>
>> Just finished the update of the sources to address all comments from MATH-1371; attached updated sources.
>>
>> BTW, tried to commit feature branch directly to

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-31 Thread Gilles
On Tue, 31 May 2016 15:28:54 +0300, Artem Barger wrote:
> Hi,
>
> Just finished the update of the sources to address all comments from MATH-1371; attached updated sources.
>
> BTW, tried to commit feature branch directly to the remote, looks like I need a user or write access in order to be able

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-31 Thread Artem Barger
Hi,
Just finished the update of the sources to address all comments from MATH-1371; attached updated sources.
BTW, I tried to commit the feature branch directly to the remote; it looks like I need a user or write access in order to be able to do it.
Best regards,
Artem Barger.

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Artem Barger
On Tue, May 31, 2016 at 3:31 AM, Gilles wrote:
> On Tue, 31 May 2016 03:10:20 +0300, Artem Barger wrote:
> Yes, you can parallelize it, though it will cancel several optimizations I've added. In fact you can partition the input

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Gilles
On Tue, 31 May 2016 03:10:20 +0300, Artem Barger wrote:
> Yes, you can parallelize it, though it will cancel several optimizations I've added. In fact, you can partition the input according to the number of threads you'd like to use and have each thread take care of the relevant data chunk. I
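The partitioning idea quoted above (split the input according to the number of threads and let each thread handle its own chunk) could look roughly like the following Java sketch. It is only an illustration under stated assumptions, not the MATH-1371 patch and not a Commons Math API: the class `ChunkedAssignment` and its methods `nearest` and `assign` are hypothetical names, points are plain `double[]` arrays, and only the nearest-center assignment step of k-means is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; not part of Commons Math.
final class ChunkedAssignment {

    /** Returns the index of the center closest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = 0;
            for (int i = 0; i < p.length; i++) {
                double diff = p[i] - centers[c][i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** Splits the points into one contiguous chunk per thread and assigns each chunk in parallel. */
    static int[] assign(double[][] points, double[][] centers, int threads) {
        int[] assignment = new int[points.length];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (points.length + threads - 1) / threads; // ceiling division
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(points.length, t * chunk);
                final int to = Math.min(points.length, from + chunk);
                // Each task writes only to its own index range, so no locking is needed.
                pending.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        assignment[i] = nearest(points[i], centers);
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the chunk and make its writes visible
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("chunk assignment failed", e);
        } finally {
            pool.shutdown();
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0.1, 0.2}, {9, 9}, {10, 10}};
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(Arrays.toString(assign(points, centers, 2)));
    }
}
```

Because every chunk is processed independently, any optimization that carries state from one point to the next across the whole data set would have to be kept per chunk, which may be the kind of trade-off the message refers to.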

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Artem Barger
Yes, you can parallelize it, though it will cancel several optimizations I've added.
In fact you can partition the input according to number of threads you'd like to use and make each thread to take care of relevant data chunk.
I guess it will increase performance, not

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Gilles
On Tue, 31 May 2016 02:42:03 +0300, Artem Barger wrote:
> On Tue, May 31, 2016 at 2:20 AM, Gilles wrote:
>> On Tue, 31 May 2016 01:28:48 +0300, Artem Barger wrote:
>>> Hi,
>>>
>>> I've used the current out-of-the-box KMeansPlusPlusClusterer implementation provided by CM, however

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Artem Barger
On Tue, May 31, 2016 at 2:20 AM, Gilles wrote:
> On Tue, 31 May 2016 01:28:48 +0300, Artem Barger wrote:
>> Hi,
>>
>> I've used the current out-of-the-box KMeansPlusPlusClusterer implementation provided by CM, but saw that it doesn't scale well on large data

Re: [Math] MATH-1371: Provide accelerated kmeans++ implementation

2016-05-30 Thread Gilles
On Tue, 31 May 2016 01:28:48 +0300, Artem Barger wrote:
> Hi,
>
> I've used the current out-of-the-box KMeansPlusPlusClusterer implementation provided by CM, but saw that it doesn't scale well on large data volumes. One of the proposals to improve the current implementation was submitted in JIRA-1330