[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-26 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=410121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410121
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 26/Mar/20 07:34
Start Date: 26/Mar/20 07:34
Worklog Time Spent: 10m 
  Work Description: chentao106 commented on pull request #129: #MATH-1509: 
Add missing documentation to class MiniBatchKMeansCluster…
URL: https://github.com/apache/commons-math/pull/129
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 410121)
Time Spent: 1h 40m  (was: 1.5h)

> Implement the MiniBatchKMeansClusterer
> --
>
> Key: MATH-1509
> URL: https://issues.apache.org/jira/browse/MATH-1509
> Project: Commons Math
> Issue Type: New Feature
> Reporter: Chen Tao
> Priority: Major
> Attachments: compare.png, intensive-data-comparsion-badcase.png, 
> intensive-data-comparsion.png, random-data-comparison.png
>
> Time Spent: 1h 40m
> Remaining Estimate: 0h
>
> MiniBatchKMeans is a fast clustering algorithm 
> that uses a subset of the points to initialize the cluster centers and 
> mini-batches in the training iterations.
> It can cluster millions of data points in a few seconds, and its results 
> differ little from those of KMeans.
> I have implemented it in Kotlin in my own project, and I'd like to contribute 
> the code to Apache Commons Math, in Java of course.
> My implementation is based on Apache Commons Math3 and follows Python's 
> sklearn.cluster.MiniBatchKMeans.
> Through testing I found that it works well on dense data: performance improves 
> significantly and the results differ little from KMeans++, but they differ 
> considerably on sparse data.
>  
> Below is a comparison of my implementation and KMeansPlusPlusClusterer:
>   !compare.png!
>  
> I have created a pull request at 
> [https://github.com/apache/commons-math/pull/117], for reference only.
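
The idea described in the issue (initialize the centers from a subset of the points, then refine them on small random batches) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the code from the pull request: the class name `MiniBatchKMeansSketch`, the methods `fit` and `nearestCenter`, and all parameters are hypothetical, and the per-center step size of 1 / count is the update commonly described for mini-batch k-means, assumed here rather than taken from the contributed implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative mini-batch k-means sketch; NOT the Commons Math implementation. */
final class MiniBatchKMeansSketch {

    /** Returns k centers fitted on random mini-batches of the given points (hypothetical API). */
    static double[][] fit(double[][] points, int k, int batchSize, int iterations, long seed) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("need 1 <= k <= number of points");
        }
        Random rng = new Random(seed);
        // Initialise from a random subset: k distinct points become the starting centers.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[indices.get(c)].clone();
        }
        long[] counts = new long[k]; // number of points absorbed by each center so far

        for (int it = 0; it < iterations; it++) {
            // Draw a mini-batch (with replacement) and cache the assignments first.
            double[][] batch = new double[batchSize][];
            int[] nearest = new int[batchSize];
            for (int b = 0; b < batchSize; b++) {
                batch[b] = points[rng.nextInt(points.length)];
                nearest[b] = nearestCenter(centers, batch[b]);
            }
            // Move each center toward its batch points with step size 1 / count,
            // which keeps the center equal to the running mean of the points assigned to it.
            for (int b = 0; b < batchSize; b++) {
                int c = nearest[b];
                counts[c]++;
                double eta = 1.0 / counts[c];
                for (int d = 0; d < centers[c].length; d++) {
                    centers[c][d] += eta * (batch[b][d] - centers[c][d]);
                }
            }
        }
        return centers;
    }

    /** Index of the center with the smallest squared Euclidean distance to p. */
    static int nearestCenter(double[][] centers, double[] p) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centers[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Tiny one-dimensional demo with two obvious groups.
        double[][] data = {{0.0}, {0.2}, {0.4}, {9.6}, {9.8}, {10.0}};
        System.out.println(Arrays.deepToString(fit(data, 2, 4, 50, 1L)));
    }
}
```

Each iteration touches only `batchSize` points instead of the whole data set, which is where the speed-up over a full KMeans pass would come from; the trade-off is that the centers are noisier estimates.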



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-26 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=410120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410120
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 26/Mar/20 07:34
Start Date: 26/Mar/20 07:34
Worklog Time Spent: 10m 
  Work Description: chentao106 commented on issue #129: #MATH-1509: Add 
missing documentation to class MiniBatchKMeansCluster…
URL: https://github.com/apache/commons-math/pull/129#issuecomment-604275377
 
 
   Replaced by another PR.
 


Issue Time Tracking
---

Worklog Id: (was: 410120)
Time Spent: 1.5h  (was: 1h 20m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=409613&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409613
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 25/Mar/20 16:11
Start Date: 25/Mar/20 16:11
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #132: MATH-1509: Add 
missing documentation to class ImprovementEvaluator
URL: https://github.com/apache/commons-math/pull/132
 
 
   
 


Issue Time Tracking
---

Worklog Id: (was: 409613)
Time Spent: 1h 20m  (was: 1h 10m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=409577&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409577
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 25/Mar/20 15:24
Start Date: 25/Mar/20 15:24
Worklog Time Spent: 10m 
  Work Description: coveralls commented on issue #132: MATH-1509: Add 
missing documentation to class ImprovementEvaluator
URL: https://github.com/apache/commons-math/pull/132#issuecomment-603903887
 
 
   
   [![Coverage 
Status](https://coveralls.io/builds/29609723/badge)](https://coveralls.io/builds/29609723)
   
   Coverage increased (+0.005%) to 90.553% when pulling 
**01227337f8d6645550a9559bef1a57297feab7b6 on 
chentao106:ImprovementEvaluatorDoc** into 
**6b0395898e9469fda20f011ded8dce3f9d0df907 on apache:master**.
   
 


Issue Time Tracking
---

Worklog Id: (was: 409577)
Time Spent: 1h 10m  (was: 1h)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-25 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=409548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-409548
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 25/Mar/20 14:47
Start Date: 25/Mar/20 14:47
Worklog Time Spent: 10m 
  Work Description: chentao106 commented on pull request #132: MATH-1509: 
Add missing documentation to class ImprovementEvaluator
URL: https://github.com/apache/commons-math/pull/132
 
 
   Add missing documentation to class ImprovementEvaluator
 


Issue Time Tracking
---

Worklog Id: (was: 409548)
Time Spent: 1h  (was: 50m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-23 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=408538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408538
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 24/Mar/20 04:28
Start Date: 24/Mar/20 04:28
Worklog Time Spent: 10m 
  Work Description: coveralls commented on issue #129: #MATH-1509: Add 
missing documentation to class MiniBatchKMeansCluster…
URL: https://github.com/apache/commons-math/pull/129#issuecomment-603007764
 
 
   
   [![Coverage 
Status](https://coveralls.io/builds/29569884/badge)](https://coveralls.io/builds/29569884)
   
   Coverage increased (+0.008%) to 90.556% when pulling 
**e5fb5e16a25fd408f673eeb5c257c8bdce715f84 on 
chentao106:MiniBatchImprovementEvaluator** into 
**6b0395898e9469fda20f011ded8dce3f9d0df907 on apache:master**.
   
 


Issue Time Tracking
---

Worklog Id: (was: 408538)
Time Spent: 50m  (was: 40m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-23 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=408529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-408529
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 24/Mar/20 04:18
Start Date: 24/Mar/20 04:18
Worklog Time Spent: 10m 
  Work Description: chentao106 commented on pull request #129: #MATH-1509: 
Add missing documentation to class MiniBatchKMeansCluster…
URL: https://github.com/apache/commons-math/pull/129
 
 
   Add missing documentation to class 
MiniBatchKMeansCluster.ImprovementEvaluator.
 


Issue Time Tracking
---

Worklog Id: (was: 408529)
Time Spent: 40m  (was: 0.5h)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-22 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=407613&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407613
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 22/Mar/20 15:11
Start Date: 22/Mar/20 15:11
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #128: #MATH-1509: 
Implement the MiniBatchKMeansClusterer.
URL: https://github.com/apache/commons-math/pull/128
 
 
   
 


Issue Time Tracking
---

Worklog Id: (was: 407613)
Time Spent: 0.5h  (was: 20m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=407536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407536
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 22/Mar/20 04:19
Start Date: 22/Mar/20 04:19
Worklog Time Spent: 10m 
  Work Description: coveralls commented on issue #128: #MATH-1509: 
Implement the MiniBatchKMeansClusterer.
URL: https://github.com/apache/commons-math/pull/128#issuecomment-602145882
 
 
   
   [![Coverage 
Status](https://coveralls.io/builds/29527552/badge)](https://coveralls.io/builds/29527552)
   
   Coverage increased (+0.04%) to 90.559% when pulling 
**cd7df89611d0d6e60f0133bb155894e391c8b3f8 on 
chentao106:feature-minibatchkmeans++** into 
**22373aeb76811aae77f581143e9fed34580316eb on apache:master**.
   
 


Issue Time Tracking
---

Worklog Id: (was: 407536)
Time Spent: 20m  (was: 10m)


[jira] [Work logged] (MATH-1509) Implement the MiniBatchKMeansClusterer

2020-03-21 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/MATH-1509?focusedWorklogId=407535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407535
 ]

ASF GitHub Bot logged work on MATH-1509:


Author: ASF GitHub Bot
Created on: 22/Mar/20 04:09
Start Date: 22/Mar/20 04:09
Worklog Time Spent: 10m 
  Work Description: chentao106 commented on pull request #128: #MATH-1509: 
Implement the MiniBatchKMeansClusterer.
URL: https://github.com/apache/commons-math/pull/128
 
 
   
 


Issue Time Tracking
---

Worklog Id: (was: 407535)
Remaining Estimate: 0h
Time Spent: 10m

