[ 
https://issues.apache.org/jira/browse/MATH-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331130#comment-17331130
 ] 

Paul King commented on MATH-1563:
---------------------------------

There has been talk in other threads about where this functionality might 
"best" belong, e.g. perhaps in Apache Spark. I thought I would add some stats 
from my usage of related algorithms. For some of my data science work, I use 
commons-math directly as well as Apache Spark and Apache Ignite. I see 
commons-math as a handy lightweight library, whereas Spark and Ignite are more 
like platforms. Sometimes, when searching for e.g. a word in some text, it is 
appropriate to use a regex library or even String#contains. Other times, a 
"platform" like Lucene will be more appropriate. If my "text" is very large, 
Lucene is going to scale in ways that String#contains won't. The same applies 
to data science algorithms.

For linear regression, taking just one example dataset, commons-math needs a 
couple of library calls against a single 2 MB library and solves the problem 
in 240 ms. Both Ignite and Spark involve "firing up the platform", and the code 
is more complex for simple scenarios. Spark has a 181 MB footprint across 210 
jars and solves the problem in about 20 s. Ignite has an 87 MB footprint across 
85 jars and takes more than 40 s. But I can also find more complex scenarios 
that need to scale, where Ignite and Spark really come into their own.
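
To make the "simple scenario" concrete: fitting a line to one predictor is a 
closed-form computation that runs in-process with no platform start-up. The 
sketch below is only an illustration under stated assumptions. It uses the 
plain JDK rather than the commons-math API (which offers this through classes 
such as SimpleRegression), and the class name SimpleFit and its fit method are 
hypothetical.

```java
// Hypothetical helper, not part of commons-math: closed-form ordinary
// least squares for the model y = intercept + slope * x.
final class SimpleFit {
    private SimpleFit() {}

    /** Returns a two-element array {slope, intercept}. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "need at least two (x, y) pairs of equal length");
        }
        int n = x.length;

        // Means of both variables.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-deviations (sxy) and squared x-deviations (sxx).
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values are all identical");
        }

        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {slope, intercept};
    }
}
```

A call such as SimpleFit.fit(xs, ys) is the whole "program" in this case, which 
is the kind of footprint the timing comparison above is about.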

So for me, the first obvious home would be commons-math (exactly where is still 
a healthy debate). If it proves popular, we can consider how to support 
parallelism, e.g. via threads or via Loom fibers, which may be in more common 
use by then. And if it does turn out to be popular, Ignite and Spark could no 
doubt also consider how they might take it and do the extra work to make it 
efficient within their platforms.



> Implementation of Adaptive Probability Generation Strategy for Genetic 
> Algorithm
> --------------------------------------------------------------------------------
>
>                 Key: MATH-1563
>                 URL: https://issues.apache.org/jira/browse/MATH-1563
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: AVIJIT BASAK
>            Priority: Major
>
> In a genetic algorithm, the probabilities of the crossover and mutation 
> operations can be generated adaptively. Some experiments on this were done 
> and published in this article: 
> "https://www.ijcaonline.org/archives/volume175/number10/basak-2020-ijca-920572.pdf".
> Currently Apache's API uses a constant probability strategy. I would like 
> to propose incorporating the rank-based adaptive probability generation 
> strategy described in that article. This will improve the performance and 
> robustness of the algorithm and make it more suitable for use in 
> higher-dimensional problems such as machine learning or deep learning.
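
To illustrate the difference between a constant and a rank-based probability, 
here is a minimal sketch. It is an assumption made for illustration only: the 
linear mapping from rank to probability, the class name RankBasedProbability 
and the method forRank are all hypothetical, and are not taken from the article 
or from the commons-math API.

```java
// Hypothetical illustration: derive an operator probability from a
// chromosome's rank instead of using one constant for the whole population.
final class RankBasedProbability {
    private RankBasedProbability() {}

    /**
     * Linearly interpolates between min and max by rank.
     * Rank 0 is the fittest chromosome and gets min; the least fit
     * (rank populationSize - 1) gets max. This mapping is an assumption.
     */
    static double forRank(int rank, int populationSize, double min, double max) {
        if (populationSize < 1 || rank < 0 || rank >= populationSize) {
            throw new IllegalArgumentException("rank must be in [0, populationSize)");
        }
        if (min < 0.0 || max > 1.0 || min > max) {
            throw new IllegalArgumentException("need 0 <= min <= max <= 1");
        }
        if (populationSize == 1) {
            return min; // a single chromosome is trivially the best
        }
        double fraction = (double) rank / (populationSize - 1);
        return min + (max - min) * fraction;
    }
}
```

With a constant strategy every chromosome would get the same value. Here 
weaker chromosomes are perturbed more often than strong ones, which is one 
plausible reading of "adaptive".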


