Hi All The newly proposed design of "commons-math4-ga2" has two primary issues which I would like to mention here.
*1) GA logic*: The design does not conform to the basic genetic algorithm concepts proposed by John Holland. The pseudocode representing the original algorithm logic is provided below: --CUT-- while(!converged(population)) { Population newPopulation = new Population(); for(int i = 0; i < size(population)/2; i++) { // select parents ChromosomePair parents = select(population); // do crossover ChromosomePair offsprings = crossover(parents); //do mutation Chromosome chromosome1 = mutate(offsprings[0]); Chromosome chromosome2 = mutate(offsprings[1]); // Add mutated chromosomes to population newPopulation.add(chromosome1); newPopulation.add(chromosome2); } } --CUT-- However, the implementation proposed in "commons-math4-ga2" can be represented by the pseudocode provided below. --CUT-- while(!converged(population)) { List<GeneticOperator> operators; Population newPopulation = new Population(); for(int i = 0; i < size(population)/2; i++) { for(GeneticOperator operator : operators) { // select parents ChromosomePair parents = select(population); // apply operator ChromosomePair offsprings = operator.apply(parents); // Add chromosomes to population newPopulation.add( offsprings[0] ); newPopulation.add( offsprings[1] ); } } } --CUT-- N.B. The use of probability and elitism has been avoided to keep the logic simplified. The first one has been used by the engineering community for decades and is proved to be effective. There is also a mathematical model based on schema theorem( https://en.wikipedia.org/wiki/Holland%27s_schema_theorem#:~:text=The%20Schema%20Theorem%20says%20that,the%20power%20of%20genetic%20algorithms.) to support the effectiveness of the algorithm. Same has been followed by me for implementation of "commons-math4-ga" module. However, the new design proposed as part of "commons-math4-ga2" deviates from the basic logic. It does not distinguish the operators i.e. crossover and mutation and treats them uniformly. The order of operator application is also not considered. Along with that it executes parent selection two times instead of one. These are clear deviations from the standard approach used so far and would require a fix. *2) Determination of mutation probability*: The newly proposed design of "commons-math4-ga2" determines the probability of mutation at the algorithm level. Same approach was used in math 3.x implementation. However, this approach considers the probability of mutation at the chromosome level not at the allele/gene level. I have found a considerable difference in the quality of optimization between two cases. Determining the mutation probability at the gene/allele level has given a considerably better result. Usage of mutation probability at the chromosome level would only ensure mutation of a single allele irrespective of probability or chromosome size. There is no such limitation in case the mutation probability is decided at the allele level and can be easily controlled by users for fine tuning. This has helped to improve the optimization quality thus providing better results. This is only related to mutation not crossover. But we can maintain an uniform approach and let the operator decide on the probability. Please share further thoughts. Thanks & Regards -- Avijit Basak