Re: [MATH][GA] Issues in "commons-math4-ga2" design

2022-10-15 Thread Avijit Basak
Hi All

Please see my comments below. Kindly share further thoughts.

> [...]
>I'm not sure what you mean: The examples just run a GA-like algorithm,
>but (AFAICT) do not compare the output to some expected outcome.
-- I have made some code changes in the "examples-ga-math-functions" module
to compare the results of the two modules, "commons-math4-ga" and
"commons-math4-ga2". The comparison is graphical, using JFreeChart. A new
value "COMPARE" has been introduced for the "--api" input argument to
initiate the comparison.
The "commons-math4-ga" module consistently provided better results than
"commons-math4-ga2".
The code is in my repo:
https://github.com/avijit-basak/commons-math/tree/feature/MATH-1563_comparison
I have not raised a PR so far; the branch is kept in my repo only for
comparison.
Could you please check whether the feature__MATH-1563__genetic_algorithm
branch contains the changes from master of the Apache repo?

>> This variant design is more appropriate for a *generalized population
>> based stochastic optimizer* which can accommodate other algorithms like
>> multi-agent gradient descent/simulated annealing, genetic algorithm
>> (already implemented), particle swarm optimization and large
>> neighbourhood search etc.
>> If we want to stick to this new design I would rather suggest *renaming*
>> of the existing interfaces so that the API can be more generic and can
>> be used for all other algorithms. GA should be a specific implementation
>> for that API.
>> However, we might have to think more on the multiple operator scenarios.
>
>An interesting suggestion.  If the generalized API can be achieved
>easily, I'm all for it.
>However, I wonder how useful it will be, as every actual optimizer
>implementation may
> * require substantial adaptations to fit the common API
> * need extensions to provide access to specific features (which
>   would decrease the usefulness of the common API for users).
[...]
-- We can avoid that for now as that will be a bigger task.

>
>[1] My main argument for the "GA variant" is that it is much simpler, for
> what seems equivalent functionality (bugs, or misinterpretation of
> expected behaviour, notwithstanding): Current counts of lines of
> code is 696 vs 2038.
-- The variant only contains options for the binary genotype, whereas the
"commons-math4-ga" module provides options for other genotypes too. So the
line counts are not directly comparable. However, considering the
optimization results and the available genotypes, I would still vote for
"commons-math4-ga" instead of its new variant.

Thanks & Regards
--Avijit Basak

On Thu, 29 Sept 2022 at 22:42, Gilles Sadowski  wrote:

> Hello.
>
> Le jeu. 29 sept. 2022 à 14:07, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> >  Please find my comments below:
> >
> > >
> > >> Hi All
> > >>
> > >>  The newly proposed design of "commons-math4-ga2" has two primary
> > >> issues which I would like to mention here.
> > >>
> > >> *1) GA logic*: The design does not conform to the basic genetic
> > >> algorithm
> > >I understand the concern about providing the standard ("historical") GA.
> > >The theorem assumes the standard GA, but the example shows that
> > >convergence is also achieved with the variant.
> >
> > -- Yes the new variant can accommodate the standard GA too.
> >
> > >
> > >> However, the new design proposed as part of "commons-math4-ga2"
> > >> deviates from the basic logic. It does not distinguish the operators
> > >> i.e. crossover and mutation and treats them uniformly. The order of
> > >> operator application is also not considered.
> > >
> > >All intended as "features". ;-)
> > >[One being that, in the variant implementation, it is possible to apply
> > >any number of operators, not just one specific crossover followed by
> > >one mutation.]
> > >
> > >Shouldn't we be able (IIUC) to define the standard GA procedure by
> > >an extension of the API like the following (untested):
> > >---CUT---
> > >public class CrossoverThenMutate<G>
> > >    extends AbstractCrossover<G> {
> > >    private AbstractCrossover<G> c;
> > >    private AbstractMutation<G> m;
> > >    [...]
> > >    private List<G> mutate(G parent,
> > >                           UniformRandomProvider rng) {
> > >        final List<G> p = new ArrayList<>(1);
> > >        p.add(parent);
> > >  

Re: [MATH][GA] Issues in "commons-math4-ga2" design

2022-09-29 Thread Avijit Basak
Hi All

 Please find my comments below:

>
>> Hi All
>>
>>  The newly proposed design of "commons-math4-ga2" has two primary
>> issues which I would like to mention here.
>>
>> *1) GA logic*: The design does not conform to the basic genetic algorithm
>I understand the concern about providing the standard ("historical") GA.
>The theorem assumes the standard GA, but the example shows that
>convergence is also achieved with the variant.

-- Yes the new variant can accommodate the standard GA too.

>
>> However, the new design proposed as part of "commons-math4-ga2"
>> deviates from the basic logic. It does not distinguish the operators i.e.
>> crossover and mutation and treats them uniformly. The order of
>> operator application is also not considered.
>
>All intended as "features". ;-)
>[One being that, in the variant implementation, it is possible to apply
>any number of operators, not just one specific crossover followed by
>one mutation.]
>
>Shouldn't we be able (IIUC) to define the standard GA procedure by
>an extension of the API like the following (untested):
>---CUT---
>public class CrossoverThenMutate<G>
>    extends AbstractCrossover<G> {
>    private AbstractCrossover<G> c;
>    private AbstractMutation<G> m;
>    [...]
>    private List<G> mutate(G parent,
>                           UniformRandomProvider rng) {
>        final List<G> p = new ArrayList<>(1);
>        p.add(parent);
>        return m.apply(p, rng);
>    }
>}
>---CUT---
>
>AFAICT, a standard GA would thus be performed if this combined
>operator would be used as a unique operator in the GA variant.
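
For illustration, the composition idea quoted above can be sketched
independently of the library. This is only a sketch under assumptions: the
"Operator" interface and the "CombinedOperators" helpers are hypothetical
names (they are not the "commons-math4-ga2" classes), and java.util.Random
stands in for UniformRandomProvider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical operator type: maps a list of parents to a list of offspring. */
interface Operator<G> {
    List<G> apply(List<G> parents, Random rng);
}

final class CombinedOperators {
    private CombinedOperators() {}

    /** Wraps two operators into one: crossover first, then mutation of each child. */
    static <G> Operator<G> crossoverThenMutate(Operator<G> crossover, Operator<G> mutation) {
        return (parents, rng) -> {
            final List<G> result = new ArrayList<>();
            for (G child : crossover.apply(parents, rng)) {
                final List<G> single = new ArrayList<>(1);
                single.add(child);
                result.addAll(mutation.apply(single, rng));
            }
            return result;
        };
    }

    /** One-point crossover on two bit strings of equal length (at least 2). */
    static Operator<boolean[]> onePointCrossover() {
        return (parents, rng) -> {
            final boolean[] c1 = parents.get(0).clone();
            final boolean[] c2 = parents.get(1).clone();
            final int cut = 1 + rng.nextInt(c1.length - 1);
            // Swap the tails of the two copies.
            for (int i = cut; i < c1.length; i++) {
                final boolean tmp = c1[i];
                c1[i] = c2[i];
                c2[i] = tmp;
            }
            final List<boolean[]> offspring = new ArrayList<>(2);
            offspring.add(c1);
            offspring.add(c2);
            return offspring;
        };
    }

    /** Flips every bit independently with the given probability. */
    static Operator<boolean[]> bitFlip(double probability) {
        return (parents, rng) -> {
            final List<boolean[]> mutated = new ArrayList<>(parents.size());
            for (boolean[] parent : parents) {
                final boolean[] copy = parent.clone();
                for (int i = 0; i < copy.length; i++) {
                    if (rng.nextDouble() < probability) {
                        copy[i] = !copy[i];
                    }
                }
                mutated.add(copy);
            }
            return mutated;
        };
    }
}
```

Used as the unique operator, crossoverThenMutate(onePointCrossover(),
bitFlip(p)) performs one selection-independent "crossover followed by
mutation" step per pair of parents.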

-- If we take this approach, we may need to modify our examples, which
assume the standard GA.
This variant design is more appropriate for a *generalized population-based
stochastic optimizer* which can accommodate other algorithms such as
multi-agent gradient descent/simulated annealing, genetic algorithm (already
implemented), particle swarm optimization and large neighbourhood search.
If we want to stick to this new design, I would rather suggest *renaming*
the existing interfaces so that the API is more generic and can be used
for all the other algorithms. GA should be a specific implementation of that
API.
However, we might have to think more about the multiple-operator scenarios.

>
>> Along with that it executes
>> parent selection two times instead of one.
>
>That would also be taken care of with the above combined operator.
>
>> These are clear deviations from the standard approach used so far and
>> would require a fix.
>>
>>
>> *2) Determination of mutation probability*: The newly proposed design of
>> "commons-math4-ga2" determines the probability of mutation at the
>> algorithm level. Same approach was used in math 3.x implementation.
>> However, this approach considers the probability of mutation at the
>> chromosome level not at the allele/gene level. I have found a
>> considerable difference in the quality of optimization between two
>> cases. Determining the mutation probability at the gene/allele level
>> has given a considerably better result.
>
>A runnable test case (that creates a comparison) would be quite useful
>to illustrate the feature.
>
>> Usage of mutation probability at the chromosome
>> level would only ensure mutation of a single allele irrespective of
>> probability
>
>?
>In the basic implementation for the "binary" genotype (in class
>"o.a.c.m.ga2.gene.binary.Mutation"), there is a loop over all the
>alleles.
>
>> or chromosome size. There is no such limitation in case the
>> mutation probability is decided at the allele level and can be easily
>> controlled by users for fine tuning. This has helped to improve the
>> optimization quality thus providing better results. This is only related
>> to mutation not crossover. But we can maintain an uniform approach and
>> let the operator decide on the probability.
>
>I don't understand.
>Please refer to the class mentioned above and describe the required
>modifications.
-- E.g. assume the user has a population of 10 chromosomes, each of
length 10.

mutation      no. of alleles modified   no. of alleles modified
probability   per chromosome            in population
0.2           2                         20
0.1           1                         10
0.05          --                        5
 

[MATH][GA] Issues in "commons-math4-ga2" design

2022-08-15 Thread Avijit Basak
Hi All

 The newly proposed design of "commons-math4-ga2" has two primary
issues which I would like to mention here.

*1) GA logic*: The design does not conform to the basic genetic algorithm
concepts proposed by John Holland. The pseudocode representing the original
algorithm logic is provided below:
--CUT--
  while (!converged(population)) {
      Population newPopulation = new Population();
      for (int i = 0; i < size(population) / 2; i++) {
          // select parents
          ChromosomePair parents = select(population);
          // do crossover
          ChromosomePair offsprings = crossover(parents);
          // do mutation
          Chromosome chromosome1 = mutate(offsprings[0]);
          Chromosome chromosome2 = mutate(offsprings[1]);
          // add mutated chromosomes to the new population
          newPopulation.add(chromosome1);
          newPopulation.add(chromosome2);
      }
      // the new generation replaces the current one
      population = newPopulation;
  }
--CUT--
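
For readers who want to run the standard loop, here is a minimal,
self-contained sketch on bit strings. It is not the "commons-math4-ga" code:
the class name "StandardGaSketch", the OneMax fitness (count of "true"
alleles), the binary tournament selection and the fixed number of generations
(used instead of a convergence test) are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class StandardGaSketch {
    private StandardGaSketch() {}

    /** OneMax fitness: number of "true" alleles. */
    static int fitness(boolean[] c) {
        int n = 0;
        for (boolean b : c) {
            if (b) {
                n++;
            }
        }
        return n;
    }

    /** Single selection step returning a pair of parents (binary tournaments). */
    static boolean[][] select(List<boolean[]> pop, Random rng) {
        return new boolean[][] {tournament(pop, rng), tournament(pop, rng)};
    }

    private static boolean[] tournament(List<boolean[]> pop, Random rng) {
        final boolean[] a = pop.get(rng.nextInt(pop.size()));
        final boolean[] b = pop.get(rng.nextInt(pop.size()));
        return fitness(a) >= fitness(b) ? a : b;
    }

    /** One-point crossover (chromosome length must be at least 2). */
    static boolean[][] crossover(boolean[][] parents, Random rng) {
        final boolean[] c1 = parents[0].clone();
        final boolean[] c2 = parents[1].clone();
        final int cut = 1 + rng.nextInt(c1.length - 1);
        for (int i = cut; i < c1.length; i++) {
            c1[i] = parents[1][i];
            c2[i] = parents[0][i];
        }
        return new boolean[][] {c1, c2};
    }

    /** Bit-flip mutation applied independently to every allele. */
    static boolean[] mutate(boolean[] c, double rate, Random rng) {
        final boolean[] m = c.clone();
        for (int i = 0; i < m.length; i++) {
            if (rng.nextDouble() < rate) {
                m[i] = !m[i];
            }
        }
        return m;
    }

    /** One generation: select once per pair, cross over, then mutate. */
    static List<boolean[]> nextGeneration(List<boolean[]> population, double rate, Random rng) {
        final List<boolean[]> next = new ArrayList<>(population.size());
        for (int i = 0; i < population.size() / 2; i++) {
            final boolean[][] offspring = crossover(select(population, rng), rng);
            next.add(mutate(offspring[0], rate, rng));
            next.add(mutate(offspring[1], rate, rng));
        }
        return next;
    }

    /** Runs a fixed number of generations; returns the best fitness of the last one. */
    static int run(int size, int length, int generations, long seed) {
        final Random rng = new Random(seed);
        List<boolean[]> population = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            final boolean[] c = new boolean[length];
            for (int j = 0; j < length; j++) {
                c[j] = rng.nextBoolean();
            }
            population.add(c);
        }
        for (int g = 0; g < generations; g++) {
            population = nextGeneration(population, 1.0 / length, rng);
        }
        int best = 0;
        for (boolean[] c : population) {
            best = Math.max(best, fitness(c));
        }
        return best;
    }
}
```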

However, the implementation proposed in "commons-math4-ga2" can be
represented by the pseudocode provided below.
--CUT--
  while (!converged(population)) {
      List<GeneticOperator> operators;
      Population newPopulation = new Population();
      for (int i = 0; i < size(population) / 2; i++) {
          for (GeneticOperator operator : operators) {
              // select parents
              ChromosomePair parents = select(population);
              // apply operator
              ChromosomePair offsprings = operator.apply(parents);
              // add chromosomes to the new population
              newPopulation.add(offsprings[0]);
              newPopulation.add(offsprings[1]);
          }
      }
      // the new generation replaces the current one
      population = newPopulation;
  }
--CUT--
N.B. The use of probability and elitism has been omitted to keep the logic
simple.

The first one has been used by the engineering community for decades
and has proved to be effective. There is also a mathematical model based on
the schema theorem
(https://en.wikipedia.org/wiki/Holland%27s_schema_theorem)
to support the effectiveness of the algorithm. I have followed the same
for the implementation of the "commons-math4-ga" module.
However, the new design proposed as part of "commons-math4-ga2"
deviates from the basic logic. It does not distinguish between the operators,
i.e. crossover and mutation, and treats them uniformly. The order of
operator application is also not considered. In addition, it executes
parent selection twice instead of once.
These are clear deviations from the standard approach used so far and would
require a fix.


*2) Determination of mutation probability*: The newly proposed design of
"commons-math4-ga2" determines the probability of mutation at the algorithm
level. The same approach was used in the math 3.x implementation. However,
this approach applies the mutation probability at the chromosome level, not
at the allele/gene level. I have found a considerable difference in the
quality of optimization between the two cases: determining the mutation
probability at the gene/allele level has given considerably better
results. Using the mutation probability at the chromosome level would only
ensure mutation of a single allele, irrespective of the probability or the
chromosome size. There is no such limitation when the mutation probability
is decided at the allele level, where it can easily be controlled by users
for fine tuning. This only relates to mutation, not crossover, but we can
maintain a uniform approach and let the operator decide on the probability.
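
To make the two granularities concrete, here is a small sketch. It encodes
the two schemes as they are described in this message, which is an assumption
about the "chromosome level" behaviour and not a statement about the actual
"commons-math4-ga2" code; the class and method names are hypothetical.

```java
import java.util.Random;

final class MutationGranularity {
    private MutationGranularity() {}

    /** Allele level: every allele flips independently with probability p. Returns the number of flips. */
    static int mutateAlleleLevel(boolean[] chromosome, double p, Random rng) {
        int flips = 0;
        for (int i = 0; i < chromosome.length; i++) {
            if (rng.nextDouble() < p) {
                chromosome[i] = !chromosome[i];
                flips++;
            }
        }
        return flips;
    }

    /** Chromosome level (as described above): with probability p, one randomly chosen allele flips. */
    static int mutateChromosomeLevel(boolean[] chromosome, double p, Random rng) {
        if (rng.nextDouble() < p) {
            final int i = rng.nextInt(chromosome.length);
            chromosome[i] = !chromosome[i];
            return 1;
        }
        return 0;
    }

    /** Expected number of flipped alleles per chromosome, allele-level scheme. */
    static double expectedFlipsAlleleLevel(double p, int length) {
        return p * length;
    }

    /** Expected number of flipped alleles per chromosome, chromosome-level scheme. */
    static double expectedFlipsChromosomeLevel(double p) {
        return p;
    }
}
```

With p = 0.2 and a chromosome of length 10, the allele-level scheme flips 2
alleles per chromosome on average, whereas the single-allele scheme flips 0.2
on average and never more than one.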

    Please share further thoughts.


Thanks & Regards
-- Avijit Basak


Re: [Math] Review of "genetic algorithm" module

2022-07-03 Thread Avijit Basak
st.
>> >
>> >A class to be used as a key only needs to implement "equals" and
>> >"hashCode".
>> -- The current chromosome class implements Comparable interface which
>> uses chromosome fitness for comparison. Use of both Comparable and
>> equals() might introduce inconsistencies.
>
>An example?

-- An inconsistency can appear whether we provide a custom implementation of
equals() and hashCode() based on the chromosome's representation or use the
default implementation.
Comparable compares by fitness value and, as described above, two
chromosomes with different representations can have the same fitness
value; compareTo() would then return 0 for chromosomes that are not equal.
But in the new module the chromosome does not implement Comparable, so this
cannot happen.
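
A minimal sketch of that inconsistency, using a hypothetical
"FitnessOrderedChromosome" class (not the library's chromosome class) whose
equality follows the representation while its ordering follows the fitness:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical chromosome: equality by representation, ordering by fitness. */
final class FitnessOrderedChromosome implements Comparable<FitnessOrderedChromosome> {
    private final int[] genes;
    private final double fitness;

    FitnessOrderedChromosome(int[] genes, double fitness) {
        this.genes = genes.clone();
        this.fitness = fitness;
    }

    @Override
    public int compareTo(FitnessOrderedChromosome other) {
        return Double.compare(fitness, other.fitness);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof FitnessOrderedChromosome
            && Arrays.equals(genes, ((FitnessOrderedChromosome) o).genes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(genes);
    }
}

final class OrderingConsistencyDemo {
    private OrderingConsistencyDemo() {}

    /** Number of elements kept by a sorted set, which relies on compareTo(). */
    static int distinctByOrdering(FitnessOrderedChromosome... chromosomes) {
        final Set<FitnessOrderedChromosome> set = new TreeSet<>(Arrays.asList(chromosomes));
        return set.size();
    }

    /** Number of elements kept by a hash set, which relies on equals()/hashCode(). */
    static int distinctByEquality(FitnessOrderedChromosome... chromosomes) {
        final Set<FitnessOrderedChromosome> set = new HashSet<>(Arrays.asList(chromosomes));
        return set.size();
    }
}
```

Two chromosomes with different genes but the same fitness are kept as two
elements by a HashSet, but a TreeSet treats them as duplicates and keeps only
one.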

>
>> >
>> >> >
>> >> >(6)
>> >> >o.a.c.m.ga.chromosome.AbstractChromosome
>> >> >
>> >> >Field "fitness" is not "final", yet it could be: a "FitnessFunction"
>> >> >object (used in "evaluate()" to compute that field) is passed to the
>> >> >constructor.  Is there a reason for the "lazy" evaluation?
>> >> >Dropping it would make the instance immutable (and "evaluate()"
>> >> >should be renamed to "getFitness()").
>> >> >
>> >> >Why should the "FitnessFunction" be stored in every chromosome?
>> >> >
>> >> -- I have modified the fitness as final and initialized the same in
>> >> the constructor.
>> >
>> >Better, but did you check my proposal in MATH-1618, where
>> >Chromosome and fitness are decoupled, and their relationship
>> >is held within a "Population" instance?
>> --Mentioned earlier.
>
>I still don't know whether you agree that my proposal makes it
>simpler to express a GA.

-- I think there are a few points where we are not aligned; those are
mentioned in the summary section.

>
>> >
>> >> [...]
>
>> >
>> >> >
>> >> >(9)
>> >> >Naming of factory methods should be harmonized to match the
>> >> >convention adopted in components like [RNG] and [Numbers].
>> >> >E.g. instead of "newChromosome(...)", please use "of(...)" or
>> >> >"from(...)" for "value object", and "create(...)" otherwise.
>> >> >
>> >> -- I have renamed the same for Chromosome classes.
>> >> What about the nextGeneration() method of ListPopulation class.
>> >> Renaming this to create() or from() won't convey the purpose of it.
>> >
>> >I agree, and that's why the new "Population" class (in MATH-1618) does
>> >not provide a factory method (see also the "GeneticAlgorithmFactory"
>> >class).
>> -- We can avoid the same in the current model if we agree to use a
>> default implementation of population and remove the Population interface
>> following your new model.
>
>So, do we adopt that "new model"?
>Or do you still have objections?

-- Mentioned above.

>
>> >
>> >> >(10)
>> >> >o.a.c.m.ga.chromosome.AbstractListChromosome
>> >> >
>> >> >Constructor is called with an argument that is a previously
>> >> >instantiated "representation".  If the latter is mutable, the caller
>> >> >will be able to modify the underlying data structure of the newly
>> >> >created chromosome.  [The doc assumes immutability of the
>> >> >representation but this cannot be enforced, and mixed ownership can
>> >> >entail subtle bugs.]
>> >> -- I think this applies to both representation as well as generic
>> >> parameter type T. But I don't see any other option but to rely on
>> >> the user.
>> >
>> >The Javadoc (at line 84) is misleading in its mention of "immutable".
>> >
>> >> If you have any suggestions kindly share.
>> >
>> >I may not understand all the implications, but I'd suggest that the
>> >"representation" be instantiated within the control of the library (e.g.
>> >through a "builder"/"factory").
>> -- Currently we have the ChromosomeRepresentationUtils for the same. Its
>> methods are designed to generate the representations.
>
>My suggestion is that this design can be improved (a.o. according to my
>above suggestion).

-- Sure.
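
One possible shape for a library-controlled representation is a factory
method that takes a defensive copy. This is only a sketch: the class
"ListChromosomeSketch" and its "of(...)" factory are hypothetical names, not
the AbstractListChromosome API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListChromosomeSketch<T> {
    /** Unmodifiable view over a private copy of the caller's list. */
    private final List<T> representation;

    private ListChromosomeSketch(List<T> representation) {
        this.representation = representation;
    }

    /** Factory: copies the alleles so that the caller keeps no handle on the internal list. */
    static <T> ListChromosomeSketch<T> of(List<T> alleles) {
        return new ListChromosomeSketch<>(
            Collections.unmodifiableList(new ArrayList<>(alleles)));
    }

    /** Read-only access; attempts to modify the returned list throw UnsupportedOperationException. */
    List<T> getRepresentation() {
        return representation;
    }
}
```

The copy is shallow: it protects the list structure, but if the allele type T
is itself mutable, the elements can still be changed through references held
by the caller.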

>
>> >
>> &g

Re: [Math] Review of "genetic algorithm" module

2022-05-18 Thread Avijit Basak
r, but did you check my proposal in MATH-1618, where
>Chromosome and fitness are decoupled, and their relationship
>is held within a "Population" instance?
--Mentioned earlier.

>
>> >(7)
>> >Spurious "@since" tags: In the new code (in "commons-math-ga"
>> >module), none should refer to a version < 4.0.
>> >
>> -- Some files are taken unchanged from the previous release. I have kept
>> the same @since tag for those files.
>> Do you need any change here?
>
>The old and new files are in different packages; "@since" tags
>thus make no sense IMO.
--I shall change it.

>
>>
>> >(8)
>> >@SuppressWarnings("unchecked")
>> >
>> >By default, I'm a bit suspicious about having to resort to these
>> >annotations, especially for the kind of algorithms we are trying to
>> >implement.
>> >What do you think of the alternative approach outlined in the ZIP file
>> >attached in MATH-1618:
>> >https://issues.apache.org/jira/browse/MATH-1618
>> >?
>> -- This annotation is required because we have kept an option to use
>> different types of genotypes including primitive.
>> Because of that our base interfaces only declares phenotype not genotype.
>> This introduced a kind of hierarchy in all operators and chromosome
>> classes which required us to use the mentioned annotation.
>
>I may again be missing something.
>Could you please explain the case that makes these annotations
>necessary.
-- It has only been used to suppress the warning where typecasting is done.
However, I can work to minimize this following your new model.

>
>> Even with the proposed new architecture we may not be able to avoid the
>> same.
>
>The classes which I've added do not use the annotation...
>
>> -- It will be good if you can share some more information about the newly
>> proposed architecture. The areas of current design which it can improve
>> as well as the underlying intention.
>
>As noted in the comment on the JIRA page, the main intention is
>maximal decoupling of functionalities that make up a GA (population,
>fitness, selection, operator) and that seems achieved with the provided
>classes.
>
>> >
>> >(9)
>> >Naming of factory methods should be harmonized to match the convention
>> >adopted in components like [RNG] and [Numbers].
>> >E.g. instead of "newChromosome(...)", please use "of(...)" or
>> >"from(...)" for "value object", and "create(...)" otherwise.
>> >
>> -- I have renamed the same for Chromosome classes.
>> What about the nextGeneration() method of ListPopulation class. Renaming
>> this to create() or from() won't convey the purpose of it.
>
>I agree, and that's why the new "Population" class (in MATH-1618) does
>not provide a factory method (see also the "GeneticAlgorithmFactory"
>class).
-- We can avoid the same in the current model if we agree to use a default
implementation of population and remove the Population interface following
your new model.

>
>> >(10)
>> >o.a.c.m.ga.chromosome.AbstractListChromosome
>> >
>> >Constructor is called with an argument that is a previously instantiated
>> >"representation".  If the latter is mutable, the caller will be able to
>> >modify the underlying data structure of the newly created chromosome.
>> >[The doc assumes immutability of the representation but this cannot be
>> >enforced, and mixed ownership can entail subtle bugs.]
>> -- I think this applies to both representation as well as generic
>> parameter type T. But I don't see any other option but to rely on the
>> user.
>
>The Javadoc (at line 84) is misleading in its mention of "immutable".
>
>> If you have any suggestions kindly share.
>
>I may not understand all the implications, but I'd suggest that the
>"representation" be instantiated within the control of the library (e.g.
>through a "builder"/"factory").
-- Currently we have the ChromosomeRepresentationUtils for the same. Its
methods are designed to generate the representations.

>
>> >
>> >(11)
>> >Do we agree that, in a GA, the most time-consuming task is the fitness
>> >computation?  Hence IMO, it should be the focus of the multithreading
>> >tools (i.e. "ExecutorService"), probably keeping the other parts (namely
>> >the genetic operators) within a simple sequential loop (as in class
>> >"GeneticAlgorithmFactory" in MATH-1618).
>> -- Curren

Re: [Math] Review of "genetic algorithm" module

2022-05-01 Thread Avijit Basak
e tag for those files.
Do you need any change here?

>(8)
>@SuppressWarnings("unchecked")
>
>By default, I'm a bit suspicious about having to resort to these
>annotations, especially for the kind of algorithms we are trying to
>implement.
>What do you think of the alternative approach outlined in the ZIP file
>attached in MATH-1618:
>https://issues.apache.org/jira/browse/MATH-1618
>?
-- This annotation is required because we have kept an option to use
different types of genotypes, including primitive ones.
Because of that, our base interfaces only declare the phenotype, not the
genotype. This introduced a kind of hierarchy in all the operator and
chromosome classes, which required us to use the mentioned annotation.
Even with the proposed new architecture we may not be able to avoid it.
-- It would be good if you could share some more information about the newly
proposed architecture: the areas of the current design that it can improve,
as well as the underlying intention.

>
>(9)
>Naming of factory methods should be harmonized to match the convention
>adopted in components like [RNG] and [Numbers].
>E.g. instead of "newChromosome(...)", please use "of(...)" or "from(...)"
>for "value object", and "create(...)" otherwise.
>
-- I have renamed the same for Chromosome classes.
What about the nextGeneration() method of the ListPopulation class? Renaming
it to create() or from() won't convey its purpose.

>(10)
>o.a.c.m.ga.chromosome.AbstractListChromosome
>
>Constructor is called with an argument that is a previously instantiated
>"representation".  If the latter is mutable, the caller will be able to
>modify the underlying data structure of the newly created chromosome.
>[The doc assumes immutability of the representation but this cannot be
>enforced, and mixed ownership can entail subtle bugs.]
-- I think this applies to both representation as well as generic parameter
type T. But I don't see any other option but to rely on the user.
If you have any suggestions kindly share.

>
>(11)
>Do we agree that, in a GA, the most time-consuming task is the fitness
>computation?  Hence IMO, it should be the focus of the multithreading
>tools (i.e. "ExecutorService"), probably keeping the other parts (namely
>the genetic operators) within a simple sequential loop (as in class
>"GeneticAlgorithmFactory" in MATH-1618).
-- The current implementation uses separate threads for applying the
crossover and mutation operators to each pair of selected chromosomes.
I think this ensures better utilization of multi-core processors than
using multi-threading only for the fitness calculation.
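
For comparison, the alternative raised in point (11), multithreading only the
fitness computation, could look like the following sketch. It is not taken
from either implementation; "ParallelFitness" and its "evaluate" method are
hypothetical names, and only standard java.util.concurrent calls are used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

final class ParallelFitness {
    private ParallelFitness() {}

    /** Evaluates the fitness of every chromosome in parallel; results keep the population order. */
    static <G> double[] evaluate(List<G> population, ToDoubleFunction<G> fitness, int threads) {
        final ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            final List<Callable<Double>> tasks = new ArrayList<>(population.size());
            for (G chromosome : population) {
                tasks.add(() -> fitness.applyAsDouble(chromosome));
            }
            // invokeAll blocks until all tasks are done and returns the
            // futures in the same order as the tasks.
            final List<Future<Double>> futures = executor.invokeAll(tasks);
            final double[] result = new double[futures.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = futures.get(i).get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

The genetic operators would then stay in a plain sequential loop; in a real
application the executor would normally be created once and reused across
generations rather than per call.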

-- Some code has been checked in, but there is a conflict in the pull
request, so I shall create a new one and delete the old branch.


Thanks & Regards
--Avijit Basak


On Fri, 15 Apr 2022 at 03:03, Gilles Sadowski  wrote:

> Hello.
>
> > > > [...]
>
> (1)
> o.a.c.m.ga.GeneticAlgorithmTestPermutations
> (under "src/test")
>
> As per your comment in that class, it is a usage example.
> Given that its name does not end with "Test", it is not run by the
> test suite.  Please move it to the "examples" module.
>
> (2)
> I'm missing a high-level doc that would enable a newbie to figure
> out what to implement in order to get going.
> E.g. what is the interplay between
>  * genotype
>  * allele
>  * phenotype
>  * decoder
>  * fitness function
> ?
> Several classes do not provide explanations (or links) about the
> concept which they represent.  For example, there is no doc about
> what a "RandomKeyDecoder" is, and the reason for using it (or not).
>
> (3)
> o.a.c.m.ga.utils.ChromosomeRepresentationUtils
>
> It seems to be a "mixed-bag" kind of class (that is being frowned
> upon nowadays).
> Its comment refers to "random" but some methods are not using
> any randomization.  Most methods are only used in unit tests.
>
> (4)
> o.a.c.m.ga.RandomProviderManager
>
> As already discussed, this class should not be part of the public
> API, namely because the "getRandomProvider()" method returns
> an object that is not thread-safe.
> If used internally as "syntactic sugar", it should be located in a
> package named "internal"; however I'd tend to remove it
> altogether, and call "ThreadLocalRandomSource.current(...)"
> explicitly.
>
> (5)
> Why does a "Chromosome" need an "identifier"?
> Method "getId()" is only used in "PopulationStatisticalSummaryImpl"
> that is an internal class, where it seems that the chromosome itself
> (rather than its "id") could serve as the map's key.

Re: [Math] Review of "genetic algorithm" module

2022-04-10 Thread Avijit Basak
pplication layer).  [This would allow the removal of
>   "updateListenerRigistry" method (note: There is a typo in that name).]
--This is corrected.

>* Are annotations (@SafeVarargs, ...) necessary?  Please document.
-- This annotation is needed (to avoid the compiler warning) for any
parameterized vararg. It is also used in legacy classes like
o.a.c.m.l.a.i.FieldHermiteInterpolator and
o.a.c.m.l.o.n.RungeKuttaFieldStepInterpolator.
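
A small sketch of where the annotation comes into play. The types
"GenerationListener" and "ListenerRegistry" are hypothetical stand-ins (they
are not the ConvergenceListener API); the point is only the generic varargs
parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical listener type, parameterized by the population type. */
interface GenerationListener<P> {
    void onGeneration(P population);
}

final class ListenerRegistry<P> {
    private final List<GenerationListener<P>> listeners = new ArrayList<>();

    // A varargs parameter of a parameterized type makes javac warn about
    // possible heap pollution; @SafeVarargs suppresses that warning here and
    // at the call sites.
    @SafeVarargs
    ListenerRegistry(GenerationListener<P>... initial) {
        // The array is only read (never written to or exposed), which is
        // the situation the annotation is meant to assert.
        listeners.addAll(Arrays.asList(initial));
    }

    int size() {
        return listeners.size();
    }

    /** Notifies every registered listener, in registration order. */
    void fire(P population) {
        for (GenerationListener<P> listener : listeners) {
            listener.onGeneration(population);
        }
    }
}
```

Taking a List of listeners instead of a varargs array would avoid the need
for the annotation altogether.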

>In "AdaptiveGeneticAlgorithm":
>* There should be a single constructor (same remark as above).
-- Removed the constructor with default argument.

>* Why the use of reflection ("isAssignableFrom")?
-- Replaced it by instanceof.
-- Created a new PR https://github.com/apache/commons-math/pull/209.


Thanks & Regards
--Avijit Basak

On Sun, 3 Apr 2022 at 19:52, Gilles Sadowski  wrote:

> Hello.
>
> Le mar. 29 mars 2022 à 17:08, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> >  Please find my comments below.
> >
> > [...]
> >
> > --I have made the changes and created a new PR. Kindly review the same
> > and share your thoughts.
> > https://github.com/apache/commons-math/pull/208
>
> I've merged PR #208 into the feature branch (please open a
> new one for changes entailed by the comments below).
> I again had to delete the branch (and recreate it with the merged
> changes from PR #208).  [I must be missing something about the
> correct git workflow...]
>
> There seems to be something wrong in the "examples-ga-tsp"
> application (fitness does not change).
>
> At the end of the run, one should be able to quickly assess the
> goodness of the solution; the new code prints a line with many
> "Node [...]" elements while the "--legacy" switch prints the "best"
> fitness and a list of indices.  In either case, the solution should
> consist of the list of visited cities (one per line) and the total
> distance.
>
> I can't seem to find how the logger is configured.  Currently, all
> "INFO" messages are logged to the "standard error" console; one
> should be able to e.g. redirect output to a file, or set the log level.
>
> There is still a mix between library code and application code (but
> this is to be discussed in MATH-1643).
>
> From browsing the library code, I'm tempted to believe that the
> dependency towards a logging framework is not necessary (or
> underused).  I think that such a feature could be left to the application
> layer (per the "ConvergenceListener" registry).
> Likewise, the "PopulationStatisticsLogger" is not general enough to
> be worth being part of the library.
>
> A few (nit-pick) remarks about code style in general.
> Javadoc is incomplete: All methods must be documented.
> Please avoid redundant links like e.g.
> ---CUT---
> /**
>  * @param crossoverPolicy  The {@link CrossoverPolicy}
>  * @param mutationPolicy   The {@link MutationPolicy}
>  * @param selectionPolicy  The {@link SelectionPolicy}
>  * @param convergenceListeners An optional collection of
>  * {@link ConvergenceListener} with variable arity
>  */
> @SafeVarargs
> protected AbstractGeneticAlgorithm(final CrossoverPolicy crossoverPolicy,
>         final MutationPolicy mutationPolicy,
>         final SelectionPolicy selectionPolicy,
>         ConvergenceListener... convergenceListeners) {
>     this.crossoverPolicy = crossoverPolicy;
>     this.mutationPolicy = mutationPolicy;
>     this.selectionPolicy = selectionPolicy;
>     updateListenerRigistry(convergenceListeners);
> }
> ---CUT---
> Readers of the HTML-generated doc can already click on the various
> arguments within the signature; so there is no need to add visual noise
> in the source code just to be able to click from within the Javadoc part
> just above that signature.
> The Javadoc block above should be
> ---CUT---
> /**
>  * @param crossoverPolicy Crossover policy.
>  * @param mutationPolicy Mutation policy.
>  * @param selectionPolicy Selection policy.
>  * @param convergenceListeners Collection of user-defined listeners.
>  */
> ---CUT---
> [Note the absence of "The" and the presence of a final "period".]
>
> A blank line is welcome to separate ideas ("logical" blocks of code).
> However, there should not be an empty line after a closing brace if
> it is followed by another closing brace.
> Also, in all recent codes, there is no blank line between the instance
> fields; the (mandatory) Javadoc is enough to logically (and visually)
> separate the fields.
>
> In "AbstractGeneticAlgorithm":
> * There should be a single constructor (handling default values should
>be left to the application layer).  [This would allow the removal of
>"updateListenerRigistry" method (note: There is a typo in that name).]
> * Are annotations (@SafeVarargs, ...) necessary?  Please document.
>
> In "AdaptiveGeneticAlgorithm":
> * There should be a single constructor (same remark as above).
> * Why the use of reflection ("isAssignableFrom")?
>
> Regards,
> Gilles
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [Math] Review of "genetic algorithm" module

2022-03-29 Thread Avijit Basak
Hi All

 Please find my comments below.

[...]
>Just quickly commenting on this point.

>IIUC, your purpose is for users to be able to run (an example
>application of) the old implementation.
>
>This can be achieved by having all the "legacy" codes within
>module
>  commons-math-examples/examples-ga/examples-ga-math-functions
>(note: No "legacy" in the module's name), within a dedicated
>  o.a.c.m.examples.ga.mathfunctions.legacy
>package.
>
>This code is then called by the exact same code/application as
>for the new implementation (with the corresponding command
>line switch):
>  $ java -jar examples-ga-app.jar --legacy ... rest of the args ...
>
>Users can thus perform 2 runs; once with "--legacy" and one
>without it, and reach some conclusions.
>
>The duplicate codes only bring maintenance burden (to ensure
>that the "legacy" and non-"legacy" modules do indeed aim at
>solving the same problem).
>Whenever we then decide that the new code has been thoroughly
>tested, removal of the
>  o.a.c.m.examples.ga.mathfunctions.legacy
>package will be a minimal change (as compared to the removal
>of a module).

--I have made the changes and created a new PR. Kindly review the same and
share your thoughts.
https://github.com/apache/commons-math/pull/208


Thanks & Regards
--Avijit Basak



On Mon, 28 Mar 2022 at 18:36, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 28 mars 2022 à 10:15, Avijit Basak  a
> écrit :
> >
> > [...]
> >
> > >The various "Standalone" classes also look quite similar; consolidating
> > >the "examples-ga" module (including full Javadoc) is necessary.
> > -- Could you please elaborate it more. IMHO as StandAlone classes are
> > dedicated to the specific module only, it would remain separate. Since we
> > have used a single domain to show utility of the different
> > types(adaptive/simple) of GA some classes have become similar.
> >
> > >I still don't
> > >understand why there are "...-legacy" modules in module "examples-ga".
> > >If you want to offer the option of running the "old" implementation, you
> > >could add a "legacy" flag (as "@Option" in the "Standalone"
> application).
> > -- There was a discussion on this some time back. The sole purpose of
> > keeping the legacy example module is for comparison with the new
> > implementation. It will be easier for anyone to visualize the quality
> > improvement we achieved here. I don't want to mix(by legacy flag) this
> > anyway with the new implementation.
> >
>
> Just quickly commenting on this point.
>
> IIUC, your purpose is for users to be able to run (an example
> application of) the old implementation.
>
> This can be achieved by having all the "legacy" codes within
> module
>   commons-math-examples/examples-ga/examples-ga-math-functions
> (note: No "legacy" in the module's name), within a dedicated
>   o.a.c.m.examples.ga.mathfunctions.legacy
> package.
>
> This code is then called by the exact same code/application as
> for the new implementation (with the corresponding command
> line switch):
>   $ java -jar examples-ga-app.jar --legacy ... rest of the args ...
>
> Users can thus perform 2 runs; once with "--legacy" and one
> without it, and reach some conclusions.
>
> The duplicate codes only bring maintenance burden (to ensure
> that the "legacy" and non-"legacy" modules do indeed aim at
> solving the same problem).
> Whenever we then decide that the new code has been thoroughly
> tested, removal of the
>   o.a.c.m.examples.ga.mathfunctions.legacy
> package will be a minimal change (as compared to the removal
> of a module).
>
> Regards,
> Gilles
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

-- 
Avijit Basak


Re: [Math] Review of "genetic algorithm" module

2022-03-28 Thread Avijit Basak
Hi All

   Please find my comments.

[...]
>I don't think that it's the right way to go; instantiating an
"ExecutorService"
>belongs to the GA application, not the GA library (whose relevant classes
>need "only" be thread-safe).
>There is some misunderstanding to be clarified in a dedicated discussion
>(please file a new JIRA ticket).
-- I have created a subtask under the same Jira(MATH-1563). Please share
your thoughts.
https://issues.apache.org/jira/browse/MATH-1643
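
To make the quoted point concrete, here is a rough sketch (under my own assumptions, not the code in the PR) of the split between application and library: the application creates and shuts down the pool, while the library-side method only receives an "Executor" and never instantiates threads itself. All names are hypothetical, and "CompletableFuture" is just one JDK facility that fits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.ToDoubleFunction;

final class ExecutorOwnershipSketch {
    /** "Library" side: stateless, uses whatever executor the caller supplies. */
    static <T> double[] evaluate(List<T> candidates,
                                 ToDoubleFunction<T> fitness,
                                 Executor executor) {
        List<CompletableFuture<Double>> pending = new ArrayList<>();
        for (T c : candidates) {
            CompletableFuture<Double> f =
                CompletableFuture.supplyAsync(() -> fitness.applyAsDouble(c), executor);
            pending.add(f);
        }
        double[] result = new double[pending.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = pending.get(i).join(); // results kept in submission order
        }
        return result;
    }

    /** "Application" side: owns the pool's life cycle. */
    static double[] demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Integer> candidates = Arrays.asList(1, 2, 3);
            return evaluate(candidates, x -> x * x, pool);
        } finally {
            pool.shutdown();
        }
    }
}
```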

>Side note: Conflicts and duplicate commits have accumulated in the
>dedicated "feature__MATH-1563__genetic_algorithm" branch.
>I did not know how to proceed in order to avoid ending up with a messy
>history in "master"; so I created a new branch (with the same name) with
>all the new GA-related files added as a single commit.
>Currently, this branch (based on your PR #205) fails the default goal,
>because of a CheckStyle issue.  You should always check locally that
>running "mvn" without arguments does not generate any errors.

>I also noticed that classes in "examples-ga" use "forbidden" library
>classes: "GeneticIllegalArgumentException" is an "internal" class; we
>must not advertize such classes in the example applications.
-- I have replaced the "GeneticIllegalArgumentException" by
"IllegalArgumentException".

>In general, it seems that "examples-ga" contains several classes and
>methods that do not need to be "public".  This is especially true for
>classes like "MathFunction" and "Coordinate".  [Having those "private"
>helps users to tell what is part of the library's functionality from what
is
>just "dummy" placeholder code.]
-- I have replaced the MathFunction and CoordinateDecoder with lambda.
However, the Coordinate class is a domain object (phenotype). So this needs
to remain public. This can be used in more than one place for the entire
application.

>Finally (for now), I've just noticed that there exist several classes named
>"MathFunction", with same implementation!
>Code duplication must be avoided, especially where we purport to display
>best practices.
-- As mentioned above this has been removed.

>The various "Standalone" classes also look quite similar; consolidating the
>"examples-ga" module (including full Javadoc) is necessary.
-- Could you please elaborate it more. IMHO as StandAlone classes are
dedicated to the specific module only, it would remain separate. Since we
have used a single domain to show utility of the different
types(adaptive/simple) of GA some classes have become similar.

>I still don't
>understand why there are "...-legacy" modules in module "examples-ga".
>If you want to offer the option of running the "old" implementation, you
>could add a "legacy" flag (as "@Option" in the "Standalone" application).
-- There was a discussion on this some time back. The sole purpose of
keeping the legacy example module is for comparison with the new
implementation. It will be easier for anyone to visualize the quality
improvement we achieved here. I don't want to mix(by legacy flag) this
anyway with the new implementation.

>Please use the new branch for all these ("cleanup") changes, as the basis
>a PR (with a *single* commit).
-- I have taken the changes and will create a new PR soon with all my
changes.


Thanks & Regards
--Avijit Basak

On Sun, 13 Mar 2022 at 06:39, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 28 févr. 2022 à 07:11, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please see my comments below.
> >
> > > [...]
> > >I just had a very quick look.
> > >IIUC, you always provide "convenience" methods (e.g. the various
> > >signatures for the "evolve" functionality).
> > >Prior to merging into "master", we should simplify and limit the
> > >discussion to the core functionality, i.e. not try and make decisions
> > >for the user (like default values, ...).  Please keep the API as simple
> > >as possible
> > -- I have removed the mentioned evolve method.
> > However, I had to catch two checked exceptions (InterruptedException,
> > ExecutionException) and rethrow them. As of now I have handled them using
> > the GeneticIllegalArgumentException. I think we need to introduce another
> > exception class to handle this. Please share your thought regarding this.
>
> I don't think that it's the right way to go; instantiating an
> "ExecutorService"
> belongs to the GA application, not the GA library (whose relevant classes
> need [...]

Re: [Math] Review of "genetic algorithm" module

2022-02-27 Thread Avijit Basak
Hi All

Please see my comments below.

> [...]
>I just had a very quick look.
>IIUC, you always provide "convenience" methods (e.g. the various
>signatures for the "evolve" functionality).
>Prior to merging into "master", we should simplify and limit the
>discussion to the core functionality, i.e. not try and make decisions
>for the user (like default values, ...).  Please keep the API as simple
>as possible
-- I have removed the mentioned evolve method.
However, I had to catch two checked exceptions (InterruptedException,
ExecutionException) and rethrow them. As of now I have handled them using
the GeneticIllegalArgumentException. I think we need to introduce another
exception class to handle this. Please share your thought regarding this.
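
One possible way to handle the two checked exceptions, shown only as a sketch: restore the interrupt flag, unwrap the cause of the failed task, and rethrow as a standard unchecked JDK exception. The choice of "IllegalStateException" is my assumption for the example, not a decision taken in this thread, and "FutureResults"/"await" are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class FutureResults {
    /** Waits for a task result, converting checked exceptions to unchecked ones. */
    static <T> T await(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            // Preserve the interruption status for the caller.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a task", e);
        } catch (ExecutionException e) {
            // Report the original failure rather than the wrapper.
            throw new IllegalStateException("Task failed", e.getCause());
        }
    }
}
```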

Thanks & Regards
--Avijit Basak



On Mon, 21 Feb 2022 at 20:11, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 21 févr. 2022 à 06:56, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please find my comments below:
> >
> > [...]
> > >
> > >Another misunderstanding (probably); we must figure out where
> > >the parallelism will be implemented.
> > >IIUC the current state of the code, optimizing multiple populations
> > >in parallel would be the same as launching multiple JVMs; I'd want
> > >to explore low-level parallelism (i.e. at the "Chromosome" level).
> > -- I have implemented both multi-threading and multi-population
> parallelism.
>
> I just had a very quick look.
> IIUC, you always provide "convenience" methods (e.g. the various
> signatures for the "evolve" functionality).
> Prior to merging into "master", we should simplify and limit the
> discussion to the core functionality, i.e. not try and make decisions
> for the user (like default values, ...).  Please keep the API as simple
> as possible.
>
> Thanks,
> Gilles
>
> >>>> [...]
>

-- 
Avijit Basak


Re: [Math] Review of "genetic algorithm" module

2022-02-20 Thread Avijit Basak
Hi All

Please find my comments below:

>The build fails because of CheckStyle errors:
>https://app.travis-ci.com/github/apache/commons-math/builds/246683712
--Fixed the issues

>>>> [...]
>> >> >
>> >> >I did not suggest to remove any Javadoc, only to rephrase it as:
[...]
>> >As hinted by my comment in the previous message, I've still to
>> >clarify my own expectations; but I vaguely sense some lost
>> >opportunity for simpler usage and increased performance,
>> >where the caller just needs to specify the number of "worker
>> >threads".
>> -- We should have both options. Users can execute the algorithm by
>> specifying only the number of worker threads with a single population as
>> well as optimize multiple populations in a parallel fashion.
>
>Another misunderstanding (probably); we must figure out where
>the parallelism will be implemented.
>IIUC the current state of the code, optimizing multiple populations
>in parallel would be the same as launching multiple JVMs; I'd want
>to explore low-level parallelism (i.e. at the "Chromosome" level).
-- I have implemented both multi-threading and multi-population parallelism.

>
>> For the multi-population option the common thread pool would be reused
for
>> all populations.
>> >
>> >Do we at least agree that
>> >1. Adding/retrieving a "Chromosome" to/from a "Population"
>> >must be thread-safe (and is not trivial)
>> >2. Fitness computation is where most time is usually spent
>> >(so that multi-threading must be achieved at that granularity)
>> >?
>> --The way I am thinking of designing a task is that it should accept the
>> current population and return an instance of ChromosomePair.
>> The chromosomes within this pair would be added to the population by the
>> caller thread. Population won't be updated by multiple threads.
>
>Then, it would not be a multi-threaded library.
>This kind of parallelism does not need support from the library,
>and can be implemented at the application level.
>
>> The code snippet below shows the body of the method which will be
executed
>> inside the task.
>> --CUT--
>>
>> //selection
>> ChromosomePair pair = getSelectionPolicy().select(current);
>>
>> // crossover
>> if (randGen.nextDouble() < getCrossoverRate()) {
>> // apply crossover policy to create two offspring
>> pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond());
>> }
>>
>> // mutation
>> if (randGen.nextDouble() < getMutationRate()) {
>> // apply mutation policy to the chromosomes
>> pair = new ChromosomePair(
>> getMutationPolicy().mutate(pair.getFirst()),
>> getMutationPolicy().mutate(pair.getSecond()));
>> }
>>
>> return pair;
>>
>> --CUT--
>
>One of the issues with the above code is, again, that "randGen"
>must be thread-safe (and it is not, usually).
>Also, it doesn't say anything about how to ensure that
>the fitness computation is thread-safe (and if you assume
>that it will be computed outside that "task", then the
>performance gain will be very low).
>
-- Current implementation is using a Thread local version of random number
generator.
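
As a sketch of what a thread-local generator can look like (an assumption for illustration; the module's actual code may differ and would presumably wrap a Commons RNG generator instead), each worker thread lazily gets its own JDK "SplittableRandom", so no generator instance is shared between threads. The class and method names are hypothetical.

```java
import java.util.SplittableRandom;

final class PerThreadRandomSketch {
    /** One independent generator per thread, created on first use. */
    private static final ThreadLocal<SplittableRandom> RNG =
        ThreadLocal.withInitial(SplittableRandom::new);

    /** Returns true with the given probability, e.g. a crossover or mutation rate. */
    static boolean occurs(double rate) {
        // nextDouble() is uniform in [0, 1).
        return RNG.get().nextDouble() < rate;
    }
}
```

Note that this only addresses the generator; it says nothing about the thread-safety of the fitness computation raised in the quoted text.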

>> >
>> >I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
>> >is an application concern; unless I'm missing something, it's
>> >not what I've in mind when talking about multi-threading.
>> >Actually, I was wondering whether we could implement the
>> >analog of what is in the "commons-math-neuralnet" module,
>> >where
>> >* "Neuron" is the counterpart "Chromosome"
>> >* "Network" is the counterpart of "Population".
>> --"multiple instance of AbstractGeneticAlgorithm" is related to parallel
GA
>> with multiple populations not multi-threading.
>
>Yes, as I also mentioned above.
>But I'm interested in where multi-threading can be implemented to
>be used in both cases (single population and multiple populations).
>
>> Users can also implement parallel GA in a synchronous manner although
that
>> won't be a recommended way.
>> Multi-threading is only a way to improve performance using a user's multi
>> core CPU.
>> The threads in the thread pool would only be used to execute the task as
>> mentioned in the previous comment.
>
>That's where I've some doubt.
>But be free to implement benchmarks that demonstrate the
>expected performance improvement.
>
>> I think we hav

Re: [Math] Review of "genetic algorithm" module

2022-02-18 Thread Avijit Basak
es the population using multiple threads.
>> >> --This needs to be done. However,  I would like to address this along
>> with
>> >> parallel GA i.e. convergence of multiple populations together.
>> >
>> >The two features (multi-thread vs multiple populations) should
>> >be implemented independently:  Users that only need the "basic"
>> >GA should also be able to take advantage of their machine's
>> >multiple CPUs.
>> >[This is related to the design issue which I mentioned previously.]
>> >
>> -- I am thinking to leverage user's multiple CPUs for doing
>> multi-population GA.
>
>OK (sort-of, since "the devil is in the details", and I'm not sure
>that we mean the same thing by "multi", see below).
>
>> It would be a global approach where the same thread pool
>> would be used for both purposes. Another class would be introduced for
>> executing parallel genetic algorithm which would accept multiple instances
>> of AbstractGeneticAlgorithm class and converge them in parallel. Users who
>> do not care for robustness would go for current implementations of the
>> algorithm with single population. For a better optimization quality users
>> would choose the new class.
>
>As hinted by my comment in the previous message, I've still to
>clarify my own expectations; but I vaguely sense some lost
>opportunity for simpler usage and increased performance,
>where the caller just needs to specify the number of "worker
>threads".
-- We should have both options. Users can execute the algorithm by
specifying only the number of worker threads with a single population as
well as optimize multiple populations in a parallel fashion.
For the multi-population option the common thread pool would be reused for
all populations.
>
>Do we at least agree that
>1. Adding/retrieving a "Chromosome" to/from a "Population"
>must be thread-safe (and is not trivial)
>2. Fitness computation is where most time is usually spent
>(so that multi-threading must be achieved at that granularity)
>?
--The way I am thinking of designing a task is that it should accept the
current population and return an instance of ChromosomePair.
The chromosomes within this pair would be added to the population by the
caller thread. Population won't be updated by multiple threads.
The code snippet below shows the body of the method which will be executed
inside the task.
--CUT--

// selection
ChromosomePair pair = getSelectionPolicy().select(current);

// crossover
if (randGen.nextDouble() < getCrossoverRate()) {
    // apply crossover policy to create two offspring
    pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond());
}

// mutation
if (randGen.nextDouble() < getMutationRate()) {
    // apply mutation policy to the chromosomes
    pair = new ChromosomePair(
        getMutationPolicy().mutate(pair.getFirst()),
        getMutationPolicy().mutate(pair.getSecond()));
}

return pair;

--CUT--
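
To illustrate the intent that only the caller thread updates the population, here is a generic sketch of how such tasks could be scheduled. It is not the actual implementation: "P" stands in for the population type, "C" for the chromosome type, the task returns a list instead of a "ChromosomePair", and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Function;

final class OffspringTasksSketch {
    /**
     * Runs "count" breeding tasks on the executor. Worker threads only read
     * "current"; the new population is filled by the calling thread alone.
     */
    static <P, C> List<C> breed(P current,
                                Function<P, List<C>> task,
                                int count,
                                Executor executor) {
        List<CompletableFuture<List<C>>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            CompletableFuture<List<C>> f =
                CompletableFuture.supplyAsync(() -> task.apply(current), executor);
            pending.add(f);
        }
        List<C> next = new ArrayList<>();
        for (CompletableFuture<List<C>> f : pending) {
            next.addAll(f.join()); // single writer: no locking needed on "next"
        }
        return next;
    }
}
```

The selection policy still reads the current population from several threads, so it must tolerate concurrent reads, and any fitness computed inside the task must itself be thread-safe.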

>
>I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
>is an application concern; unless I'm missing something, it's
>not what I've in mind when talking about multi-threading.
>Actually, I was wondering whether we could implement the
>analog of what is in the "commons-math-neuralnet" module,
>where
>* "Neuron" is the counterpart "Chromosome"
>* "Network" is the counterpart of "Population".
--"multiple instance of AbstractGeneticAlgorithm" is related to parallel GA
with multiple populations not multi-threading.
Users can also implement parallel GA in a synchronous manner although that
won't be a recommended way.
Multi-threading is only a way to improve performance using a user's multi
core CPU.
The threads in the thread pool would only be used to execute the task as
mentioned in the previous comment.

I think we have some misunderstanding over here. It is better to do an
implementation first and start the discussion.
It would be more productive.


Thanks & Regards
--Avijit Basak

On Thu, 17 Feb 2022 at 01:09, Gilles Sadowski  wrote:

> Hello.
>
> Le mer. 16 févr. 2022 à 17:31, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please find my comments.
> >
> > >> (2)
> > >> >The "GeneticException" class seems to mostly deal with "illegal"
> > >> >arguments; hence it should be a subclass of the JDK's standard
> > >> >"IllegalArgumentException" (and be renamed accordingly).
> > >> >If other condition types are needed, then another internal class
> > >> >should be defined with the corresponding standard semantics.
> > >> --IMHO if we think of a single excep

Re: [Math] Review of "genetic algorithm" module

2022-02-16 Thread Avijit Basak
o be done. However,  I would like to address this along
with
>> parallel GA i.e. convergence of multiple populations together.
>
>The two features (multi-thread vs multiple populations) should
>be implemented independently:  Users that only need the "basic"
>GA should also be able to take advantage of their machine's
>multiple CPUs.
>[This is related to the design issue which I mentioned previously.]
>
-- I am thinking to leverage user's multiple CPUs for doing
multi-population GA. It would be a global approach where the same thread pool
would be used for both purposes. Another class would be introduced for
executing parallel genetic algorithm which would accept multiple instances
of AbstractGeneticAlgorithm class and converge them in parallel. Users who
do not care for robustness would go for current implementations of the
algorithm with single population. For a better optimization quality users
would choose the new class.

[...]

Thanks & Regards
--Avijit Basak

On Mon, 14 Feb 2022 at 15:37, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 14 févr. 2022 à 08:03, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Thanks for the review comments. Please find my comments below.
> >
> > (1)
> > [...]
> >
> > (2)
> > >The "GeneticException" class seems to mostly deal with "illegal"
> > >arguments; hence it should be a subclass of the JDK's standard
> > >"IllegalArgumentException" (and be renamed accordingly).
> > >If other condition types are needed, then another internal class
> > >should be defined with the corresponding standard semantics.
> > --IMHO if we think of a single exception class we should extend it only
> > from RuntimeException.
>
> "single exception class" is not a requirement (it cannot be since
> we agreed some time ago that it was better to align with the JDK's
> delineation of error conditions (IAE, NPE, ILSE, AE, ...)).
>
> > If we think of multiple exception classes in one
> > module we may need to think of a base exception class. Other classes
> would
> > extend the same.
>
> Please no.  We'd taken that approach in "Commons Math" (cf.
> base class now in module "commons-math-legacy-exception"),
> as I've mentioned already IIRC: It was a failed experiment IMO.
> [For more details, please refer to the archive of the "dev" ML.]
>
> > The approach mentioned above would mix up these two.
> > Please share your opinion regarding this.
>
> Eventually, all new components ([RNG], [Number], [Geometry], ...)
> adopted the simple approach of non-public API (ideally private
> or package-private) exception classes only for the developer's
> use (and the purpose of which is limited to avoiding duplication).
>
> >
> > >[Exception messages need review for spelling and formatting.]
> > -- It will be really helpful if you can point out some specific examples.
>
> We can fix this when the PR has reached some stability.
>
> >
> > (3)
> > >IMO Javadoc should avoid redundant phrases like "This class" as
> > >the first words of a class description.
> > --Refactored the javadoc comments. Please review and mention if you need
> > any further changes.
>
> I've not looked yet, but thanks for taking it into account.
> Similarly to the previous point, these clean-ups can happen later.
>
> > >A similar remark holds for fields in "GeneticException" class: Since
> > >the name of the field is self-documenting, duplication in the Javadoc
> > >is visual noise ("Message template" is concise and clear enough).
> > --Removal of the javadoc comments produces a checkstyle error.
>
> I did not suggest to remove any Javadoc, only to rephrase it as:
> ---CUT---
> /** "Message template". */
> ---CUT---
>
> > [...]
> >
> > (4)
> > >Class "ConvergenceListenerRegistry" is generic but its code
> > >contains undocumented "@SuppressWarnings" annotations.
> > >Moreover, it is a singleton, and not thread-safe.
> > >Why should there be such a global "registry"?
> > >Since it is only accessed by the "AbstractGeneticAlgorithm" class,
> > >it could be defined as a private inner class.
> > --Made it a private inner class.
>
> Thanks.
> [We should nevertheless address the other issues mentioned in
> the above paragraph.]
>
> >
> > (5)
> > >In class "AbstractGeneticAlgorithm", methods "getCrossoverPolicy"
> > >"getMutationPolicy", "getElitismRate" are public, yet they are

Re: [Math] Review of "genetic algorithm" module

2022-02-13 Thread Avijit Basak
Hi All

Thanks for the review comments. Please find my comments below.

(1)
>A commit log message should strive to be informative
>for the reviewer; saying the like of "fixed minor bugs" does
>not convey anything.
>Even minor changes, like e.g. formatting cleanup, should be
>designated as such.
>For this PR, the message (which I've amended) was misleading
>because the change was not about bugs, but about removing
>GUI code (and its dependency).
--I have maintained a detailed commit message this time.

(2)
>The "GeneticException" class seems to mostly deal with "illegal"
>arguments; hence it should be a subclass of the JDK's standard
>"IllegalArgumentException" (and be renamed accordingly).
>If other condition types are needed, then another internal class
>should be defined with the corresponding standard semantics.
--IMHO if we think of a single exception class we should extend it only
from RuntimeException. If we think of multiple exception classes in one
module we may need to think of a base exception class. Other classes would
extend the same. The approach mentioned above would mix up these two.
Please share your opinion regarding this.
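
For comparison, the alternative suggested in the quoted review (a module-internal class with the semantics of the JDK's "IllegalArgumentException") might look roughly like the sketch below. The class name, the template text and the use of "MessageFormat" are assumptions made for the example; this is not the class from the PR.

```java
import java.text.MessageFormat;

/** Hypothetical, package-private: for the developer's use, not public API. */
final class GeneticIllegalArgumentSketch extends IllegalArgumentException {
    private static final long serialVersionUID = 1L;

    /** Message template. */
    static final String OUT_OF_RANGE = "Value {0} is out of range [{1}, {2}]";

    /** Builds the message by filling the template with the given arguments. */
    GeneticIllegalArgumentSketch(String template, Object... args) {
        super(MessageFormat.format(template, args));
    }
}
```

Callers outside the module would then only ever see (and catch) a plain "IllegalArgumentException".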

>[Exception messages need review for spelling and formatting.]
-- It will be really helpful if you can point out some specific examples.

(3)
>IMO Javadoc should avoid redundant phrases like "This class" as
>the first words of a class description.
--Refactored the javadoc comments. Please review and mention if you need
any further changes.
>A similar remark holds for fields in "GeneticException" class: Since
>the name of the field is self-documenting, duplication in the Javadoc
>is visual noise ("Message template" is concise and clear enough).
--Removal of the javadoc comments produces a checkstyle error.
>Similarly, simple accessors don't need the exact same sentence
>repeated twice (a single "@return ..." tag is sufficient).
--Modified.

(4)
>Class "ConvergenceListenerRegistry" is generic but its code
>contains undocumented "@SuppressWarnings" annotations.
>Moreover, it is a singleton, and not thread-safe.
>Why should there be such a global "registry"?
>Since it is only accessed by the "AbstractGeneticAlgorithm" class,
>it could be defined as a private inner class.
--Made it a private inner class.
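
The other points raised in the quoted paragraph (global singleton, thread-safety) could be addressed along the lines of the following sketch, where each algorithm instance keeps its own listener list backed by a "CopyOnWriteArrayList". This is only an assumption about one possible design; "IntConsumer" stands in for the real listener type and all names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.IntConsumer;

final class AlgorithmWithListenersSketch {
    /** Per-instance listeners: no global registry, safe for concurrent add/iterate. */
    private final List<IntConsumer> listeners = new CopyOnWriteArrayList<>();

    void addListener(IntConsumer listener) {
        listeners.add(listener);
    }

    /** Notifies every registered listener of the generation just completed. */
    void notifyGeneration(int generation) {
        for (IntConsumer listener : listeners) {
            listener.accept(generation);
        }
    }
}
```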

(5)
>In class "AbstractGeneticAlgorithm", methods "getCrossoverPolicy"
>"getMutationPolicy", "getElitismRate" are public, yet they are only
>ever called by a subclass.
--Modified the public to protected.

(6)
>Why support inheritance for "AbstractGeneticAlgorithm"?
>Why would users need their own subclass, rather than call those
>implemented within the library (currently, "GeneticAlgorithm" and
>"AdaptiveGeneticAlgorithm")?
>Couldn't we encapsulate the choice of algorithm in an "enum",
>similar to "RandomSource" in [RNG].
>Do I understand correctly that the (only?) difference between the
>two classes is the ability to adapt crossover and mutation rates?
-- The difference between GeneticAlgorithm and AdaptiveGeneticAlgorithm is
the ability to adapt crossover and mutation probability. However,  as per
my understanding enum encapsulation is appropriate with the same set and
type of constructor arguments, where the arguments can be provided during
enum declaration. In our case the arguments would be provided by the client
program and cannot be pre-initialized as part of an enum declaration.
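
For reference, an "enum" does not have to receive its arguments at declaration time: each constant can act as a factory whose arguments are supplied by the caller, which is the style of the factory methods on "RandomSource". The sketch below is purely illustrative; the enum name, the constants and the decay rule are invented for the example and do not reflect how "AdaptiveGeneticAlgorithm" actually adapts its rates.

```java
import java.util.function.IntToDoubleFunction;

/** Hypothetical: each constant builds its object from caller-supplied arguments. */
enum GaVariantSketch {
    /** Same mutation rate for every generation. */
    FIXED {
        @Override
        IntToDoubleFunction mutationRate(double min, double max) {
            return generation -> max;
        }
    },
    /** Rate decaying from max toward min (illustrative rule only). */
    DECAYING {
        @Override
        IntToDoubleFunction mutationRate(double min, double max) {
            return generation -> min + (max - min) / (1 + generation);
        }
    };

    /** Factory method: arguments come from the client program at call time. */
    abstract IntToDoubleFunction mutationRate(double min, double max);
}
```

Whether this fits here depends on how different the constructor arguments of the two algorithms really are.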

(7)
>The currently available GA implementations are sequential.
>IIUC, the "nextGeneration" methods should provide an option
>that processes the population using multiple threads.
--This needs to be done. However,  I would like to address this along with
parallel GA i.e. convergence of multiple populations together.

(8)
>Do not use explicit "\n" and "\r" characters.[1]
--Done
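
Assuming the remark is about platform-dependent line endings, a small sketch of the usual replacement is given below: build the text with "System.lineSeparator()" (or "%n" in a format string) instead of hard-coded "\n"/"\r". The class and method names are hypothetical.

```java
final class LineSeparatorSketch {
    /** Two-line report using the platform's line separator. */
    static String report(int generation, double best) {
        return "generation=" + generation
            + System.lineSeparator()
            + "best=" + best;
    }

    public static void main(String[] args) {
        System.out.println(report(3, 0.5));
    }
}
```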


Thanks & Regards
--Avijit Basak


On Mon, 7 Feb 2022 at 07:57, Gilles Sadowski  wrote:

> Hello.
>
> A few remarks (as of PR #205) and questions:
>
> (1)
> A commit log message should strive to be informative
> for the reviewer; saying the like of "fixed minor bugs" does
> not convey anything.
> Even minor changes, like e.g. formatting cleanup, should be
> designated as such.
> For this PR, the message (which I've amended) was misleading
> because the change was not about bugs, but about removing
> GUI code (and its dependency).
>
> (2)
> The "GeneticException" class seems to mostly deal with "illegal"
> arguments; hence it should be a subclass of the JDK's standard
> "IllegalArgumentException" (and be renamed accordingly).
> If other condition t

Re: [MATH][GA] Build Failure for PR #204

2022-02-05 Thread Avijit Basak
Hi

 Please see my comments below.

[...]

>Please note that I don't suggest that you remove the tracking of
>the optimization process (it is useful to have a trace in order to
>check that evolution proceeds as expected), instead of displaying
>a GUI, you can save snapshots (either in text form or, if the
>check is more easily done graphically, by using the "[Imaging]"
>component[3]).
-- As per the suggestion I have removed the GUI display of the convergence
process. Instead the default log based tracker has been kept for
convergence traceability.
I have created a new PR#205 after rebase and closed the old one(PR#204).

Thanks & Regards
--Avijit Basak

On Wed, 2 Feb 2022 at 19:58, Gilles Sadowski  wrote:

> Hi.
>
> Le mer. 2 févr. 2022 à 09:29, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please see my comments below.
> >
> > [...]
> >
> >
> > And there was this old issue that the "<artifactId>" should contain
> > the name of the top-level package, i.e. "math4", not "math".
> > -- There was a review comment for PR#197 to remove 4 from artifactid.
> > "aherbert <https://github.com/aherbert> on Sep 25, 2021
> > <https://github.com/apache/commons-math/pull/197#discussion_r716075956>
> >
> > Remove the 4 from math4. The version is specified separately from the
> > artifact ID."
>
> Indeed, it seems that there are discrepant expectations or a
> misunderstanding about how to compose the "<artifactId>".
> In "Commons Math", it contains "math4" as (IIUC) a unique
> identifier of the top-level package (that is updated with every
> major version).  Because of that latter convention, it is true that
> the "4" is redundant with the (major) version number.
> However, it could also be construed that the redundancy may
> be useful for stressing that artefacts with different major versions
> can be used together (without "JAR hell").
> That view of having the "packageId" as part of the artifact's name
> is used in some other components (e.g. "[Lang]"[1]) but not all
> (e.g. "[IO]"[2])...
>
> >
> > I've updated the feature branch with those changes. Please rebase.
> >
> > I've not yet looked at the code, but a question arose from looking at
> > the dependencies: What is "jfreechart" used for in the "examples"?
> > -- jfreechart is used to do a graphical plot of the optimization process.
> >
> > I've just updated the "k-means" example, removing the GUI along
> > the way.  In general, I think that the example applications should
> > follow the KISS principle (which here translates to:  Only write to the
> > console or to files).  Since we don't intend to write full-fledged
> > applications, building/testing should be as smooth as possible: GUIs
> > entail unnecessary hassle for someone working from a remote
> > (text) terminal.
> > -- I shall remove that and the corresponding part of the code.
>
> Thanks.
> Please note that I don't suggest that you remove the tracking of
> the optimization process (it is useful to have a trace in order to
> check that evolution proceeds as expected), instead of displaying
> a GUI, you can save snapshots (either in text form or, if the
> check is more easily done graphically, by using the "[Imaging]"
> component[3]).
>
> Regards,
> Gilles
>
> > [...]
> >
>
> [1]
> https://gitbox.apache.org/repos/asf?p=commons-lang.git;a=blob;f=pom.xml;h=4f12fdf537fd56a69d1b94567e22de99761ec775;hb=HEAD#l28
> [2]
> https://gitbox.apache.org/repos/asf?p=commons-io.git;a=blob;f=pom.xml;h=8f61ca0177a056a80dda656dbb70a9774adac548;hb=HEAD#l26
> [3] See e.g. the "kmeans/image" module.
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > On Tue, 1 Feb 2022 at 05:24, Gilles Sadowski 
> wrote:
> >
> > > Hello.
> > >
> > > Le lun. 31 janv. 2022 à 06:27, Avijit Basak  a
> > > écrit :
> > > >
> > > > Hi All
> > > >
> > > > Please find my comments below.
> > > >
> > > > >There is no attachment (I think that the ML manager strips those).
> > > > >Please copy/paste the relevant part of the console log (or provide
> > > > >a link to it).
> > > > --The build was done locally with a fresh clone of the feature
> branch.
> > >
> > > Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
> > > Commons Numbers, instead of version 1.1-SNAPSHOT.
> > > This cre

Re: [MATH][GA] Build Failure for PR #204

2022-02-02 Thread Avijit Basak
Hi All

Please see my comments below.

>Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
>Commons Numbers, instead of version 1.1-SNAPSHOT.
>This creates many "NoClassDefFound" errors that were fixed with
>commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in branch
>"feature__MATH-1563__genetic_algorithm" branch 6 days ago.
-- I could not work further on PR#204. As there was an issue with the
local build, I did not try to merge any further changes.

Anyways, after fetching your PR and rebasing on that branch, the
build is successful.
--Thanks for the confirmation.

Nevertheless, I had to fix/consolidate many POM files that contained
a slew of duplicate declarations (the "dependency management" is
done at the highest possible level, to ensure version consistency).
Also, please use the same formatting rules as in existing files (in
POM files, the indentation is 2 spaces).
-- I have missed these two points.

And there was this old issue that the "" should contain
the name of the top-level package, i.e. "math4", not "math".
-- There was a review comment for PR#197 to remove 4 from artifactid.
"aherbert <https://github.com/aherbert> on Sep 25, 2021
<https://github.com/apache/commons-math/pull/197#discussion_r716075956>

Remove the 4 from math4. The version is specified separately from the
artifact ID."

>I've updated the feature branch with those changes. Please rebase.

>I've not yet looked at the code, but a question arose from looking at
>the dependencies: What is "jfreechart" used for in the "examples"?
-- jfreechart is used to do a graphical plot of the optimization process.

>I've just updated the "k-means" example, removing the GUI along
>the way.  In general, I think that the example applications should
>follow the KISS principle (which here translates to:  Only write to the
>console or to files).  Since we don't intend to write full-fledged
>applications, building/testing should be as smooth as possible: GUIs
>entail unnecessary hassle for someone working from a remote
>(text) terminal.
-- I shall remove that and the corresponding part of the code.

> Please find the log below. Kindly let me know once the build is
successful.
> The command was "*mvn clean verify apache-rat:check checkstyle:check
> pmd:check spotbugs:check javadoc:javadoc*".

>As Alex noted, you should ensure that the build is successful with
>the supported version of the JDK (i.e. Java 8 currently).
>[If you encounter problems with a later version, it's always nice to
>file a JIRA report, but fixing such issues is probably low priority.]
--I have updated my JDK to version 8. The build is successful now. Thanks.


Thanks & Regards
--Avijit Basak

On Tue, 1 Feb 2022 at 05:24, Gilles Sadowski  wrote:

> Hello.
>
> On Mon, 31 Jan 2022 at 06:27, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please find my comments below.
> >
> > >There is no attachment (I think that the ML manager strips those).
> > >Please copy/paste the relevant part of the console log (or provide
> > >a link to it).
> > --The build was done locally with a fresh clone of the feature branch.
>
> Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
> Commons Numbers, instead of version 1.1-SNAPSHOT.
> This creates many "NoClassDefFound" errors that were fixed with
> commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in branch
> "feature__MATH-1563__genetic_algorithm" branch 6 days ago.
>
> Anyways, after fetching your PR and rebasing on that branch, the
> build is successful.
>
> Nevertheless, I had to fix/consolidate many POM files that contained
> a slew of duplicate declarations (the "dependency management" is
> done at the highest possible level, to ensure version consistency).
> Also, please use the same formatting rules as in existing files (in
> POM files, the indentation is 2 spaces).
>
> And there was this old issue that the "" should contain
> the name of the top-level package, i.e. "math4", not "math".
>
> I've updated the feature branch with those changes. Please rebase.
>
> I've not yet looked at the code, but a question arose from looking at
> the dependencies: What is "jfreechart" used for in the "examples"?
> I've just updated the "k-means" example, removing the GUI along
> the way.  In general, I think that the example applications should
> follow the KISS principle (which here translates to:  Only write to the
> console or to files).  Since we don't intend to write full-fledged
> applications, building/testing should be as smooth as possible: GUIs
> entail unnecessary hassle for someon

Re: [MATH][GA] Build Failure for PR #204

2022-01-30 Thread Avijit Basak
math4\legacy\ode\nonstiff\GraggBulirschStoerIntegrator.java:65:
error: attribute not supported in HTML5: cellpadding
[ERROR]  * 
[ERROR]  ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerIntegrator.java:65:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR]
^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\sampling\FieldStepNormalizer.java:45:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR] ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\sampling\StepNormalizer.java:43:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR] ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\stat\ranking\NaturalRanking.java:44:
error: attribute not supported in HTML5: cellpadding
[ERROR]  * 
[ERROR]  ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\stat\ranking\NaturalRanking.java:44:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR]  ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\package-info.java:130:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR] ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\package-info.java:141:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR] ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45:
error: attribute border for table only accepts "" or "1", use CSS instead:
BORDER
[ERROR]  * 
[ERROR]   ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45:
error: attribute not supported in HTML5: width
[ERROR]  * 
[ERROR]  ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45:
error: attribute not supported in HTML5: cellpadding
[ERROR]  * 
[ERROR]  ^
[ERROR]
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45:
error: attribute not supported in HTML5: summary
[ERROR]  * 
[ERROR]
 ^
[ERROR]
[ERROR] Command line was: cmd.exe /X /C ""C:\Program
Files\jdk-11.0.12\bin\javadoc.exe" @options @packages"
[ERROR]
[ERROR] Refer to the generated Javadoc files in
'C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\target\site\apidocs'
dir.
[ERROR]
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the
command
[ERROR]   mvn  -rf :commons-math4-legacy
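
A minimal sketch of a Javadoc table that avoids the attributes rejected in the
log above, assuming the usual HTML5 replacement of "summary" by a <caption>
element and keeping only border="1" (which the log itself reports as accepted).
The class name, method and table contents are hypothetical and are not taken
from the Commons Math sources.

```java
/**
 * Hypothetical example class (not part of Commons Math).
 *
 * <table border="1">
 * <caption>Illustrative step sequence</caption>
 * <tr><th>index</th><th>step count</th></tr>
 * <tr><td>0</td><td>2</td></tr>
 * <tr><td>1</td><td>6</td></tr>
 * </table>
 */
final class Html5TableDoc {
    private Html5TableDoc() {}

    /**
     * Computes the value shown in the second column of the table above.
     *
     * @param index row index (0-based)
     * @return the step count for that row
     */
    public static int stepCount(int index) {
        return 4 * index + 2;
    }
}
```

Spacing that was previously set through "cellpadding" or "width" would have to
move to CSS, as the error message suggests.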


>P.S. I'll stop trying to "rebase" the feature branch on the current
>state of "master" because I stumble on the same conflict every
>time (on the "pom.xml" file)...
--This will be helpful to accelerate the further development of the GA
module. The rebase and merging process is consuming lots of additional time.


Thanks & Regards
--Avijit Basak

On Sun, 30 Jan 2022 at 21:27, Gilles Sadowski  wrote:

> Hello.
>
> On Sun, 30 Jan 2022 at 13:43, Avijit Basak wrote:
> >
> > Hi All
> >
> >  I have taken a fresh clone of the feature branch
> "feature__MATH-1563__genetic_algorithm" from apache's repository and
> executed the build. The build failed without my changes added

Re: [MATH][GA] Build Failure for PR #204

2022-01-30 Thread Avijit Basak
Hi All

 I have taken a fresh clone of the feature branch
"feature__MATH-1563__genetic_algorithm" from apache's repository and
executed the build. The build failed without my changes added to it. The
summary report is attached herewith. Kindly look into it and do the needful.

Thanks & Regards
--Avijit Basak

On Tue, 25 Jan 2022 at 19:02, Gilles Sadowski  wrote:

> Hello.
>
> I just did "git push" (no "force" this time) on the feature branch.
> The problem arose from changes applied a few hours ago in
> "master" and not merged into the other branch yet. Sorry; please
> rebase on the latest update.
>
> Regards,
> Gilles
>
> On Tue, 25 Jan 2022 at 12:31, Alex Herbert wrote:
> >
> > On Tue, 25 Jan 2022 at 05:43, Avijit Basak 
> wrote:
> > >
> > > Hi All
> > >
> > >I have missed the build report URL in my previous mail. Please
> find
> > > the same here.
> > >
> > >
> https://app.travis-ci.com/github/apache/commons-math/builds/245277914
> >
> > The version of Commons Numbers in the parent pom is the released
> > version 1.0. However some of the legacy classes are now using new code
> > added to the gamma package in the unreleased version 1.1-SNAPSHOT.
> >
> > The master branch is correctly using the 1.1-SNAPSHOT. So somewhere
> > the feature branch feature__MATH-1563__genetic_algorithm has not been
> > kept totally in sync with master.
> >
> > I can rebase the feature branch on master to correct this. But the
> > result would require a force push and anyone else using this branch
> > would have to reset their local copy.
> >
> > Or I can merge master into the feature branch which creates annoying
> > merge commits in the history and the commit logs for the 1563 feature
> > are interspersed with all the other commits performed on master while
> > development was underway (i.e all commits are in date order
> > irrespective of the branch they occurred on).
> >
> > I believe last time Gilles resolved a lot of the repeat commits using
> > a force push. If this is only being used as a work-in-progress (WIP)
> > by one developer then I do not see a need to avoid force push. This
> > would keep all the commits for the feature together in the git history
> > when it is eventually merged to master.
> >
> > Gilles, do you have any opinion on how to manage the WIP feature branch.
> >
> > Alex
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
>
>
>

-- 
Avijit Basak


Re: [MATH][GA] Build Failure for PR #204

2022-01-24 Thread Avijit Basak
Hi All

   I have missed the build report URL in my previous mail. Please find
the same here.

   https://app.travis-ci.com/github/apache/commons-math/builds/245277914


Thanks & Regards
--Avijit Basak

On Tue, 25 Jan 2022 at 11:10, Avijit Basak  wrote:

> Hi All
>
> I have created a new PR(#*204*) to check in changes related to
> "commons-math-ga"  and "examples-ga" modules. The changes need to be merged
> to feature branch "feature__MATH-1563__genetic_algorithm". Unfortunately
> the build failed in the legacy module which is updated with changes from
> the feature branch "feature__MATH-1563__genetic_algorithm". I have no
> changes in the legacy module. I have executed the Maven Checkstyle, PMD and
> SpotBugs checks in the "commons-math-ga" and "examples-ga" modules, which
> passed successfully.
> Could anyone help me to resolve this?
>
> Thanks & Regards
> -- Avijit Basak
>


[MATH][GA] Build Failure for PR #204

2022-01-24 Thread Avijit Basak
Hi All

I have created a new PR(#*204*) to check in changes related to
"commons-math-ga"  and "examples-ga" modules. The changes need to be merged
to feature branch "feature__MATH-1563__genetic_algorithm". Unfortunately
the build failed in the legacy module which is updated with changes from
the feature branch "feature__MATH-1563__genetic_algorithm". I have no
changes in the legacy module. I have executed the Maven Checkstyle, PMD and
SpotBugs checks in the "commons-math-ga" and "examples-ga" modules, which
passed successfully.
    Could anyone help me to resolve this?

Thanks & Regards
-- Avijit Basak


Re: [Math] Please review GA implementation

2022-01-24 Thread Avijit Basak
Hi All

I have taken all changes from the feature branch and put my code
for commons-math-ga and examples-ga modules. A new PR(#204) is also created
and the previous PR(#203) is closed.

Thanks & Regards
--Avijit Basak

On Sat, 22 Jan 2022 at 23:02, Gilles Sadowski  wrote:

> On Sat, 22 Jan 2022 at 15:30, Avijit Basak wrote:
> >
> > Hi
> >
> > >Please be sure to use my latest (forced) update.
> > >[I removed the many duplicate commits I had introduced by mistake.]
> >
> > --Is the change only in commons-math-examples/pom.xml file? Please
> confirm.
>
> No, there can be other commits, as I try to keep this branch
> up-to-date with "master".
>
> > My changes are creating conflict after this push. I shall create a new
> PR.
>
> Thanks,
> Gilles
>
>
>


Re: [Math] Please review GA implementation

2022-01-22 Thread Avijit Basak
Hi

>Please be sure to use my latest (forced) update.
>[I removed the many duplicate commits I had introduced by mistake.]

--Is the change only in commons-math-examples/pom.xml file? Please confirm.
My changes are creating conflict after this push. I shall create a new PR.

Thanks & Regards
--Avijit Basak

On Fri, 21 Jan 2022 at 06:15, Gilles Sadowski  wrote:

> Hi.
>
> On Thu, 20 Jan 2022 at 17:58, Gilles Sadowski wrote:
> >
> > Hello.
> >
> > On Thu, 20 Jan 2022 at 13:09, Avijit Basak wrote:
> > >
> > > Hi All
> > >
> > >I have restructured the examples-ga module to fix a few issues
> and
> > > created separate child modules for each type of usage. There was also
> a
> > > minor bug related to the logger class name in commons-ga module, which
> I
> > > fixed.  A new PR(*#203*) has been created.
> >
> > Did you ensure that it is up-to-date with the upstream branch?
>
> Please be sure to use my latest (forced) update.
> [I removed the many duplicate commits I had introduced by mistake.]
>
> >
> > Gilles
> >
> > > [...]
>
>
>

-- 
Avijit Basak


Re: [Math] Please review GA implementation

2022-01-20 Thread Avijit Basak
Hi All

   I have restructured the examples-ga module to fix a few issues and
created separate child modules for each type of usage. There was also a
minor bug related to the logger class name in commons-ga module, which I
fixed.  A new PR(*#203*) has been created. Kindly review the same and let
me know if you see any issues with it.

https://github.com/apache/commons-math/pull/203

Thanks & Regards
--Avijit Basak

On Wed, 12 Jan 2022 at 20:55, Avijit Basak  wrote:

> Hi All
>
> I have lost track of the jar creation process in the examples
> module. Every time before committing I have executed examples using Eclipse
> IDE which ran successfully. I need to modify the examples module. Sorry for
> any inconvenience. Please find my additional responses below.
>
> >There are issues with the expected functionality of the "examples-ga"
> module.
>
> >Assuming that the following command has been issued
> >  $ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ mvn package
> >and has completed successfully, executable JAR files should have been
> >created under the "target" directories.
>
> >For example, issuing this command (example for the "neuralnet" module):
> > $ java -jar
> commons-math-examples/examples-sofm/tsp/target/examples-sofm-tsp.jar
> >outputs
> >---CUT---
> >Missing required option '-o=outputFile'
> >Usage:  [-hV] [-j=numJobs] [-m=maxTrials] [-n=neuronsPerCity]
> >   -o=outputFile [-s=numSamples]
> >Run the application
> >  -h, --help            Show this help message and exit.
> >  -j=numJobs            Number of concurrent tasks (default: 8).
> >  -m=maxTrials          Maximal number of trials (default: 10).
> >  -n=neuronsPerCity     Average number of neurons per city (default: 2.2).
> >  -o=outputFile         Output file name.
> >  -s=numSamples         Number of samples for the training (default: 2000).
> >  -V, --version         Print version information and exit.
> >---CUT---
>
> >The above thus shows that the program runs as expected (passing the
> missing
> >required argument produces the expected output file).
>
> >Doing the equivalent for the new examples, e.g.
> >  $ java -jar
> commons-math-examples/examples-ga/examples-ga-tsp/target/examples-ga-mathfunctions.jar
> >results in
> >---CUT---
> >Exception in thread "main" java.lang.UnsatisfiedLinkError: Can't load
> >library: /usr/lib/jvm/java-11-openjdk-amd64/lib/libawt_xawt.so
> >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2630)
> >at java.base/java.lang.Runtime.load0(Runtime.java:768)
> >at java.base/java.lang.System.load(System.java:1837)
> >at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
> >at
> java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442)
> >at
> java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498)
> >at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694)
> >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2648)
> >at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
> >at java.base/java.lang.System.loadLibrary(System.java:1873)
> >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1399)
> >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1397)
> >at java.base/java.security.AccessController.doPrivileged(Native
> Method)
> >at java.desktop/java.awt.Toolkit.loadLibraries(Toolkit.java:1396)
> >at java.desktop/java.awt.Toolkit.<clinit>(Toolkit.java:1429)
> >at java.desktop/java.awt.Component.<clinit>(Component.java:621)
> >at
> org.apache.commons.math4.examples.ga.tsp.TSPOptimizer.main(TSPOptimizer.java:62)
> >---CUT---
>
> >[Please note that the name of the JAR also looks wrong (a copy/paste
> mistake?).]
>
> --There is an issue with the jar file name, but on my system the TSP
> application executed successfully. The commands I executed are given
> below:
> $ mvn package
> $ java -jar examples-ga-mathfunctions.jar
> JDK version in my local system is "1.8.0_301"
>
> --In the mvn command you specified, JAVA_HOME is set to
> "/usr/lib/jvm/java-8-openjdk-amd64/", but the jar is executed with
> java-11. Could you please confirm which JDK version is used, and ensure
> that the same version is used for both? This issue usually occurs when the
> JDK is not properly installed.
>
> >Command
> >  $ java -jar
> commons-math-examples/examples-ga/examples-ga-math-functions/target/examples-ga-mathfunctions.jar
> >also fails, with the following error
> >-

Re: [Math] Please review GA implementation

2022-01-12 Thread Avijit Basak
Hi All

I have lost track of the jar creation process in the examples
module. Every time before committing I have executed examples using Eclipse
IDE which ran successfully. I need to modify the examples module. Sorry for
any inconvenience. Please find my additional responses below.

>There are issues with the expected functionality of the "examples-ga"
module.

>Assuming that the following command has been issued
>  $ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ mvn package
>and has completed successfully, executable JAR files should have been
>created under the "target" directories.

>For example, issuing this command (example for the "neuralnet" module):
> $ java -jar
commons-math-examples/examples-sofm/tsp/target/examples-sofm-tsp.jar
>outputs
>---CUT---
>Missing required option '-o=outputFile'
>Usage:  [-hV] [-j=numJobs] [-m=maxTrials] [-n=neuronsPerCity]
>   -o=outputFile [-s=numSamples]
>Run the application
>  -h, --help            Show this help message and exit.
>  -j=numJobs            Number of concurrent tasks (default: 8).
>  -m=maxTrials          Maximal number of trials (default: 10).
>  -n=neuronsPerCity     Average number of neurons per city (default: 2.2).
>  -o=outputFile         Output file name.
>  -s=numSamples         Number of samples for the training (default: 2000).
>  -V, --version         Print version information and exit.
>---CUT---

>The above thus shows that the program runs as expected (passing the missing
>required argument produces the expected output file).

>Doing the equivalent for the new examples, e.g.
>  $ java -jar
commons-math-examples/examples-ga/examples-ga-tsp/target/examples-ga-mathfunctions.jar
>results in
>---CUT---
>Exception in thread "main" java.lang.UnsatisfiedLinkError: Can't load
>library: /usr/lib/jvm/java-11-openjdk-amd64/lib/libawt_xawt.so
>at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2630)
>at java.base/java.lang.Runtime.load0(Runtime.java:768)
>at java.base/java.lang.System.load(System.java:1837)
>at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
>at
java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442)
>at
java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498)
>at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694)
>at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2648)
>at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830)
>at java.base/java.lang.System.loadLibrary(System.java:1873)
>at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1399)
>at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1397)
>at java.base/java.security.AccessController.doPrivileged(Native Method)
>at java.desktop/java.awt.Toolkit.loadLibraries(Toolkit.java:1396)
>at java.desktop/java.awt.Toolkit.<clinit>(Toolkit.java:1429)
>at java.desktop/java.awt.Component.<clinit>(Component.java:621)
>at
org.apache.commons.math4.examples.ga.tsp.TSPOptimizer.main(TSPOptimizer.java:62)
>---CUT---

>[Please note that the name of the JAR also looks wrong (a copy/paste
mistake?).]

--There is an issue with the jar file name, but on my system the TSP
application executed successfully. The commands I executed are given
below:
$ mvn package
$ java -jar examples-ga-mathfunctions.jar
JDK version in my local system is "1.8.0_301"

--In the mvn command you specified, JAVA_HOME is set to
"/usr/lib/jvm/java-8-openjdk-amd64/", but the jar is executed with
java-11. Could you please confirm which JDK version is used, and ensure
that the same version is used for both? This issue usually occurs when the
JDK is not properly installed.
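
One way to check which JVM actually runs a jar, as a minimal sketch: the
class below (hypothetical name, not part of the examples module) prints the
runtime version, its installation directory and whether AWT considers the
environment headless. It relies only on standard system properties and on
java.awt.GraphicsEnvironment.isHeadless(). Note that a "false" value here does
not guarantee that the native AWT libraries are installed.

```java
import java.awt.GraphicsEnvironment;

/** Hypothetical diagnostic helper; not part of Commons Math. */
final class RuntimeReport {
    private RuntimeReport() {}

    /** Builds a one-line description of the JVM executing this code. */
    public static String describe() {
        return "java.version=" + System.getProperty("java.version")
            + ", java.home=" + System.getProperty("java.home")
            + ", headless=" + GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same "java" command that launches the example jar shows
whether the build-time and run-time JDKs differ.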

>Command
>  $ java -jar
commons-math-examples/examples-ga/examples-ga-math-functions/target/examples-ga-mathfunctions.jar
>also fails, with the following error
>---CUT---
>Error: Could not find or load main class
>org.apache.commons.math4.examples.ga.mathfunctions.Dimension2FunctionOptimizer
>Caused by: java.lang.ClassNotFoundException:
>org.apache.commons.math4.examples.ga.mathfunctions.Dimension2FunctionOptimizer
>---CUT---
-- This is due to the wrong package name. I introduced a sub package based
on dimension but forgot to modify the same in the pom file.

>I noticed that there is an example relating to "Dimension2" and another
>to "DimensionN".  Isn't the former, in principle, a special case of the
latter?
--Yes, the former is a special case of the latter. However, given the way
the executable jar is generated, I need to keep a single Java file with the
main method, and the number of dimensions needs to be passed as a program
argument.
--I shall make the changes and create a PR for the new feature branch.
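
The single-entry-point idea could look roughly like the sketch below. All
names are hypothetical, and plain positional argument parsing is an assumption
made to keep the sketch self-contained; the existing examples may well use a
command-line library instead.

```java
/** Hypothetical single entry point; the dimension is a program argument. */
final class FunctionOptimizerCli {
    /** Used when no argument is given (covers the former "Dimension2" case). */
    private static final int DEFAULT_DIMENSION = 2;

    private FunctionOptimizerCli() {}

    /** Reads the number of dimensions from the first argument, if present. */
    public static int parseDimension(String[] args) {
        if (args.length == 0) {
            return DEFAULT_DIMENSION;
        }
        final int dimension = Integer.parseInt(args[0]);
        if (dimension < 1) {
            throw new IllegalArgumentException("dimension must be >= 1: " + dimension);
        }
        return dimension;
    }

    public static void main(String[] args) {
        final int dimension = parseDimension(args);
        // The optimizer itself is omitted; only the argument handling is shown.
        System.out.println("Optimizing a function of " + dimension + " variable(s)");
    }
}
```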


Thanks & Reg

Re: [Math] Please review GA implementation

2022-01-08 Thread Avijit Basak
Hi All

I would like to add a few words over here. The JIRA was initially
created as a proposal to accommodate rank based adaptive probability
generation approaches for GA operators like crossover and mutation etc
following the referred article. The article mainly describes the adaptive
probability generation strategy and compares the two approaches for the
same. It does not describe the entire work done in this library. However,
during the design phase a few more change requirements were detected to
make the library more robust and effective and are not described in the
PDF. In order to have a good overview of changes, kindly review the
sub-tasks created as part of the issue(MATH-1563). While writing the
article I have used the legacy model as the Simple GA implementation. So
the results described in the PDF for simple GA would differ from that of
the current implementation. The primary reason for this change is that the
mutation probability is now applied at the allele level instead of at the
chromosome level as in the legacy model. This has improved the optimization
result to a considerable extent even for Simple GA. I have tried to
describe all details and reasons for changes in the sub-task descriptions.
Kindly let me know if any further clarification is required.
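
To illustrate the distinction, here is a minimal sketch on a binary
chromosome. It is one reading of the two strategies, not the module's code:
the class and method names are hypothetical, and the "chromosome-level"
variant assumes that a selected chromosome has exactly one randomly chosen
allele flipped.

```java
import java.util.Random;

/** Hypothetical sketch; not the API of either GA module. */
final class MutationSketch {
    private MutationSketch() {}

    /** Chromosome level: with probability p, flip one randomly chosen allele. */
    public static boolean[] mutatePerChromosome(boolean[] chromosome, double p, Random rng) {
        final boolean[] out = chromosome.clone();
        if (out.length > 0 && rng.nextDouble() < p) {
            final int i = rng.nextInt(out.length);
            out[i] = !out[i];
        }
        return out;
    }

    /** Allele level: every allele is flipped independently with probability p. */
    public static boolean[] mutatePerAllele(boolean[] chromosome, double p, Random rng) {
        final boolean[] out = chromosome.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < p) {
                out[i] = !out[i];
            }
        }
        return out;
    }

    /** Number of positions at which the two chromosomes differ. */
    public static int hamming(boolean[] a, boolean[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                count++;
            }
        }
        return count;
    }
}
```

With the same probability p, the allele-level variant changes on average
p times the chromosome length alleles per chromosome, whereas the
chromosome-level variant changes at most one, which may explain part of the
difference in search behaviour.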

Thanks & Regards
--Avijit Basak

On Sat, 8 Jan 2022 at 16:17, Bruno P. Kinoshita
 wrote:

>  Reviewed about 1/4 of the PR, but it was mainly about serialization
> (started from the bottom, comparing using GitHub UI [1]). But that code and
> tests were looking OK.
>
>
> Will try to go over a few more files, but I also found a PDF in the issue
> that I think I will try to read first, to have a better idea of the change.
> I haven't read/used anything related to genetic algorithms since university.
>
> Cheers
> Bruno
>
>
>
> [1]
> https://github.com/apache/commons-math/compare/master..feature__MATH-1563__genetic_algorithm
>
> On Tuesday, 4 January 2022, 08:57:15 am NZDT, Gilles Sadowski <
> gillese...@gmail.com> wrote:
>
>  Hello.
>
> I've just created a "feature__MATH-1563__genetic_algorithm" branch[1]
> in the git repository, with the code provided by Avijit Basak in PR #200,
> a proposed replacement of the "o.a.c.math4.legacy.genetics" package.[2]
> Reviews welcome.
>
> Regards,
> Gilles
>
> [1]
> https://gitbox.apache.org/repos/asf?p=commons-math.git;a=shortlog;h=refs/heads/feature__MATH-1563__genetic_algorithm
> [2] https://issues.apache.org/jira/browse/MATH-1563
>
>
>



-- 
Avijit Basak


Re: [MATH] Build Failure

2022-01-05 Thread Avijit Basak
Hi All

   I have identified a few *protected* methods which could be made
*private*. Those are mostly validation methods of input arguments and used
internally. Keeping them as protected won't add much value considering
future extension. I would like to do the modification.
   It would be helpful if anyone can confirm the process of checking in
new code now. Will it be as part of the same PR(#200) with a new commit?
Would the commit message remain the same as earlier?

Thanks & Regards
--Avijit Basak

On Sun, 2 Jan 2022 at 19:05, Avijit Basak  wrote:

> Hi All
>
> I have created a new *PR*(*#200*) with all changes under a single
> commit message. Kindly review the same and let me know if any further
> change is required.
>
> Thanks & Regards
> --Avijit Basak
>
> On Mon, 27 Dec 2021 at 23:31, Gilles Sadowski 
> wrote:
>
>> Hello.
>>
>> On Mon, 27 Dec 2021 at 16:02, Avijit Basak wrote:
>> >
>> > Hi All
>> >
>> > Please ignore my previous mail. The rebase is done successfully.
>> > Please let me know if there is any issue.
>>
>> Here is the list of commit messages that should not be
>> present (at least not when introducing completely new code):
>>
>> Merge branch 'feature/MATH-1563-ADAPTIVE' of
>> https://github.com/avijitbasak/commons-math.git into
>> feature/MATH-1563-ADAPTIVE
>> removed 64 by Long.SIZE
>> Merge branch 'master' of https://github.com/apache/commons-math.git
>> into feature/MATH-1563-ADAPTIVE
>> Minor change for UniformRandomProvider
>> modified as per PMD recommendations
>> updated for checkstyle formatting
>> An optimized data structure implementation for binary chromosome
>> minor modifications
>> Modifications as per review comments
>> Developed the new genetic algorithm module following the JIRA MATH-1563.
>>
>> What I suggested is to check out a pristine copy of "master", and copy
>> the new files onto it, and only change whatever needs to be touched for
>> the new contents to be handled correctly (i.e. just the POM files I
>> guess).
>>
>> Then generate a _new_ PR (and close #199).
>> There should be a _single_ commit with a log message of the form:
>> ---CUT---
>> MATH-1563: Introducing new implementation of GA functionality (WIP).
>> ---CUT---
>>
>> [If you don't want to give more details about all the changes, please
>> stick to the above sentence.  Note that the convention is that the
>> issue identifier be followed by a colon; then, as the commit log
>> summary, a single sentence, ending with a period, on the first line.]
>>
>> Thanks,
>> Gilles
>>
>> >
>> > Thanks & Regards
>> > --Avijit Basak
>> >
>> > On Mon, 27 Dec 2021 at 19:21, Avijit Basak 
>> wrote:
>> >
>> > > Hi All
>> > >
>> > > I have tried to rebase. However I found too many conflicts and
>> > > most of them are unnecessary. So I aborted the process. Can we avoid
>> the
>> > > rebase as we have very few commits after the last rebase. Please
>> share your
>> > > views on this.
>> > >
>> > > Thanks & Regards
>> > > --Avijit Basak
>> > >
>> > > On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski 
>> > > wrote:
>> > >
>> > >> Hello.
>> > >>
>> > >> I've fetched the current contents of PR 199; locally, the build
>> completes
>> > >> successfully, so the problem reported by Travis looks strange indeed.
>> > >> I would create a branch for further discussion on your GA design but
>> > >> please first create a *single* commit that contains all changes wrt
>> to
>> > >> current "master" with a clear log message (first word *must* be the
>> JIRA
>> > >> identifier of your proposal (perhaps a new JIRA report would be
>> clearer?),
>> > >> like:
>> > >> ---CUT---
>> > >> MATH-: Refactoring of GA functionality (WIP)
>> > >>
>> > >> Summary of what has been implemented (with the corresponding JIRA
>> > >> reports)...
>> > >>
>> > >> (optionally) Summary what is under discussion...
>> > >> ---CUT---
>> > >>
>> > >> Thanks,
>> > >> Gilles
>> > >>
>> > >> >> [...]
>>
>>
>>
>
> --
> Avijit Basak
>


-- 
Avijit Basak


Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2022-01-03 Thread Avijit Basak
rs, people are welcome to contribute back if
> >something they need is missing.
> -- I think we have a disconnect here too. If the framework is not
> extensible how users can use this in their problem domain. If this is not
> extensible then it would never be used. How can we get back the
> contribution?

>I answered to this above.

>
> >Your argument of "too much diversity" can be reversed, in that
> >it is unlikely that one library would attract everyone that needs a
> >genetic algorithm.
> -- Even if it cannot attract everyone with out of box features it should
be
> extensible for those.

>I don't agree with making things more complicated for us, now and
>in the foreseeable future, in order to satisfy users who don't exist yet
>(because the library does not exist yet).
-- I don't want to make things complicated for us. GA has a huge amount of
usages in diverse fields. Of course we should not try to provide solutions
for all. But the only thing I would like to ensure is that this library
should be reusable so that anyone can extend it and design solutions for
a new domain. We should not put any burden towards this.

>Let's focus on making it work within a given scope, and then we can
>think of improvements (that will be easy if the design is "structurally"
>extensible, even if they are somehow "disabled" in the first release).
-- I am against this "disable" option. I searched for GA use cases and
found this huge list:
https://en.wikipedia.org/wiki/List_of_genetic_algorithm_applications
My proposal is that we allow extensibility selectively, with immutability
in place. Extension would then not create any bugs in our code.

> >Better make a design that can handle a fraction of use cases,
> >and grow as needed.
> --There are already libraries which can solve most common use cases.
> A non-extensible design would block growth to a considerable extent.

>Is there a misunderstanding about what is implied by "extensible"?
>Question: Are all classes, in your current design, "immutable"?
-- Yes, mostly. However, there are some classes with protected/public
methods which mutate private fields for internal processing, e.g. the
generationsEvolved field in the AbstractGeneticAlgorithm class. Child
classes cannot modify those private fields, as there are no direct
mutation methods.
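
A minimal sketch of the pattern described above, using only JDK classes.
The names EvolutionLoop and DoublingLoop are hypothetical stand-ins and are
not the classes from the PR; the point is only that a private counter plus a
final template method leaves subclasses no way to write to the field.

```java
import java.util.function.IntPredicate;

/** Hypothetical base class; a stand-in, not the AbstractGeneticAlgorithm from the PR. */
abstract class EvolutionLoop {
    /** Internal bookkeeping: private, so subclasses cannot assign to it. */
    private int generationsEvolved;

    /** Read-only access for subclasses and callers. */
    public final int getGenerationsEvolved() {
        return generationsEvolved;
    }

    /** Template method: final, so the counting logic cannot be overridden. */
    public final int evolve(int initial, IntPredicate stop) {
        generationsEvolved = 0;
        int current = initial;
        while (!stop.test(generationsEvolved)) {
            current = nextGeneration(current);
            generationsEvolved++;
        }
        return current;
    }

    /** The only extension point offered to subclasses. */
    protected abstract int nextGeneration(int current);
}

/** Example subclass: it can read the counter but has no way to change it. */
final class DoublingLoop extends EvolutionLoop {
    @Override
    protected int nextGeneration(int current) {
        return current * 2;
    }
}
```

Such an instance is still stateful, so sharing a single instance between
threads would need extra care even though subclasses cannot corrupt the
counter.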

>If so, that's an excellent basis, and we should stop discussing the
>meaning of "extensibility".
--I think the design first needs a review. Then we can reinitiate this
discussion.

>
> >> >Extending the functionality, if necessary, should be contributed
> >> >back here
> >> *-- *Sometimes the GA operators are very much specific to the domain
> >> and it's hard to generalise. In those scenarios contributing back to
> >> the library might not be possible.
>
> >In such a case, how likely will it also be that whatever general
> >framework this library has put in place, will also not be amenable
> >to that domain's specifics?
> -- Could you please frame this concern w.r.t. the scheduling example
> provided above.

?

>
> >There is always a scope from which design decisions must be taken.
> >If "multi-threading" is in the scope, then the design must avoid
> >inheritance (in public classes) in order to much more easily
> >ensure the correctness of applications.
> -- Immutable design can also take care of multi-threading.

>My main point in the discussion is that all classes with "public" access
>should be immutable, indeed.
-- They should be.

>
> >> However, if a library cannot be extended for
> >> a new domain by users it becomes underutilised over time if not
> >> useless.
>
> >Sure but that is a hypothetical for the long-term.
> >However, if the library is buggy or slow, it will not be used at all.
> -- Is there any benchmark for speed/performance? GA is always infamous for
> resource consumption rather than time.

>I'm not sure I understand what you mean here.


Thanks & Regards
--Avijit Basak

On Thu, 23 Dec 2021 at 20:50, Gilles Sadowski  wrote:

> Hello.
>
> Le jeu. 23 déc. 2021 à 14:22, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> >  Please see my comments below.
> >
> > >As I've already indicated, "ThreadLocalRandomSource" is, IMHO, a
> > >sort of workaround for a multi-thread application that does not want
> > >to bother managing per-thread RNG instance(s).
> > -- I am not clear on this. ThreadLocalRandomSource maintains
> > an EnumMap<RandomSource, ThreadLocal<UniformRandomProvider>>. What is
> > meant by it "does not want to bother managing per-thread RNG
> > instance(s)"? Could
> > you please elab

Re: [MATH] Build Failure

2022-01-02 Thread Avijit Basak
Hi All

I have created a new *PR*(*#200*) with all changes under a single
commit message. Kindly review the same and let me know if any further
change is required.

Thanks & Regards
--Avijit Basak

On Mon, 27 Dec 2021 at 23:31, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 27 déc. 2021 à 16:02, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please ignore my previous mail. The rebase is done successfully.
> > Please let me know if there is any issue.
>
> Here is the list of commit messages that should not be
> present (at least not when introducing completely new code):
>
> Merge branch 'feature/MATH-1563-ADAPTIVE' of
> https://github.com/avijitbasak/commons-math.git into
> feature/MATH-1563-ADAPTIVE
> removed 64 by Long.SIZE
> Merge branch 'master' of https://github.com/apache/commons-math.git
> into feature/MATH-1563-ADAPTIVE
> Minor change for UniformRandomProvider
> modified as per PMD recommendations
> updated for checkstyle formatting
> An optimized data structure implementation for binary chromosome
> minor modifications
> Modifications as per review comments
> Developed the new genetic algorithm module following the JIRA MATH-1563.
>
> What I suggested is to check out a pristine copy of "master", and copy
> the new files onto it, and only change whatever needs to be touched for
> the new contents to be handled correctly (i.e. just the POM files I guess).
>
> Then generate a _new_ PR (and close #199).
> There should be a _single_ commit with a log message of the form:
> ---CUT---
> MATH-1563: Introducing new implementation of GA functionality (WIP).
> ---CUT---
>
> [If you don't want to give more details about all the changes, please
> stick to the above sentence.  Note that the convention is that the
> issue identifier be followed by a colon; then, as the commit log
> summary, a single sentence, ending with a period, on the first line.]
>
> Thanks,
> Gilles
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > On Mon, 27 Dec 2021 at 19:21, Avijit Basak 
> wrote:
> >
> > > Hi All
> > >
> > > > I have tried to rebase. However I found too many conflicts and
> > > > most of them are unnecessary. So I aborted the process. Can we avoid
> > > > the rebase, as we have very few commits after the last rebase? Please
> > > > share your views on this.
> > >
> > > Thanks & Regards
> > > --Avijit Basak
> > >
> > > On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski 
> > > wrote:
> > >
> > >> Hello.
> > >>
> > > >> I've fetched the current contents of PR 199; locally, the build
> > > >> completes successfully, so the problem reported by Travis looks
> > > >> strange indeed.
> > > >> I would create a branch for further discussion on your GA design but
> > > >> please first create a *single* commit that contains all changes wrt
> > > >> current "master" with a clear log message (first word *must* be the
> > > >> JIRA identifier of your proposal; perhaps a new JIRA report would be
> > > >> clearer?), like:
> > >> ---CUT---
> > >> MATH-: Refactoring of GA functionality (WIP)
> > >>
> > >> Summary of what has been implemented (with the corresponding JIRA
> > >> reports)...
> > >>
> > >> (optionally) Summary what is under discussion...
> > >> ---CUT---
> > >>
> > >> Thanks,
> > >> Gilles
> > >>
> > >> >> [...]
>
>
>

-- 
Avijit Basak


Re: [MATH] Build Failure

2021-12-27 Thread Avijit Basak
Hi All

Please ignore my previous mail. The rebase is done successfully.
Please let me know if there is any issue.

Thanks & Regards
--Avijit Basak

On Mon, 27 Dec 2021 at 19:21, Avijit Basak  wrote:

> Hi All
>
> I have tried to rebase. However I found too many conflicts and
> most of them are unnecessary. So I aborted the process. Can we avoid the
> rebase as we have very few commits after the last rebase. Please share your
> views on this.
>
> Thanks & Regards
> --Avijit Basak
>
> On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski 
> wrote:
>
>> Hello.
>>
>> I've fetched the current contents of PR 199; locally, the build completes
>> successfully, so the problem reported by Travis looks strange indeed.
>> I would create a branch for further discussion on your GA design but
>> please first create a *single* commit that contains all changes wrt
>> current "master" with a clear log message (first word *must* be the JIRA
>> identifier of your proposal; perhaps a new JIRA report would be clearer?),
>> like:
>> ---CUT---
>> MATH-: Refactoring of GA functionality (WIP)
>>
>> Summary of what has been implemented (with the corresponding JIRA
>> reports)...
>>
>> (optionally) Summary what is under discussion...
>> ---CUT---
>>
>> Thanks,
>> Gilles
>>
>> >> [...]
>>
>>
>>
>
> --
> Avijit Basak
>


-- 
Avijit Basak


Re: [MATH] Build Failure

2021-12-27 Thread Avijit Basak
Hi All

I have tried to rebase. However I found too many conflicts and most
of them are unnecessary. So I aborted the process. Can we avoid the rebase
as we have very few commits after the last rebase. Please share your views
on this.

Thanks & Regards
--Avijit Basak

On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski  wrote:

> Hello.
>
> I've fetched the current contents of PR 199; locally, the build completes
> successfully, so the problem reported by Travis looks strange indeed.
> I would create a branch for further discussion on your GA design but
> please first create a *single* commit that contains all changes wrt
> current "master" with a clear log message (first word *must* be the JIRA
> identifier of your proposal; perhaps a new JIRA report would be clearer?),
> like:
> ---CUT---
> MATH-: Refactoring of GA functionality (WIP)
>
> Summary of what has been implemented (with the corresponding JIRA
> reports)...
>
> (optionally) Summary what is under discussion...
> ---CUT---
>
> Thanks,
> Gilles
>
> >> [...]
>
>
>

-- 
Avijit Basak


Re: [MATH] Build Failure

2021-12-24 Thread Avijit Basak
Hi

   I have no access to change anything in the master branch. The
artifact id for the module "commons-math-core" is mentioned as
"commons-math4-core" in the pom file present in the master branch. Please
check the following URL.
https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml
   Also please let me know if you see any other issues.

Thanks & Regards
--Avijit Basak

On Fri, 24 Dec 2021 at 17:27, Gilles Sadowski  wrote:

> Le ven. 24 déc. 2021 à 12:23, Avijit Basak  a
> écrit :
> >
> > Hi
> >
> >   I have initiated the build once again after pulling changes from
> > the master branch. However, the build has failed again. Kindly look into
> > the report.
> >  https://app.travis-ci.com/github/apache/commons-math/builds/243963791
> >
> >   The artifact id "commons-math4-core" is present in the master
> > branch of the repository
> >
> >
> https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml
> > .
> >   Am I missing anything?
>
> That
> commons-math-core
> is not the same as
> commons-math4-core
>
> [IIRC, you created the latter by mistake.]
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> >
> >>> [...]
>
>
>

-- 
Avijit Basak


Re: [MATH] Build Failure

2021-12-24 Thread Avijit Basak
Hi

  I have initiated the build once again after pulling changes from
the master branch. However, the build has failed again. Kindly look into
the report.
 https://app.travis-ci.com/github/apache/commons-math/builds/243963791

  The artifact id "commons-math4-core" is present in the master
branch of the repository

https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml
.
  Am I missing anything?

Thanks & Regards
--Avijit Basak


On Thu, 23 Dec 2021 at 18:34, Gilles Sadowski  wrote:

> Hi.
>
> Le mer. 22 déc. 2021 à 17:31, Gilles Sadowski  a
> écrit :
> >
> > Hello.
> >
> > Le mer. 22 déc. 2021 à 15:05, Avijit Basak  a
> écrit :
> > >
> > > Hi All
> > >
> > >  I am facing a build issue for PR #199 in commons-math library.
>
> After you fix the build (cf. below), I'll create a branch dedicated
> to GA development, in order to clarify the issues we've been
> discussing.
>
> Regards,
> Gilles
>
> >
> > Next time, please provide a direct link to the build log.  Thanks.
> >
> > The last Travis build is here:
> >   https://app.travis-ci.com/github/apache/commons-math/builds/240925186
> >
> > AFAICT, the complaints are (starting at line 8044):
> > ---CUT---
> > [WARNING] Rule violated for bundle commons-math4-core
> > ---CUT---
> >
> > However, there is no "commons-math4-core" in the "master" branch:
> >https://github.com/apache/commons-math/
> >
> > Regards,
> > Gilles
> >
> > > > The report summary is given below. Can anyone kindly look into the
> > > > issue and do the needful.
> > > >
> > > > [INFO] Apache Commons Math .................... SUCCESS [  8.501 s]
> > > > *[INFO] Miscellaneous core classes ............. FAILURE [ 24.551 s]*
> > > > [INFO] Artificial neural networks ............. SKIPPED
> > > > [INFO] Transforms ............................. SKIPPED
> > > > [INFO] Exception classes (Legacy) ............. SKIPPED
> > > > [INFO] Miscellaneous core classes (Legacy) .... SKIPPED
> > > > [INFO] Apache Commons Math (Legacy) ........... SKIPPED
> > > > [INFO] Example applications ................... SKIPPED
> > > > [INFO] SOFM ................................... SKIPPED
> > > > [INFO] Chinese Rings .......................... SKIPPED
> > > > [INFO] Traveling Salesman Problem ............. SKIPPED
> > > > [INFO] genetic algorithm ...................... SKIPPED
> > > > [INFO] examples-genetic-algorithm ............. SKIPPED
> > > > [INFO] examples-ga-math-functions ............. SKIPPED
> > > > [INFO] examples-ga-tsp ........................ SKIPPED
> > >
> > >
> > > Thanks & Regards
> > > -- Avijit Basak
>
>
>

-- 
Avijit Basak


Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-12-23 Thread Avijit Basak
, only configuration
> >of the range.
> *-- *I agree. But the question is should we block the extension.

>Please find a valid use case. ;-)
-- Recently I did an implementation of scheduling with commons-math 3.6. I
implemented a chromosome representing the schedule by extending
AbstractListChromosome. The mutation was also customized according to the
requirement. However, I was able to use the existing OnePointCrossover
operator. Do you think this kind of implementation would be possible if the
framework does not support extensibility?
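
The scenario above (a domain-specific chromosome combined with a stock
crossover operator) can be sketched as follows. This is a JDK-only
illustration under stated assumptions: ListChromosome, ScheduleChromosome
and OnePointCross are hypothetical names, not the commons-math 3.6 classes,
and the sketch happens to use an interface as the extension point, which is
only one of the options being debated in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical minimal contract for a list-based chromosome (not the commons-math API). */
interface ListChromosome<T> {
    List<T> genes();
    ListChromosome<T> withGenes(List<T> genes);
}

/** Domain-specific chromosome: gene i is the time slot assigned to task i. */
final class ScheduleChromosome implements ListChromosome<Integer> {
    private final List<Integer> slots;

    ScheduleChromosome(List<Integer> slots) {
        // Copy and wrap so that the instance is immutable.
        this.slots = Collections.unmodifiableList(new ArrayList<>(slots));
    }

    @Override
    public List<Integer> genes() {
        return slots;
    }

    @Override
    public ScheduleChromosome withGenes(List<Integer> genes) {
        return new ScheduleChromosome(genes);
    }
}

/** Generic one-point crossover: it knows nothing about schedules. */
final class OnePointCross {
    private OnePointCross() {}

    /** Swaps the tails of two equal-length parents at the given index. */
    static <T> List<ListChromosome<T>> cross(ListChromosome<T> a, ListChromosome<T> b, int point) {
        List<T> first = new ArrayList<>(a.genes().subList(0, point));
        first.addAll(b.genes().subList(point, b.genes().size()));
        List<T> second = new ArrayList<>(b.genes().subList(0, point));
        second.addAll(a.genes().subList(point, a.genes().size()));
        List<ListChromosome<T>> children = new ArrayList<>();
        children.add(a.withGenes(first));
        children.add(b.withGenes(second));
        return children;
    }
}
```

The children keep the domain type because each parent builds its own
offspring through withGenes; the crossover never needs to know the concrete
class.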

>
> >> I have initially implemented
> >> the Binary chromosome and the corresponding binary mutation following
> >> the same pattern. However, restricting extension of concrete classes
> >> by private constructor does not prevent users from extending the
> >> abstract parent classes.
>
> >We should aim at coding the GA logic through (Java) interfaces, and not
> >expose the "abstract" classes.
> *-- *One of the primary reasons for me to contribute to Apache's GA library
> is its simplicity and extensibility.

>"Extensibility" does not necessarily imply "inheritance"-based.
-- Can you provide a solution to the above problem without an extensibility
feature?

>In fact, we do want to *avoid* it in order to more easily and more robustly
>provide other advantages such as multi-threading.
-- IMHO immutable operator design is the best choice for supporting
multi-threading. It is much easier to implement even for user extension.
Why don't we think of fixing the ThreadLocalRandomSource?
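
As a rough illustration of what an immutable operator could look like, here
is a JDK-only sketch. BitFlipMutation is a hypothetical name, and
java.util.SplittableRandom is used purely as a stand-in for
UniformRandomProvider; passing the generator as a method argument is an
assumption of this sketch, not something settled in this thread.

```java
import java.util.SplittableRandom;

/** Hypothetical immutable mutation operator: no generator field, no mutable state. */
final class BitFlipMutation {
    private final double rate;

    BitFlipMutation(double rate) {
        if (rate < 0 || rate > 1) {
            throw new IllegalArgumentException("rate must be in [0, 1]: " + rate);
        }
        this.rate = rate;
    }

    /**
     * Flips each gene with probability rate and returns a new array.
     * The input is not modified; the caller supplies a thread-confined generator.
     */
    boolean[] mutate(boolean[] genes, SplittableRandom rng) {
        boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }
}
```

Because the operator holds only a final double, one instance can be handed
to any number of threads; what has to stay per-thread is the generator.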

>> I would like to have a framework
>> which should be always extensible for any problem domain with minor
>> changes.

>Any problem domain should indeed be amenable to be solved
>by the library; I don't see how that should imply a design based
>on inheritance.
-- Do you have any alternative design in mind? Kindly share the same.

>> The primary reason behind this is that application domains of GA
>> are too diverse. It is not possible to implement everything in a library.
>> We don't know all possible domain areas too. If we remove the
>> extensibility from the framework it would be useless in lots of areas.

>When that occurs, people are welcome to contribute back if
>something they need is missing.
-- I think we have a disconnect here too. If the framework is not
extensible, how can users use it in their problem domain? If it is not
extensible then it would never be used. How can we get contributions
back?

>Your argument of "too much diversity" can be reversed, in that
>it is unlikely that one library would attract everyone that needs a
>genetic algorithm.
-- Even if it cannot attract everyone with out-of-the-box features, it
should be extensible for those.

>Better make a design that can handle a fraction of use cases,
>and grow as needed.
--There are already libraries which can solve most common use cases.
A non-extensible design would block growth to a considerable extent.

>> >Extending the functionality, if necessary, should be contributed back
>> >here
>> *-- *Sometimes the GA operators are very much specific to the domain and
>> it's hard to generalise. In those scenarios contributing back to the
>> library might not be possible.

>In such a case, how likely will it also be that whatever general
>framework this library has put in place, will also not be amenable
>to that domain's specifics?
-- Could you please frame this concern w.r.t. the scheduling example
provided above.

>There is always a scope from which design decisions must be taken.
>If "multi-threading" is in the scope, then the design must avoid
>inheritance (in public classes) in order to much more easily
>ensure the correctness of applications.
-- Immutable design can also take care of multi-threading.

>> However, if a library cannot be extended for
>> a new domain by users it becomes underutilised over time if not useless.

>Sure but that is a hypothetical for the long-term.
>However, if the library is buggy or slow, it will not be used at all.
-- Is there any benchmark for speed/performance? GA is always infamous for
resource consumption rather than time.


Thanks & Regards
--Avijit Basak

On Wed, 22 Dec 2021 at 20:32, Gilles Sadowski  wrote:

> Hello.
>
> Le mer. 22 déc. 2021 à 14:25, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please see my comments below.
> >
> > >> >Several problems with this approach (raised in previous messages
> > >> >IIRC):
> > >> >1. Potential performance loss in sharing the same RNG instance.
> > >> -- As per my understanding ThreadLocalRandomSource creates separate
> > >> instances of UniformRandomProvider for e

[MATH] Build Failure

2021-12-22 Thread Avijit Basak
Hi All

 I am facing a build issue for PR #199 in commons-math library. The
report summary is given below. Can anyone kindly look into the issue and do
the needful.

[INFO] Apache Commons Math .................... SUCCESS [  8.501 s]
*[INFO] Miscellaneous core classes ............. FAILURE [ 24.551 s]*
[INFO] Artificial neural networks ............. SKIPPED
[INFO] Transforms ............................. SKIPPED
[INFO] Exception classes (Legacy) ............. SKIPPED
[INFO] Miscellaneous core classes (Legacy) .... SKIPPED
[INFO] Apache Commons Math (Legacy) ........... SKIPPED
[INFO] Example applications ................... SKIPPED
[INFO] SOFM ................................... SKIPPED
[INFO] Chinese Rings .......................... SKIPPED
[INFO] Traveling Salesman Problem ............. SKIPPED
[INFO] genetic algorithm ...................... SKIPPED
[INFO] examples-genetic-algorithm ............. SKIPPED
[INFO] examples-ga-math-functions ............. SKIPPED
[INFO] examples-ga-tsp ........................ SKIPPED


Thanks & Regards
-- Avijit Basak


Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-12-22 Thread Avijit Basak
Hi All

Please see my *changed* comments below.

>> >  Mine is against using "ThreadLocalRandomSource"...
>> -- What is the way out other than that? Please suggest.

>I think I did.
>>*--* The factory based approach would be useful only when we can have
>>separate copies of operators for each set of operations.
*--* *T*he factory based approach can introduce a *custom* RNG, but it can
improve performance only when we can have separate copies of operators for
each set of operations, which might lead to *memory issues* as explained in
the previous mail.
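
To make the trade-off concrete, here is a JDK-only sketch of a factory that
hands out separate operator copies. OwningMutation is a hypothetical name
and java.util.SplittableRandom stands in for a user-chosen
UniformRandomProvider; none of this is taken from the PR.

```java
import java.util.SplittableRandom;
import java.util.function.Supplier;

/** Hypothetical operator that owns its generator, so an instance must stay on one thread. */
final class OwningMutation {
    private final SplittableRandom rng;
    private final double rate;

    private OwningMutation(SplittableRandom rng, double rate) {
        this.rng = rng;
        this.rate = rate;
    }

    /** Returns a factory; every call to get() yields a fresh copy with its own generator. */
    static Supplier<OwningMutation> factory(long seed, double rate) {
        final SplittableRandom root = new SplittableRandom(seed);
        return () -> {
            synchronized (root) { // SplittableRandom instances are not thread-safe
                return new OwningMutation(root.split(), rate);
            }
        };
    }

    /** Flips each gene with probability rate and returns a new array. */
    boolean[] mutate(boolean[] genes) {
        boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }
}
```

Each worker would call get() once and keep its copy. The cost is one
operator instance (generator state plus whatever else the operator carries)
per worker, which is the memory concern raised above.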


Thanks & Regards
--Avijit Basak

On Wed, 22 Dec 2021 at 18:54, Avijit Basak  wrote:

> Hi All
>
> Please see my comments below.
>
> >> >Several problems with this approach (raised in previous messages IIRC):
> >> >1. Potential performance loss in sharing the same RNG instance.
> >> -- As per my understanding ThreadLocalRandomSource creates separate
> >> instances of UniformRandomProvider for each thread. So I am not sure
> >> how a UniformRandomProvider instance is being shared. Please correct
> >> me if I am wrong.
>
> >Within a given thread there will be *one* RNG instance; that's what I
> >meant by "shared".
> >Of course you are right that that instance is not shared by multiple
> >threads (which would be a bug).
> >The performance loss is because it will be necessary to call
> >  ThreadLocalRandomSource.current(RandomSource source)
> >for each access to the RNG (since it would be a bug to store the returned
> >value in e.g. an operator instance that would be shared among threads, as
> >you suggest below).
>
> -- I tried to do a small test on it and here are the results. Output times
> are in milliseconds. According to my understanding the performance loss is
> mostly during creation of per thread instance of UniformRandomProvider.
> --*CUT*--
> @Test
> void test() {
>     int limit = 1;
>     long start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1000;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 10;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 100;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1000;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 10;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
> }
> --*CUT*--
> --*output*--
> 363
> 1
> 2
> 4
> 6
> 28
> 244
> 2423
> --*output*--
>
> >> >2. Less/no flexibility (no user's choice of random source).
> >> -- Agreed.
> -- Do we really need this much flexibility here?
> >> >3. Error-prone (user can access/reuse the "UniformRandomProvider"
> >> >instances).
> >>
> >> >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct
> >> >but "light" usage of random number generation in a multi-threaded
> >> >app

Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-12-22 Thread Avijit Basak
uirement also increases with increase of dimension this
might lead to a major issue and needs some thought.
So I think we have a design tradeoff here: performance vs. memory
consumption. I am more worried about memory, as that might restrict use of
this library beyond a certain number of dimensions in some areas. However,
creating a deep copy would only be possible if we strictly restrict
extension of operators, which I want to avoid.

>> So even if we provide
>> the customization at the operator level we cannot avoid sharing.

>We can, and we should.
>What we probably can't avoid sharing is the instance that represents the
>population of chromosomes.
*--* In a multi-threaded optimization the chromosome instances are shared
in case the same chromosome is chosen for crossover by the selection
process. I missed this point earlier.
...

>> >  Mine is against using "ThreadLocalRandomSource"...
>> -- What is the way out other than that? Please suggest.

>I think I did.
*--* The factory based approach would be useful only when we can have
separate copies of operators for each set of operations.

>Maybe it's time to create a dedicated branch for the GA functionality
>so that we can try out the different approaches.


>
> >> I think first we need to decide on whether we really need this
> >> customization and if yes then why. Then we can decide on alternate
> >> implementation options.
> >
> >> >As per the recent updates of the math-related code bases, the
> >> >public API should provide factory methods (constructors should
> >> >be private).
> >> -- private constructors will make public API classes non-extensible.
> >> This will severely restrict the extensibility of this framework which
> >> I want to avoid. I am not sure why we need to remove public
> >> constructors. It would be helpful if you could refer me to any
> >> relevant discussion thread.
>
> >  Allowing extensibility is a huge burden on library maintainers.  The
> >  library must have been designed to support it; hence, you should
> >  first describe what kind(s) of extensions (with usage examples) you
> >  have in mind.
> --The library should be extensible to support customization. Users should
> be able to customise or provide their own implementation of genetic
> operators for crossover and mutation. The chromosome classes should also
> be open for extension.

>I don't get why we should support extensions outside this library.
*--* I think we should not block the extension.

>Initially we discussed having a light-weight library, for easier usage
>than alternative existing framework(s).
*--* We can always think of making the framework lightweight but it should
not cost extensibility.

>> E.g. any developer should be able to extend the
>> IntegralChromosome class and define a child class which explicitly
>> specifies the range of integers to be used.

>It does not look like this would need an extension, only configuration
>of the range.
*-- *I agree. But the question is should we block the extension.
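
For reference, the "configuration instead of extension" idea quoted above
can be sketched like this. RangedIntChromosome is a hypothetical name and
not the IntegralChromosome from the PR; the private constructor plus factory
method follows the convention mentioned earlier in this thread.

```java
/** Hypothetical integer chromosome; the allowed range is data, not a subclass. */
final class RangedIntChromosome {
    private final int min; // inclusive lower bound
    private final int max; // exclusive upper bound
    private final int[] genes;

    private RangedIntChromosome(int min, int max, int[] genes) {
        this.min = min;
        this.max = max;
        this.genes = genes;
    }

    /** Factory method: validates the range and every gene, then copies the input. */
    static RangedIntChromosome of(int min, int max, int... genes) {
        if (min >= max) {
            throw new IllegalArgumentException("empty range: [" + min + ", " + max + ")");
        }
        for (int g : genes) {
            if (g < min || g >= max) {
                throw new IllegalArgumentException("gene out of range: " + g);
            }
        }
        return new RangedIntChromosome(min, max, genes.clone());
    }

    int min() {
        return min;
    }

    int max() {
        return max;
    }

    /** Defensive copy keeps the instance immutable. */
    int[] genes() {
        return genes.clone();
    }
}
```

A caller that needs a different range writes RangedIntChromosome.of(0, 10,
...) rather than defining a child class; whether subclassing should also
remain possible is the open question here.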

>> I have initially implemented
>> the Binary chromosome and the corresponding binary mutation following the
>> same pattern. However, restricting extension of concrete classes by
>> private constructor does not prevent users from extending the abstract
>> parent classes.

>We should aim at coding the GA logic through (Java) interfaces, and not
>expose the "abstract" classes.
*-- *One of the primary reasons for me to contribute to Apache's GA library
is its simplicity and extensibility. I would like to have a framework
which should be always extensible for any problem domain with minor
changes. The primary reason behind this is that application domains of GA
are too diverse. It is not possible to implement everything in a library.
We don't know all possible domain areas too. If we remove the extensibility
from the framework it would be useless in lots of areas.

>Extending the functionality, if necessary, should be contributed back here
*-- *Sometimes the GA operators are very much specific to the domain and
it's hard to generalise. In those scenarios contributing back to the
library might not be possible. However, if a library cannot be extended for
a new domain by users it becomes underutilised over time if not useless.


Thanks & Regards
--Avijit Basak

On Tue, 21 Dec 2021 at 22:05, Gilles Sadowski  wrote:

> Hello.
>
> Le mar. 21 déc. 2021 à 16:21, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please see my comments. Sorry for the delayed response.
> >
> > >Several problems with this approach (raised in previous messages IIRC):
> > >1. Potential performance loss in sharing the same RNG 

Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-12-21 Thread Avijit Basak
Hi All

Please see my comments. Sorry for the delayed response.

>Several problems with this approach (raised in previous messages IIRC):
>1. Potential performance loss in sharing the same RNG instance.
-- As per my understanding ThreadLocalRandomSource creates separate
instances of UniformRandomProvider for each thread. So I am not sure how a
UniformRandomProvider instance is being shared. Please correct me if I am
wrong.
>2. Less/no flexibility (no user's choice of random source).
-- Agreed.
>3. Error-prone (user can access/reuse the "UniformRandomProvider"
>instances).

>Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but
>"light" usage of random number generation in a multi-threaded application;
>GAs make "heavy" use of RNG, thus it does not seem outlandish that all the
>RNG "clients" (e.g. every "operator") create their own instances.


>IMHO, a more important discussion would be about the expectations in a
>multithreaded context: E.g. should an operator be shareable by different
>threads?  And if not, how does the API help application developers to avoid
>such pitfalls?
-- Once we implement multi-threading in GA, the same crossover and mutation
operators will be reused across multiple threads. So even if we provide
the customization at the operator level we cannot avoid sharing.

>> My original implementation did not allow any customization of
>> RandomSource instances. There was a thought in review for customization
>> of RandomSource, so these options were considered. I don't think this
>> would make any difference to algorithm functionality.

>  Quite right.  But the customization can come at zero cost for the users
>  who don't need it. Admittedly it's a little more work on the part of the
>  developer(s) but it's a one off cost (and I'm fine working on that part
>  of the library once other, more important, things have been settled).

>> Even earlier I used Math.random()
>> which worked equally well. So my *vote* should be *against* this
>> customization.

>  Mine is against using "ThreadLocalRandomSource"...
-- What is the way out other than that? Please suggest.

>> I think first we need to decide on whether we really need this
>> customization and if yes then why. Then we can decide on alternate
>> implementation options.
>
>> >As per the recent updates of the math-related code bases, the
>> >public API should provide factory methods (constructors should
>> >be private).
>> -- private constructors will make public API classes non-extensible. This
>> will severely restrict the extensibility of this framework which I want
>> to avoid. I am not sure why we need to remove public constructors. It
>> would be helpful if you could refer me to any relevant discussion thread.

>  Allowing extensibility is a huge burden on library maintainers.  The
>  library must have been designed to support it; hence, you should
>  first describe what kind(s) of extensions (with usage examples) you
>  have in mind.
--The library should be extensible to support customization. Users should
be able to customise or provide their own implementation of genetic
operators for crossover and mutation. The chromosome classes should also be
open for extension. E.g. any developer should be able to extend the
IntegralChromosome class and define a child class which explicitly
specifies the range of integers to be used. I have initially implemented
the Binary chromosome and the corresponding binary mutation following the
same pattern. However, restricting extension of concrete classes by private
constructor does not prevent users from extending the abstract parent
classes.


Thanks & Regards
--Avijit Basak


On Tue, 30 Nov 2021 at 19:20, Gilles Sadowski  wrote:

> Hi.
>
> Le mar. 30 nov. 2021 à 06:40, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please see my comments:
> >
> > >The provider returned from ThreadLocalRandomSource.current(...) should
> > >only be used within a single method.
> > -- I missed the context of the thread in my previous mail. Sorry for the
> > previous communication. We can only cache the RandomSource's enum value and
> > reuse the same locally in other methods. According to the analysis, the
> > current implementation (in PR#199) with pre-configured RandomSource would
> > work correctly.
> > --CUT--
> > public final class RandomProviderManager {
> > /** The default RandomSource for random number generation. **/
> > private static RandomSource randomSource =
> > RandomSource.XO_RO_SHI_RO_128_PP;
> > /**
> >  * constructs the singleton instance.
> >

Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-11-29 Thread Avijit Basak
Hi All

Please see my comments:

>The provider returned from ThreadLocalRandomSource.current(...) should
>only be used within a single method.
-- I missed the context of the thread in my previous mail. Sorry for the
previous communication. We can only cache the RandomSource's enum value and
reuse the same locally in other methods. According to the analysis, the
current implementation (in PR#199) with pre-configured RandomSource would
work correctly.
--CUT--
public final class RandomProviderManager {
    /** The default RandomSource for random number generation. */
    private static RandomSource randomSource = RandomSource.XO_RO_SHI_RO_128_PP;

    /**
     * Constructs the singleton instance.
     */
    private RandomProviderManager() {}

    /**
     * Returns the (static) random generator.
     * @return the static random generator shared by GA implementation classes
     */
    public static UniformRandomProvider getRandomProvider() {
        return ThreadLocalRandomSource.current(RandomProviderManager.randomSource);
    }
}
--CUT--

@Alex Herbert , kindly share if you see any
challenge to this.
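
For illustration, here is a JDK-only sketch of the rule quoted at the top
(use the provider only within a single method), as I understand it. It
does not use Commons RNG: PerThreadRng is a hypothetical stand-in built on
java.lang.ThreadLocal and java.util.Random, and the operator classes are
invented for the example. It shows that a generator cached in a field
belongs to the thread that constructed the object, whereas a lookup inside
the method always yields the calling thread's generator.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

/** JDK-only stand-in for a per-thread generator registry (assumption). */
final class PerThreadRng {
    private static final ThreadLocal<Random> LOCAL = ThreadLocal.withInitial(Random::new);

    private PerThreadRng() {}

    static Random current() {
        return LOCAL.get();
    }
}

/** Problematic: the field pins the generator of the constructing thread. */
final class CachingOperator {
    private final Random cached = PerThreadRng.current();

    boolean usesCallersGenerator() {
        return cached == PerThreadRng.current();
    }
}

/** Safe: the generator is looked up inside the method that uses it. */
final class LookupOperator {
    boolean usesCallersGenerator() {
        Random rng = PerThreadRng.current();
        return rng == PerThreadRng.current();
    }
}

final class ThreadLocalDemo {
    private ThreadLocalDemo() {}

    /** Runs the check on a separate thread and returns its answer. */
    static boolean checkOnOtherThread(BooleanSupplier check) {
        boolean[] result = new boolean[1];
        Thread t = new Thread(() -> result[0] = check.getAsBoolean());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

A CachingOperator created on one thread and called from another ends up
using a generator it does not own; the RandomProviderManager above avoids
this because it performs the lookup on every call.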
My original implementation did not allow any customization of RandomSource
instances. There was a thought in review for customization of RandomSource,
so these options were considered. I don't think this would make any
difference to algorithm functionality. Even earlier I used Math.random()
which worked equally well. So my *vote* should be *against* this
customization.
I think first we need to decide on whether we really need this
customization and if yes then why. Then we can decide on alternate
implementation options.

>As per the recent updates of the math-related code bases, the
>public API should provide factory methods (constructors should
>be private).
-- private constructors will make public API classes non-extensible. This
will severely restrict the extensibility of this framework which I want to
avoid. I am not sure why we need to remove public constructors. It would be
helpful if you could refer me to any relevant discussion thread.


Thanks & Regards
--Avijit Basak


On Mon, 29 Nov 2021 at 23:47, Gilles Sadowski  wrote:

> Le lun. 29 nov. 2021 à 19:07, Alex Herbert  a
> écrit :
> >
> > Note that your examples have incorrect usage of ThreadLocalRandomSource:
>
> The detailed explanation confirms what I hinted at previously: We
> should not use "ThreadLocalRandomSource" from within the library
> because we can easily do otherwise (and just as transparently for
> the user).
>
> Gilles
>
> > [...]
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

-- 
Avijit Basak


Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-11-28 Thread Avijit Basak
Hi All

   Here is a sample use of two options.

*Option1*: Declaring factory interface in MutationPolicy, CrossoverPolicy
and SelectionPolicy. A sample implementation has been shown here for
MutationPolicy. Similar would be required for all other relevant interfaces
and implemented classes.

--CUT--

public interface MutationPolicy {
    Chromosome mutate(Chromosome original, double mutationRate);

    interface Factory {
        /**
         * Creates an instance with a dedicated source of randomness.
         *
         * @param rng RNG algorithm.
         * @param args Optional arguments (e.g. the seed).
         * @return an instance that must not be shared among threads.
         */
        MutationPolicy create(RandomSource rng, Object... args);

        default MutationPolicy create(RandomSource rng) {
            return create(rng, (Object[]) null);
        }
        default MutationPolicy create() {
            return create(RandomSource.SPLIT_MIX_64);
        }
    }
}

// Implementation class
public class IntegralValuedMutation implements MutationPolicy {

    private final UniformRandomProvider provider;

    private IntegralValuedMutation(RandomSource rng) {
        provider = ThreadLocalRandomSource.current(rng);
    }
    ...
    ...
    public static class MutationFactory implements Factory {
        private static final MutationFactory instance = new MutationFactory<>();
        private MutationFactory() {}

        @Override
        public MutationPolicy create(RandomSource rng, Object... args) {
            return new IntegralValuedMutation<>(args[0], args[1]);
        }
        public static MutationFactory getInstance() {
            return instance;
        }
    }
}

// Usage
MutationPolicy policy =
    IntegralValuedMutation.MutationFactory.getInstance().create();
--CUT--

*Option2*: An optional constructor argument can also be used as an
alternative solution.
--CUT--
public class IntegralValuedMutation implements MutationPolicy {
    private final UniformRandomProvider provider;

    public IntegralValuedMutation() {
        // DEFAULT is a chosen source.
        provider = ThreadLocalRandomSource.current(RandomSource.DEFAULT);
    }

    public IntegralValuedMutation(RandomSource rng) {
        provider = ThreadLocalRandomSource.current(rng);
    }
    ...
}
// Usage
MutationPolicy policy = new IntegralValuedMutation(rng);
--CUT--

Option2 looks to be much simpler regarding implementation and I would vote
for the same if we decide to allow customization of RandomSource.
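
As a rough, JDK-only illustration of what Option2 buys the user, the
sketch below injects the generator through an optional constructor
argument. The class name SketchIntegerMutation and its [min, max) range
are hypothetical, and java.util.Random is only a stand-in for the Commons
RNG types discussed here. The point is that a caller who supplies a seeded
generator gets reproducible mutations, while the default constructor keeps
the simple usage.

```java
import java.util.Random;

/** Hypothetical mutation operator; java.util.Random stands in for the RNG type. */
final class SketchIntegerMutation {
    private final Random rng;
    private final int min;
    private final int max;

    /** Default: the operator picks its own source of randomness. */
    SketchIntegerMutation(int min, int max) {
        this(min, max, new Random());
    }

    /** Optional argument: the caller supplies the source of randomness. */
    SketchIntegerMutation(int min, int max, Random rng) {
        if (min >= max) {
            throw new IllegalArgumentException("min must be less than max");
        }
        this.min = min;
        this.max = max;
        this.rng = rng;
    }

    /** Replaces each gene, with probability mutationRate, by a value in [min, max). */
    int[] mutate(int[] original, double mutationRate) {
        int[] mutated = original.clone();
        for (int i = 0; i < mutated.length; i++) {
            if (rng.nextDouble() < mutationRate) {
                mutated[i] = min + rng.nextInt(max - min);
            }
        }
        return mutated;
    }
}
```

Two operators built with new Random(42L) produce identical results for the
same input. Like the policies above, such an instance holds its own
generator and must not be shared among threads.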

Thanks & Regards
--Avijit Basak


On Mon, 22 Nov 2021 at 19:28, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 22 nov. 2021 à 13:49, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > I would like to request everyone to share their opinion regarding
> > use and customization of RNG functionality in the Genetic Algorithm
> > library.
> > In current design RNG functionality has been used internally by the
> > RandomProviderManager class. This class encapsulates a predefined instance
> > of RandomSource and utilizes the same for all random number generation
> > requirements. This makes the API cleaner and easy to use for users.
> > However, during the review an alternate thought has been proposed
> > related to customization of RandomSource by users. According to the new
> > proposal the users will be able to provide a RandomSource instance of their
> > choice to the crossover and mutation operators and other places like
> > ChromosomeRepresentationUtils. The drawback of this customization could be
> > increased complexity of the API.
>
> Please provide a usage example of both (showing that the alternative
> would actually increase the API complexity).
>
> Thanks,
> Gilles
>
> > We need to decide here whether we really need this kind of
> > customization by users and if yes the method of doing so. Here two
> options
> > have been proposed.
> > *Option1:*
> > ---CUT---
> > public interface MutationPolicy {
> > Chromosome mutate(Chromosome original, double mutationRate);
> >
> > interface Factory {
> > /**
> >  * Creates an instance with a dedicated source of randomness.
> >  *
> >  * @param rng RNG algorithm.
> >  * @param seed Seed.
> >  * @return an instance that must not be shared among
> > threads.
> >  */
> > MutationPolicy create(RandomSource rng, Object seed);
> >
> > default MutationPolicy create(RandomSource rng) {
> > return create(rng, null);
> > }
> > default MutationPolicy create() {
> > return create(RandomSource.SPLIT_MIX_64);
> > }
> >

Re: [MATH][GENETICS][PR#199] Design Decision of Chromosome hierarchy

2021-11-22 Thread Avijit Basak
Hi All

I have uploaded the image(*chromosome hierarchy.png*) in JIRA. Here
is the link. Let me know if anyone faces any issues.
https://issues.apache.org/jira/projects/MATH/issues/MATH-1563?filter=allopenissues

Thanks & Regards
--Avijit Basak

On Mon, 22 Nov 2021 at 19:39, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 22 nov. 2021 à 13:47, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > We need to make a decision on the chromosome hierarchy, proposed
> for commons-math-ga module.
> > Currently the hierarchy is designed as shown in the diagram below.
>
> Image has probably been stripped from your original message.
> Please upload it to JIRA and post a link here.
>
> Thanks,
> Gilles
>
> >
> >
> > Brief description:
> > 1) The chromosome hierarchy is based on its internal representation of
> Genotype.
> > 2) The phenotype of chromosomes is kept as a Generic parameter <P>.
> > 3) Decoder is introduced to convert Genotype to Phenotype.
> > 4) FitnessFunction is introduced to calculate Fitness of chromosomes.
> > 5) AbstractChromosome represents the chromosome abstraction for all
> genotypes.
> > 6) AbstractListChromosome has been introduced to represent the
> abstraction for List based Genotype.
> > 7) Any chromosome representing list based genotypes should extend
> AbstractListBasedChromosome.
> > 8) All other chromosomes should extend the AbstractChromosome class.
> > 9) BinaryChromosome(not committed) is introduced to represent binary
> genotypes and extends AbstractChromosome.
> >
> > Pros:
> > 1) This hierarchy maintains a separation of Genotype and Phenotype.
> > 2) Chromosome class with the same genotype can represent different
> phenotypes with different implementations of Decoders.
> > 3) Users will be able to use primitive types for higher dimensions by
> extending the AbstractChromosome class.
> > 4) Unlike the legacy model all concrete chromosomes are reusable with
> proper implementation of FitnessFunction and Decoder.
> > 5) Any custom list based genotypes can be implemented by extending
> AbstractListChromosome class.
> > 6) Internal genotype representations have been exposed which enabled the
> reuse of crossover and mutation operators.
> >
> > I would like to request everyone to review the design and reply
> in case of any concerns.
> >
> >
> > Thanks & Regards
> > -- Avijit Basak
>
>
>

-- 
Avijit Basak


[MATH][GENETICS][PR-199] Decision on the use of Logging functionality

2021-11-22 Thread Avijit Basak
Hi All

   We need to make a decision on usage of a logging framework. The
previous release does not have any implementation of a logging framework.
However, in the current implementation a slf4j logger has been introduced.
Please share if anyone has any concerns related to this.

Thanks & Regards
-- Avijit Basak


[MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization

2021-11-22 Thread Avijit Basak
Hi All

I would like to request everyone to share their opinion regarding
use and customization of RNG functionality in the Genetic Algorithm
library.
In the current design, RNG functionality is used internally through the
RandomProviderManager class. This class encapsulates a predefined
RandomSource and uses it for all random number generation. This makes the
API cleaner and easier to use.
However, during the review an alternative was proposed: customization of
the RandomSource by users. Under that proposal, users would be able to
provide a RandomSource of their choice to the crossover and mutation
operators and to other places like ChromosomeRepresentationUtils. The
drawback of this customization could be increased complexity of the API.
We need to decide whether we really need this kind of customization by
users and, if yes, how to provide it. Two options have been proposed.
*Option1:*
---CUT---
public interface MutationPolicy {
    Chromosome mutate(Chromosome original, double mutationRate);

    interface Factory {
        /**
         * Creates an instance with a dedicated source of randomness.
         *
         * @param rng RNG algorithm.
         * @param seed Seed.
         * @return an instance that must not be shared among threads.
         */
        MutationPolicy create(RandomSource rng, Object seed);

        default MutationPolicy create(RandomSource rng) {
            return create(rng, null);
        }
        default MutationPolicy create() {
            return create(RandomSource.SPLIT_MIX_64);
        }
    }
}
---CUT---

*Option 2:*
Use of an optional constructor argument for all crossover and mutation
operators. Users will be providing a RandomSource instance of their choice
or use the default one configured while instantiating the operators.

Thanks & Regards
-- Avijit Basak


[MATH][GENETICS][PR#199] Decision on retention of ASCII Art in Javadoc section

2021-11-22 Thread Avijit Basak
Hi All

I would like to inform everyone that there is some ASCII art in the
javadoc section in some classes like OnePointCrossover etc. This is taken
unaltered from the previous release of math library. We need to decide
whether we should keep them in the next release or remove them. I would
like to request everyone to share their opinion.

Thanks & Regards
-- Avijit Basak


[MATH][GENETICS][PR#199] Design Decision of Chromosome hierarchy

2021-11-22 Thread Avijit Basak
Hi All

We need to make a decision on the chromosome hierarchy, proposed
for *commons-math-ga* module.
Currently the hierarchy is designed as shown in the diagram below.

[image: image.png]
*Brief description:*
1) The chromosome hierarchy is based on its internal representation of
Genotype.
2) The phenotype of chromosomes is kept as a Generic parameter <P>.
3) Decoder is introduced to convert Genotype to Phenotype.
4) FitnessFunction is introduced to calculate Fitness of chromosomes.
5) AbstractChromosome represents the chromosome abstraction for all
genotypes.
6) AbstractListChromosome has been introduced to represent the abstraction
for list-based Genotype.
7) Any chromosome representing list-based genotypes should extend
AbstractListChromosome.
8) All other chromosomes should extend the AbstractChromosome class.
9) BinaryChromosome (not committed) is introduced to represent binary
genotypes and extends AbstractChromosome.

*Pros:*
1) This hierarchy maintains a separation of Genotype and Phenotype.
2) Chromosome class with the same genotype can represent different
phenotypes with different implementations of Decoders.
3) Users will be able to use primitive types for higher dimensions by
extending the AbstractChromosome class.
4) Unlike the legacy model all concrete chromosomes are reusable with
proper implementation of FitnessFunction and Decoder.
5) Any custom list based genotypes can be implemented by extending
AbstractListChromosome class.
6) Internal genotype representations have been exposed which enabled the
reuse of crossover and mutation operators.
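
The separation described above can be sketched in a few lines of JDK-only
Java. This is a simplified illustration, not the code of PR#199: the names
PhenoDecoder, PhenoFitness, BaseChromosome and ListChromosome are
hypothetical, and the real classes carry more state. It shows point 2 of
the pros: the same list genotype yields different phenotypes when paired
with different decoders.

```java
import java.util.List;

/** Converts a list genotype with genes of type T into a phenotype P. */
interface PhenoDecoder<T, P> {
    P decode(List<T> genotype);
}

/** Computes the fitness of a phenotype. */
interface PhenoFitness<P> {
    double compute(P phenotype);
}

/** Abstraction over all genotypes; only the phenotype type is visible here. */
abstract class BaseChromosome<P> {
    private final PhenoFitness<P> fitness;

    protected BaseChromosome(PhenoFitness<P> fitness) {
        this.fitness = fitness;
    }

    /** Subclasses know their genotype and how to decode it. */
    abstract P decode();

    double evaluate() {
        return fitness.compute(decode());
    }
}

/** Abstraction for list-based genotypes. */
class ListChromosome<T, P> extends BaseChromosome<P> {
    private final List<T> genotype;
    private final PhenoDecoder<T, P> decoder;

    ListChromosome(List<T> genotype,
                   PhenoDecoder<T, P> decoder,
                   PhenoFitness<P> fitness) {
        super(fitness);
        this.genotype = List.copyOf(genotype);
        this.decoder = decoder;
    }

    /** Exposed so that crossover and mutation operators can be reused. */
    List<T> getRepresentation() {
        return genotype;
    }

    @Override
    P decode() {
        return decoder.decode(genotype);
    }
}
```

For example, the genotype [1, 0, 1] can be decoded as a binary number by
one decoder and as a count of set bits by another, without touching the
chromosome class.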

I would like to request everyone to review the design and reply in
case of any concerns.


Thanks & Regards
-- Avijit Basak


Re: [MATH][GENETICS] Review of PR #197

2021-11-09 Thread Avijit Basak
Hi All

>Depending on released code (e.g. version 3.6.1 of Commons Math)
>is fine too for the "main" codes of the "examples" module.
>Setting up all the "policies" (mutation, crossover, ...) must be done
>anyways, by the user; the above factories just add an argument to
>be passed at instantiation.

>> Separate RandomSource for each place may be
>> redundant.

>I'm not sure what you mean by "redundant".
>Since the RNG instances are not thread-safe, either separate sources
>*must* be used or the synchronization must be handled separately (as
>done for the "ThreadLocalRandomSource" possibly with a performance
>loss).

>> I also believe the system should provide a default option for
>> RandomSource instead of completely depending on the choice of users.

>Why?
>This is a library, and the RNG should be viewed as an input (user's choice
>to be made at the application level).
>There are multiple problems (already noted previously):
>1. It hardcodes a specific "default" source (whereas the GA functionality
>is "agnostic" about which source is actually used).
>2. The RNG instance is shared among all the classes that need it (which
>makes access unnecessarily slower).
>3. The static field "randomSource" is mutable; this is looking for trouble
>(race condition).

-- The way I tried to design this, the GA functionality should be agnostic
to the RNG feature. If we accept the RandomSource as a parameter to the GA
operators, users will need to know about RNGs. Given the large number of
RandomSource options, I am not sure how many of them users will really
consider; most might simply go for the JDK source. This will steepen the
learning curve for users and add some complexity to the API. Customizing
any operator will require implementing the factory as well.
[IMHO] Users do not need to choose the RandomSource for each operator
separately. That is the reason I proposed a customization option in the
RandomNumberGenerator class itself. It is also true that this cannot
enforce that the configuration happens only once. If it is configured
inside a custom operator, that might result in race conditions once
we go for a parallel implementation.
[IMHO] We really don't need users to customize the RandomSource for each
and every operator or for the application. Can we stick to the previous
implementation and remove the configure(RandomSource rng) method of the
RandomNumberGenerator class? Kindly share your thoughts.

Thanks & Regards
--Avijit Basak

On Tue, 9 Nov 2021 at 00:11, Gilles Sadowski  wrote:

> Hello.
>
> Side-note: Please try to not remove quotes from previous emails,
> as it leads to a confusing sequence (e.g. things I wrote previously
> now appear below as quotes from your last message).
>
> Le dim. 7 nov. 2021 à 10:34, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> >  Please find my comments below:
> >
> >
> > [...]
> >
> > > >
> > > > (B)
> > > > I'm confused by your defining "legacy" packages in new modules...
> > > > What kind of comparisons are you considering?
> > > > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?)
> > > > regression testing; but please note that when your proposal is
> > > > merged, it will imply that the "legacy" codes *must* be removed.
> > > > [We don't want to keep duplicate functionality.]
> > > > -- The new implementation has improved the quality of optimization over
> > > > the legacy model.
> > >
> > > "Improved" in what sense?
> > > If you mean enhanced performance, such checks should be done
> > > using JMH (producing data to be published on the web site).
> > >
> > > --Along with performance and memory utilization, stochastic algorithms have
> > > another comparison parameter "quality of result". In stochastic algorithms,
> > > global optimum is not guaranteed. We have to compare the quality of the
> > > result along with performance and memory consumption to compare two
> > > algorithm implementations. I have kept the legacy example just for
> > > comparison between the new and the legacy implementations.
> >
> > Great that you take care of checking improvements on the quality
> > measures.  Just make sure that the new code does not depend on
> > anything in module "commons-math-legacy".  As noted earlier, a
> > dependency on a previous release of CM with
> >   <scope>test</scope>
> > is fine.
> >
> > --test scope can only

Re: [MATH][GENETICS] Review of PR #197

2021-11-07 Thread Avijit Basak
[...] thoughts regarding this.

--CUT--
/** The default RandomSource for random number generation. */
private static RandomSource randomSource = RandomSource.XO_RO_SHI_RO_128_PP;

/**
 * Sets the random source for this random generator.
 * @param randomSource
 */
public static void configure(RandomSource randomSource) {
    RandomNumberGenerator.randomSource = randomSource;
}

/**
 * Constructs the singleton instance.
 */
private RandomNumberGenerator() {
}

/**
 * Returns the (static) random generator.
 * @return the static random generator shared by GA implementation classes
 */
public static UniformRandomProvider getRandomGenerator() {
    return ThreadLocalRandomSource.current(RandomNumberGenerator.randomSource);
}
--CUT--

> >
> > * Class "ValidationUtils"
> > -> should not be public (or should be defined in an "internal" package).
> > --Changed
>
> The class actually provides no added value.
>
> --I was thinking of having a validation utility which can be reused
> everywhere. Otherwise I have to duplicate the code in all places. Do you
> think that is a good way of doing this?

Reuse is good, sure; but in this case, the very small gain (one line)
is not worth the loss of clarity (method call vs direct conditional test).

--Made changes

>> [...]
> >
> > (E) Unit tests
> > * src/test
> > 1. New tests should use Junit 5.
> > 2. "Example usages" probably belong in the "examples-ga" module.
> > --JUnit version is inherited from commons-math module.
>
> Not sure what you mean.
> It's possible to use Junit5 in new modules/test classes even if
> other classes use older Junit.
> --Do you think it is fine to have two separate versions of the JUnit library
> in CM? [IMHO] we should keep one version only.

In the mid-term, yes.  But the point is that we must start somewhere.
And as you are creating a new tests suite, it's worth using the up-to-date
framework.  [This will also reduce the burden on the people who'll take
on updating all the other tests.]

--Done

Thanks & Regards
--Avijit Basak

On Mon, 1 Nov 2021 at 21:09, Gilles Sadowski  wrote:

> Hello.
>
> Le lun. 1 nov. 2021 à 08:56, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please find my comments below:
> >
> > >
> > > Hi All
> > >
> > > I have fixed most of the review comments. The changes have been
> > > committed to PR#199.
> > >
> > > (A)
> > > Please "rebase" on "master".
> > > Please "squash" intermediate commits: For a new feature, a single commit
> > > should exist (that corresponds to the JIRA report describing it).
> > > --Will be done once all changes are finalized and committed.
> >
> > What is the rationale for not doing it right now?
> > The PR should always be "rebased" on the latest "master".
> >
> > --Done both rebase and squash.
>
> Thanks.
>
> The convention for the log summary (first line of the log message) is to
> put the issue number in front; thus, instead of
> ---CUT---
> Developed the new genetic algorithm module following the JIRA MATH-1563.
> ---CUT---
> it should be something like
> ---CUT---
> MATH-1563: Introducing new genetic algorithm module.
> ---CUT---
>
> > >
> > > (B)
> > > I'm confused by your defining "legacy" packages in new modules...
> > > What kind of comparisons are you considering?
> > > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?)
> > > regression testing; but please note that when your proposal is
> > > merged, it will imply that the "legacy" codes *must* be removed.
> > > [We don't want to keep duplicate functionality.]
> > > -- The new implementation has improved the quality of optimization over
> > > the legacy model.
> >
> > "Improved" in what sense?
> > If you mean enhanced performance, such checks should be done
> > using JMH (producing data to be published on the web site).
> >
> > --Along with performance and memory utilization, stochastic algorithms have
> > another comparison parameter "quality of result". In stochastic algorithms,
> > global optimum is not guaranteed. We have to compare the quality of the
> > result along with performance and memory consumption to compare two
> > algorithm implementations. I have kept the legacy example just for
> > comparison between the new and the legacy implementations.
>
> Great that you take care of

Re: [MATH][GENETICS] Review of PR #197

2021-11-01 Thread Avijit Basak
> [...] "NullPointerException") are necessary,
> there could be a factory for creating the appropriate instance.
> However, for "null" checks, please use the JDK utilities[2].
> --Moved to an internal package. Null checks have been modified too.
>
> >
> > * Class "ConvergenceListenerRegistry"
> > Shouldn't it be thread-safe?
> > -- Yes. We need this to be thread-safe for parallel multi-population
> > parallel genetic algorithms.
> --No change for the time being.
>
> (E) Unit tests
> * src/test
> 1. New tests should use Junit 5.
> 2. "Example usages" probably belong in the "examples-ga" module.
> --JUnit version is inherited from commons-math module.

Not sure what you mean.
It's possible to use Junit5 in new modules/test classes even if
other classes use older Junit.
--Do you think it is fine to have two separate versions of the JUnit library
in CM? [IMHO] we should keep one version only.

> --I could not understand what is meant by "Example usages" here. Which
> component is being referred to here.

I'm referring to a comment such as
---CUT---
// to test a stochastic algorithm is hard, so this will rather
be an usage
// example
---CUT---
(at line 72 in "GeneticAlgorithmTestBinary.java").
--Removed. This was an existing comment from previous release.
>
> (F) Code readability
> * Please write one argument per line.
> * Write one condition check per line.
> * Avoid comments with no added value (like "constructor" for a constructor).
> * Avoid "ASCII art" (see e.g. "OnePointCrossover"); a link[2] is often
> preferable.
> * Do not duplicate documentation (see e.g. "OnePointCrossover").
> --I have formatted the method declarations to have one parameter per line.
> --Most of the if statements have a single condition, except a very few
> pre-existing ones. I could not see any way to format the if statement in
> Eclipse as suggested. I cannot introduce any formatting rule which
> cannot be handled in Eclipse, as that would be very hard to manage.

?
[I can't imagine that Eclipse won't let you add a newline.]
--I searched a little bit but could not find anything relevant. However, I
found the following reference
https://stackoverflow.com/questions/31808237/formatting-if-else-in-eclipse
Putting parenthesis is not an option. Let me know if you find anything
relevant to this.

> --ASCII art and other crossover classes are untouched for this release.

What do you mean by "this release"?
In this instance, it is easy to make the docs clearer by using a link
rather than ASCII figures.  If you want to argue that the latter should
be kept, please start a new thread in order to collect other opinions.
--I can start making the changes.


Thanks & Regards
--Avijit Basak

On Sat, 30 Oct 2021 at 07:11, Gilles Sadowski  wrote:

> Le ven. 29 oct. 2021 à 17:00, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > I have fixed most of the review comments. The changes have been
> > committed to PR#199.
> >
> > (A)
> > Please "rebase" on "master".
> > Please "squash" intermediate commits: For a new feature, a single commit
> > should exist (that corresponds to the JIRA report describing it).
> > --Will be done once all changes are finalized and committed.
>
> What is the rationale for not doing it right now?
> The PR should always be "rebased" on the latest "master".
>
> >
> > (B)
> > I'm confused by your defining "legacy" packages in new modules...
> > What kind of comparisons are you considering?
> > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?)
> > regression testing; but please note that when your proposal is
> > merged, it will imply that the "legacy" codes *must* be removed.
> > [We don't want to keep duplicate functionality.]
> > -- The new implementation has improved the quality of optimization over
> > the legacy model.
>
> "Improved" in what sense?
> If you mean enhanced performance, such checks should be done
> using JMH (producing data to be published on the web site).
>
> > I have added the legacy packages to demonstrate the same.
> > Once we remove the genetics packages in the legacy module, the same will
> be
> > deleted from examples.
>
> I'm probably missing what exactly those "legacy" examples aim to
> demonstrate...
> In passing, what's the purpose of
>   Thread.sleep(5000)
> (at line 55 in file "TSPOptimizerLegacy")?
>
> >
> > (C)
> > File
> >
> "commons-math-examples/examples-ga/src/main/resou

Re: [MATH][GENETICS] Review of PR #197

2021-10-29 Thread Avijit Basak
[...]" vs "NullPointerException") are necessary,
there could be a factory for creating the appropriate instance.
However, for "null" checks, please use the JDK utilities[2].
--Moved to an internal package. Null checks have been modified too.
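
For reference, the kind of check meant by "JDK utilities" can be sketched
as below. SketchPopulation and its fields are hypothetical names used only
for the example; java.util.Objects.requireNonNull is the standard JDK
method. The null check is delegated to the JDK, and the range check is a
direct conditional test rather than a call into a shared validation class.

```java
import java.util.Objects;

/** Hypothetical class showing argument checks without a custom utility. */
final class SketchPopulation {
    private final String name;
    private final int limit;

    SketchPopulation(String name, int limit) {
        // JDK utility: throws NullPointerException with the given message.
        this.name = Objects.requireNonNull(name, "name");
        // Direct conditional test instead of a call into a validation helper.
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive: " + limit);
        }
        this.limit = limit;
    }

    String name() {
        return name;
    }

    int limit() {
        return limit;
    }
}
```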

>
> * Class "ConvergenceListenerRegistry"
> Shouldn't it be thread-safe?
> -- Yes. We need this to be thread-safe for multi-population
> parallel genetic algorithms.
--No change for the time being.
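
Should the registry be made thread-safe later, one possible approach is
sketched below. This is only an assumption about how it could be done, not
what the PR contains: SketchConvergenceListener and SketchListenerRegistry
are hypothetical names, and the sketch relies on the JDK's
CopyOnWriteArrayList, which suits a list that is read (notified) far more
often than it is modified.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener type; the real interface in the PR may differ. */
interface SketchConvergenceListener {
    void onGeneration(int generation, double bestFitness);
}

/** Registry whose listener list can be modified and iterated concurrently. */
final class SketchListenerRegistry {
    private final List<SketchConvergenceListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(SketchConvergenceListener listener) {
        listeners.add(Objects.requireNonNull(listener, "listener"));
    }

    int size() {
        return listeners.size();
    }

    /** Iterates over a snapshot, so concurrent registration cannot break the loop. */
    void publish(int generation, double bestFitness) {
        for (SketchConvergenceListener listener : listeners) {
            listener.onGeneration(generation, bestFitness);
        }
    }

    /** Registers the listener 'perThread' times from each of 'threads' threads. */
    static void registerConcurrently(SketchListenerRegistry registry,
                                     SketchConvergenceListener listener,
                                     int threads,
                                     int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    registry.addListener(listener);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }
}
```

The listeners themselves still have to be thread-safe if several
populations publish to them in parallel.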

(E) Unit tests
* src/test
1. New tests should use Junit 5.
2. "Example usages" probably belong in the "examples-ga" module.
--JUnit version is inherited from commons-math module.
--I could not understand what is meant by "Example usages" here. Which
component is being referred to here.


(F) Code readability
* Please write one argument per line.
* Write one condition check per line.
* Avoid comments with no added value (like "constructor" for a constructor).
* Avoid "ASCII art" (see e.g. "OnePointCrossover"); a link[2] is often
preferable.
* Do not duplicate documentation (see e.g. "OnePointCrossover").
--I have formatted the method declarations to have one parameter per line.
--Most of the if statements have a single condition, except a very few
pre-existing ones. I could not see any way to format the if statement in
Eclipse as suggested. I cannot introduce any formatting rule which
cannot be handled in Eclipse, as that would be very hard to manage.
--ASCII art and other crossover classes are untouched for this release.

(G)
Some files contain "tab" characters (e.g. "pom.xml").
--Removed tab characters.

Thanks & Regards
--Avijit Basak

On Thu, 21 Oct 2021 at 21:32, Gilles Sadowski  wrote:

> Le mer. 20 oct. 2021 à 08:47, Avijit Basak  a
> écrit :
> >
> > Hi
> >
> >  Thanks for the review comments. I have started making the
> changes.
> > However, I have some queries regarding some of comments as noted below:
>
> Some (partial) answers below.
>
> >
> > (B)
> > I'm confused by your defining "legacy" packages in new modules...
> > --This is kept for comparison purposes between the legacy and the new
> > implementation of GA.
>
> What kind of comparisons are you considering?
> It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?)
> regression testing; but please note that when your proposal is
> merged, it will imply that the "legacy" codes *must* be removed.
> [We don't want to keep duplicate functionality.]
>
> > (D) General design
> > Class "ConsoleLogger"
> > -> We should not reinvent the wheel.  We should consider whether logging
> > is necessary, and in the affirmative, depend on the de facto standard:
> > "slf4j".
> > -- I don't see any use of a logging framework in the math library.
>
> There is a long history of not wanting any kind of dependency.
> But this ship has sailed.
>
> > That is
> > the reason I introduced ConsoleLogger. If we introduce a logging framework
> > we won't need this class at all. I think we should include the logger in
> > the root(commons-math\) pom.xml file so that all modules should be able to
> > use this.
>
> Starting with the upcoming release, we can decide on a per-module
> basis.  Please make the case (in a new ML thread) for introducing
> such a dependency in the GA module.
>
> >
> > Class "Constants"
> > -> Any data should be declared where its purpose is obvious.
> > -- We can declare the constants where it belongs but this might introduce
> > duplicate constants across different classes and hence reduce
> reusability.
>
> The class does not mention where the data is used, nor why it is
> necessary that it be "public".
> By default, the leaner API (i.e. no unnecessary "public" components),
> the better (even if sometimes that would entail duplicating "private"
> data).
> [TBD on a case-by-case basis.]
>
> >
> > * Class "AbstractListChromosome" (and subclasses)
> > Didn't we conclude that this was a very wasteful implementation of the
> > "chromosome" concept?
> > -- I have some concerns regarding this. I am not much aware of any
> > discussion regarding this conclusion.
>
> Please search the ML archive; I seem to recall a detailed discussion
> where Alex gave hints on how a binary chromosome should be
> implemented.
>
> > Chromosomes are always conceptualized as collections of allele/genes. So
> We
> > need a collection of the *genotypes* anyway. Here List has been used as a
> > collection.
> > We need an abstraction for r

Re: [MATH][GENETICS] Review of PR #197

2021-10-20 Thread Avijit Basak
Hi

 Thanks for the review comments. I have started making the changes.
However, I have some queries regarding some of comments as noted below:

(B)
I'm confused by your defining "legacy" packages in new modules...
--This is kept for comparison purposes between the legacy and the new
implementation of GA.

(D) General design
Class "ConsoleLogger"
-> We should not reinvent the wheel.  We should consider whether logging
is necessary, and in the affirmative, depend on the de facto standard:
"slf4j".
-- I don't see any use of a logging framework in the math library. That is
the reason I introduced ConsoleLogger. If we introduce a logging framework
we won't need this class at all. I think we should include the logger in
the root(commons-math\) pom.xml file so that all modules should be able to
use this.

Class "Constants"
-> Any data should be declared where its purpose is obvious.
-- We can declare the constants where it belongs but this might introduce
duplicate constants across different classes and hence reduce reusability.

* Class "AbstractListChromosome" (and subclasses)
Didn't we conclude that this was a very wasteful implementation of the
"chromosome" concept?
-- I have some concerns regarding this. I am not aware of any discussion
that reached this conclusion.
Chromosomes are always conceptualized as collections of alleles/genes. So we
need a collection of the *genotypes* anyway. Here a List has been used as
the collection.
We need an abstraction for representing the collection of Genotype. All
crossover and mutation operators are based on this abstraction. This
enabled reuse of crossover and mutation operators for all chromosome types
which extend the abstraction. I am not sure how to achieve this reusability
without an abstraction.
Any domain specific new chromosome implementation extending the
AbstractListChromosome class can reuse all crossover and mutation operators.
For our proposed improvement of BinaryChromosome we should be able to
extend the AbstractChromosome (*not* AbstractListChromosome) for the new
class and provide the dedicated crossover and mutation operators for the
corresponding Genotype. Without an *explicit* abstraction, management of
crossover and mutation operators would be difficult.
Please share further thoughts regarding this.
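To make the alternative concrete, here is a minimal sketch of a binary chromosome that packs its genes into a long[] instead of a List, together with a dedicated one-point crossover. The class name PackedBinaryChromosome and the method onePointCrossover are hypothetical and are not taken from the PR; this only illustrates the storage idea, under the assumption that both parents have the same length.

```java
import java.util.Arrays;

/** Hypothetical sketch: binary chromosome with 64 genes packed per long. */
final class PackedBinaryChromosome {
    private final long[] words; // bit i lives in words[i / 64], position i % 64
    private final int length;   // number of genes actually in use

    PackedBinaryChromosome(long[] words, int length) {
        if (length < 0 || length > words.length * 64L) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        this.words = Arrays.copyOf(words, words.length); // defensive copy
        this.length = length;
    }

    int length() {
        return length;
    }

    boolean get(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index: " + i);
        }
        return ((words[i >>> 6] >>> (i & 63)) & 1L) != 0;
    }

    /** Child takes genes [0, point) from this parent and [point, length) from the other. */
    PackedBinaryChromosome onePointCrossover(PackedBinaryChromosome other, int point) {
        if (other.length != length || other.words.length != words.length) {
            throw new IllegalArgumentException("parents must have the same size");
        }
        if (point < 0 || point > length) {
            throw new IllegalArgumentException("point out of range: " + point);
        }
        long[] child = Arrays.copyOf(other.words, other.words.length);
        int fullWords = point >>> 6;
        System.arraycopy(words, 0, child, 0, fullWords); // whole words from this parent
        int rem = point & 63;
        if (rem != 0) {
            long mask = (1L << rem) - 1; // low 'rem' bits come from this parent
            child[fullWords] = (words[fullWords] & mask) | (other.words[fullWords] & ~mask);
        }
        return new PackedBinaryChromosome(child, length);
    }
}
```

With this layout the crossover copies whole words and masks at most one, rather than visiting each gene individually.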

* Class "GeneticException"
1. Should not be public (or should be defined in an "internal" package[1]).
2. If various types (that map to different JDK subclasses of
RuntimeException,
e.g. "IllegalArgumentException" vs "NullPointerException") are necessary,
there could be a factory for creating the appropriate instance.
However, for "null" checks, please use the JDK utilities[2].
-- As of now we are managing all exception types by single GeneticException
class. So there is no factory.
-- Using JDK utilities for NullPointer would repeat this code in all
places. Is it fine?
Objects.requireNonNull(object,
Message.format(GeneticException.NULL_ARGUMENT, args));
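One way to avoid repeating that call everywhere would be a small package-private helper that wraps the JDK utility. The sketch below is an assumption, not code from the PR: the class NullChecks, its method and the message pattern are hypothetical, and java.text.MessageFormat is used for the formatting. The Supplier overload of Objects.requireNonNull builds the message only when the check fails.

```java
import java.text.MessageFormat;
import java.util.Objects;

/** Hypothetical helper: one place for the null-check message pattern. */
final class NullChecks {
    /** Assumed pattern; the real constant would live wherever the module keeps its messages. */
    static final String NULL_ARGUMENT = "null is not allowed for argument \"{0}\"";

    private NullChecks() {
        // Utility class, not meant to be instantiated.
    }

    /** Returns the value unchanged, or throws NullPointerException with a formatted message. */
    static <T> T requireNonNull(T value, String name) {
        // The lambda is evaluated only if 'value' is null.
        return Objects.requireNonNull(value, () -> MessageFormat.format(NULL_ARGUMENT, name));
    }
}
```

A caller would then write something like `this.policy = NullChecks.requireNonNull(policy, "policy");` in a constructor.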

* Class "ConvergenceListenerRegistry"
Shouldn't it be thread-safe?
-- Yes. We need this to be thread-safe for multi-population parallel
genetic algorithms.
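As an illustration only, a registry could get thread safety from java.util.concurrent.CopyOnWriteArrayList, which suits the case where listeners are registered rarely and notified often. The names ConvergenceListener, onGeneration, ListenerRegistry and publish below are hypothetical and do not reflect the PR's actual API.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener type: receives the state after each generation. */
interface ConvergenceListener<P> {
    void onGeneration(int generation, P population);
}

/** Hypothetical registry that is safe to share between threads. */
final class ListenerRegistry<P> {
    // Writes copy the backing array; iteration works on a stable snapshot.
    private final List<ConvergenceListener<P>> listeners = new CopyOnWriteArrayList<>();

    void addListener(ConvergenceListener<P> listener) {
        listeners.add(Objects.requireNonNull(listener, "listener"));
    }

    void publish(int generation, P population) {
        for (ConvergenceListener<P> listener : listeners) {
            listener.onGeneration(generation, population);
        }
    }
}
```

Because the registry is an ordinary instance rather than a static singleton, each population (or each thread) could also be given its own registry, which sidesteps sharing altogether.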


Thanks & Regards
--Avijit Basak

On Mon, 18 Oct 2021 at 23:13, Gilles Sadowski  wrote:

> Hello.
>
> Sorry for the delay in reviewing.
>
> Le lun. 18 oct. 2021 à 09:35, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > I have created PR#197 as mentioned earlier. Kindly let me know if
> > there is any concern or comments.
> > I have created another *PR#199* consisting of the changes with
> > adaptive probability generations.
>
> Please find below my first remarks.
>
> (A)
> Please "rebase" on "master".
> Please "squash" intermediate commits: For a new feature, a single commit
> should exist (that corresponds to the JIRA report describing it).
>
> (B)
> I'm confused by your defining "legacy" packages in new modules...
>
> (C)
> File
> "commons-math-examples/examples-ga/src/main/resources/spotbugs/spotbugs-exclude-filter.xml"
> does not belong there.
>
> (D) General design
> Class "ConsoleLogger"
> -> We should not reinvent the wheel.  We should consider whether logging
> is necessary, and in the affirmative, depend on the de facto standard:
> "slf4j".
>
> Class "Constants"
> -> Any data should be declared where its purpose is obvious.
>
> * Class "RandomGenerator"
> 1. Duplicates functionality (storage of thread-local instances)
> readily available in "Commons RNG".
> 2. (IMHO) Thread-local instances should not be used for "heavy" usage
> (like in GA).
>
> * Class "ValidationUtils"
> -> should not

Re: [MATH][GENETICS] Review of PR #197

2021-10-18 Thread Avijit Basak
Hi All

I have created PR#197 as mentioned earlier. Kindly let me know if
there is any concern or comments.
I have created another *PR#199* consisting of the changes with
adaptive probability generations. Kindly review the same. The build has
failed due to a SpotBugs issue as mentioned below.

[ERROR] Medium: Public static
org.apache.commons.math4.ga.listener.ConvergenceListenerRegistry.getInstance()
may expose internal representation by returning
ConvergenceListenerRegistry.INSTANCE
[org.apache.commons.math4.ga.listener.ConvergenceListenerRegistry] At
ConvergenceListenerRegistry.java:[line 89] MS_EXPOSE_REP

However, the same code in my previous PR(#197) was
built successfully. I am not sure how to resolve this issue. Any help will
be appreciated.

Thanks & Regards
--Avijit Basak

On Mon, 27 Sept 2021 at 18:21, Avijit Basak  wrote:

> Hi All
>
>  I have created the *PR #197* consisting of changes for JIRA
> MATH-1563, Task MATH-1618. This is the primary work to standardize the
> design of GA module. The build has passed. I would like to request a review
> of the PR. Once the primary design is standardized I can check in further
> changes like introduction of adaptive model and data structure change for
> Binary chromosomes.
>  Kindly let me know for any concerns or queries.
>
> Thanks & Regards
> -- Avijit Basak
>


-- 
Avijit Basak


[MATH][GENETICS] Review of PR #197

2021-09-27 Thread Avijit Basak
Hi All

 I have created the *PR #197* consisting of changes for JIRA
MATH-1563, Task MATH-1618. This is the primary work to standardize the
design of GA module. The build has passed. I would like to request a review
of the PR. Once the primary design is standardized I can check in further
changes like introduction of adaptive model and data structure change for
Binary chromosomes.
 Kindly let me know for any concerns or queries.

Thanks & Regards
-- Avijit Basak


Re: [MATH][GENETICS] Build Issue with PR #197

2021-09-26 Thread Avijit Basak
Hi All

I have made all the changes.

Thanks & Regards
--Avijit Basak

On Sat, 25 Sept 2021 at 17:14, Gilles Sadowski  wrote:

> Hello.
>
> Le sam. 25 sept. 2021 à 08:06, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> >  I have created a PR (#197) to merge my changes for Genetic
> Algorithm. Although the build has passed locally for my components
> (commons-math4-genetics and example-genetics) the PR build failed with unit
> test case errors for legacy(commons-math4-legacy) modules.
>
> Please note that maven modules should be named "commons-math-Xxx";
> as Alex mentioned previously, you should remove the spurious "4" in
> the "genetics" module (and I'd personally favour "ga" over "genetics", as
> per a previous discussion).  Thus:
>commons-math-ga
> with top-level package
>   org.apache.commons.math4.ga
>
> > I have not changed anything in the legacy module. So not sure what is
> causing those issues.
>
> It is caused by a randomized test ("SimplexOptimizerTest", see the log[1]).
>
> > Could anyone kindly guide me how to pass the PR build in this scenario.
>
> You might try to resubmit the PR.
> [Anyways the build will be performed after you commit the changes
> indicated above.]
>
> Best regards,
> Gilles
>
> > The local build logs are attached herewith for reference.
> >
> > Local build command: mvn clean verify apache-rat:check checkstyle:check
> pmd:check spotbugs:check javadoc:javadoc
> > PR Link: https://github.com/apache/commons-math/pull/197
> >
> > Thanks & Regards
> > -- Avijit Basak
> >
>
> [1] https://app.travis-ci.com/github/apache/commons-math/builds/238471866
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

-- 
Avijit Basak


[MATH][GENETICS] Build Issue with PR #197

2021-09-25 Thread Avijit Basak
Hi All

 I have created a PR (#197) to merge my changes for Genetic
Algorithm. Although the build has passed locally for my components
(commons-math4-genetics and example-genetics) the PR build failed with unit
test case errors for legacy(commons-math4-legacy) modules. I have not
changed anything in the legacy module. So not sure what is causing those
issues. Could anyone kindly guide me how to pass the PR build in this
scenario. The local build logs are attached herewith for reference.

*Local build command:* mvn clean verify apache-rat:check checkstyle:check
pmd:check spotbugs:check javadoc:javadoc
*PR Link*: https://github.com/apache/commons-math/pull/197

Thanks & Regards
-- Avijit Basak


Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-09-06 Thread Avijit Basak
Hi All

  I have created a pull request for task *MATH-1618* belonging to
*Jira MATH-1563*. Kindly initiate the review. Please send a note if you
have any questions or concerns. Development is in progress for the next
task *MATH-1619*.

*URL:* https://github.com/apache/commons-math/pull/197

Thanks & Regards
--Avijit Basak

On Sun, 15 Aug 2021 at 23:17, Gilles Sadowski  wrote:

> Le dim. 15 août 2021 à 15:48, Avijit Basak  a
> écrit :
> >
> > Hi
> >
> > As mentioned earlier I need to use descriptive statistics in
> > *genetics* module as part of *math4* release. This will be required for
> > checking convergence status, probability generation. This can also be
> used
> > for streaming current population conditions to interested listeners.
> > Currently, we have a DescriptiveStatistics class as part of math4.legacy
> > module. Is there any plan to develop a new statistics module like
> neuralnet
> > and genetics?
>
> Not exactly:  Refactored statistics utilities should find a home in
> the new "Commons Statistics" component.[1]
>
> > If not what is the way to proceed forward. Kindly guide me in
> > this regard.
>
> There are several ways forward:
> 1. You contribute to start work on a "commons-statistics-descriptive"
> maven module in the component mentioned above. ["Commons Math"
> can depend on that component's modules.]
> 2. You make modifications to the GA functionality inside the current
> "o.a.c.m.legacy.genetics" package.  [I'd still advise that we define
> interfaces to whatever functionality (like descriptive statistics) should
> ultimately be implemented somewhere else.]
> 3. You create a new "commons-math-ga" module that does not depend
> on the "commons-math-legacy" module.  [That would imply creating an
> "internal" package (where you can copy anything you need) whose
> contents will not be part of the official API (i.e. users must not rely on
> it being stable across even minor releases).]
>
>
> Regards,
> Gilles
>
> [1] https://commons.apache.org/proper/commons-statistics
>
>
>

-- 
Avijit Basak


Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-08-15 Thread Avijit Basak
Hi

As mentioned earlier I need to use descriptive statistics in
*genetics* module as part of *math4* release. This will be required for
checking convergence status, probability generation. This can also be used
for streaming current population conditions to interested listeners.
Currently, we have a DescriptiveStatistics class as part of math4.legacy
module. Is there any plan to develop a new statistics module like neuralnet
and genetics? If not what is the way to proceed forward. Kindly guide me in
this regard.
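To show how little of a statistics library the convergence check itself needs, here is a minimal sketch that hides the statistics behind a small interface, so that the implementation could later be swapped for one from another component. The names FitnessStatistics, SimpleFitnessStatistics and UnchangedBestFitnessCondition are hypothetical, and the sketch assumes fitness values are plain doubles where larger is better.

```java
/** Hypothetical minimal view of the statistics a stopping condition needs. */
interface FitnessStatistics {
    double mean();
    double max();
}

/** Straightforward implementation over an array of fitness values. */
final class SimpleFitnessStatistics implements FitnessStatistics {
    private final double mean;
    private final double max;

    SimpleFitnessStatistics(double[] fitness) {
        if (fitness.length == 0) {
            throw new IllegalArgumentException("empty population");
        }
        double sum = 0;
        double best = Double.NEGATIVE_INFINITY;
        for (double f : fitness) {
            sum += f;
            best = Math.max(best, f);
        }
        this.mean = sum / fitness.length;
        this.max = best;
    }

    @Override
    public double mean() {
        return mean;
    }

    @Override
    public double max() {
        return max;
    }
}

/** Stops once the best fitness has stayed the same for a given number of generations. */
final class UnchangedBestFitnessCondition {
    private final int maxUnchangedGenerations;
    private double lastBest = Double.NaN; // NaN never compares equal, so the first call resets
    private int unchanged;

    UnchangedBestFitnessCondition(int maxUnchangedGenerations) {
        if (maxUnchangedGenerations < 1) {
            throw new IllegalArgumentException("must be at least 1");
        }
        this.maxUnchangedGenerations = maxUnchangedGenerations;
    }

    /** Call once per generation with that generation's statistics. */
    boolean isSatisfied(FitnessStatistics stats) {
        double best = stats.max();
        if (best == lastBest) {
            unchanged++;
        } else {
            lastBest = best;
            unchanged = 0;
        }
        return unchanged >= maxUnchangedGenerations;
    }
}
```

The condition depends only on the interface, so the GA code would not need to change if the statistics were later provided by a dedicated module.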

Thanks & Regards
--Avijit Basak

On Sun, 8 Aug 2021 at 19:14, Gilles Sadowski  wrote:

> Hello.
>
> Le dim. 8 août 2021 à 07:22, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > I have started to work in genetic module.
>
> Great!
>
> > I want to push the new
> > module as part of a new feature branch "*feature/MATH-1563*". Changes
> > include mostly the existing code and modifications due to the new Exception
> > class. I have encountered the following error which indicates my Github
> Id "
> > *avijitbasak*" is not permitted to check-in code in the repository.
>
> Indeed, not all GitHub users are allowed to modify an ASF's project
> repository. ;-)
>
> > Could
> > anyone kindly grant me access to the repository. Let me know if I need to
> > do anything else regarding this.
>
> Only ASF committers[1] are given write access.
> For a contributor who is not (yet) a committer, the (nowadays[2]) usual way
> to suggest changes is through GitHub pull requests (i.e. you have to
> "clone"
> the repository into your projects' GH space and modify there).
>
> Regards,
> Gilles
>
> [1] https://www.apache.org/foundation/how-it-works.html#roles
> [2] The alternative is uploading patches to the issue-tracking system:
> https://commons.apache.org/patches.html
>
> > [...]
>
>
>

-- 
Avijit Basak


Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-08-07 Thread Avijit Basak
Hi All

I have started to work in genetic module. I want to push the new
module as part of a new feature branch "*feature/MATH-1563*". Changes
include mostly the existing code and modifications due to the new Exception
class. I have encountered the following error which indicates my Github Id "
*avijitbasak*" is not permitted to check-in code in the repository. Could
anyone kindly grant me access to the repository. Let me know if I need to
do anything else regarding this.

git.exe push --progress "origin" feature/MATH-1563
remote: Permission to apache/commons-math.git denied to avijitbasak.
fatal: unable to access 'https://github.com/apache/commons-math.git/': The
requested URL returned error: 403

Thanks & Regards
--Avijit Basak

On Sun, 8 Aug 2021 at 10:49, Avijit Basak  wrote:

> Hi
>
>  I have created two new subtasks for Jira *MATH-1563* to explain
> the requirement of changes and a new JIRA MATH-1618
> <https://issues.apache.org/jira/browse/MATH-1618>.
>   Let me know if that helps. We can continue the discussion here
> in case of any queries.
>
> Thanks & Regards
> --Avijit Basak
>
> On Wed, 28 Jul 2021 at 23:22, Gilles Sadowski 
> wrote:
>
>> Hello.
>>
>> Le mer. 28 juil. 2021 à 10:23, Avijit Basak  a
>> écrit :
>> >
>> > Hi
>> >
>> > I shall try to describe my proposed changes with proper context
>> in
>> > my next communication. Regarding the stats, I need a library that can be
>> > used for any statistical calculation needed.
>>
>> Are the calculations needed for the GA to work (e.g. as part of a stopping
>> criterion), or are they only meant to inform the user (e.g. for computing
>> current average fitness and the like)?
>>
>> In the latter case, (IIUC) I don't think that we need to introduce such a
>> dependency: Couldn't "out-of-band" functionality be defined through a
>> plugin infrastructure?
>>
>> > I don't want to use the one
>> > from math3 legacy component as that will include all other legacy
>> > components too.
>>
>> If you intend to improve the "genetics" package within the current
>> "commons-math-legacy" module, you can use all the code in there,
>> (including the "o.a.c.math4.stat" package), although that will make it
>> more difficult to create a new module free of those dependencies.
>>
>> Please clarify what goal you are pursuing.
>>
>> > If the commons-statistics component is an isolated one then
>> > that can be re-used once released.
>>
>> I don't understand what you mean.
>>
>> > It will be nice to have a library for plotting graph. Earlier I
>> > used jFreeChart (Lesser GNU Public License), which works fine for this
>> kind
>> > of requirement. Any suggestion regarding this?
>>
>> If you suggest that a Commons component should depend on
>> a plotting library, it's likely "no go".
>> Would a GA implementation need this?
>> Again, if the purpose is to follow progress of the computation, we
>> should define appropriate interfaces to allow data collection in
>> real time.  How those are processed (e.g. plotting statistics of the
>> current population) is probably out-of-scope.
>>
>> Regards,
>> Gilles
>>
>> >
>> > Thanks & Regards
>> > --Avijit Basak
>> >
>> >
>> > On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski 
>> wrote:
>> >
>> > > Hello.
>> > >
>> > > Le mar. 27 juil. 2021 à 09:15, Avijit Basak 
>> a
>> > > écrit :
>> > > >
>> > > > Hi All
>> > > >
>> > > > Please find the proposed changes for the Genetic Algorithm
>> > > library in commons.maths.
>> > > > Changes in Model:
>> > > > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate
>> > > commons implementation in an Abstract class AbstractGeneticAlgorithm.
>> New
>> > > AdaptiveGeneticAlgorithm class has also been introduced.
>> > > > 2) Introduced Elitism interface which is implemented by
>> > > ElitisticListPopulation.
>> > > > 3) Interface Fitness has been removed.
>> > > > 4) Interface FitnessCalculator has been introduced.
>> > > > 5) Chromosome has been updated with FitnessCalculator interface and
>> > > accessor.
>> > > > 6) Operations in AbstractChromosome has be

Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-08-07 Thread Avijit Basak
Hi

 I have created two new subtasks for Jira *MATH-1563* to explain
the requirement of changes and a new JIRA MATH-1618
<https://issues.apache.org/jira/browse/MATH-1618>.
  Let me know if that helps. We can continue the discussion here in
case of any queries.

Thanks & Regards
--Avijit Basak

On Wed, 28 Jul 2021 at 23:22, Gilles Sadowski  wrote:

> Hello.
>
> Le mer. 28 juil. 2021 à 10:23, Avijit Basak  a
> écrit :
> >
> > Hi
> >
> > I shall try to describe my proposed changes with proper context
> in
> > my next communication. Regarding the stats, I need a library that can be
> > used for any statistical calculation needed.
>
> Are the calculations needed for the GA to work (e.g. as part of a stopping
> criterion), or are they only meant to inform the user (e.g. for computing
> current average fitness and the like)?
>
> In the latter case, (IIUC) I don't think that we need to introduce such a
> dependency: Couldn't "out-of-band" functionality be defined through a
> plugin infrastructure?
>
> > I don't want to use the one
> > from math3 legacy component as that will include all other legacy
> > components too.
>
> If you intend to improve the "genetics" package within the current
> "commons-math-legacy" module, you can use all the code in there,
> (including the "o.a.c.math4.stat" package), although that will make it
> more difficult to create a new module free of those dependencies.
>
> Please clarify what goal you are pursuing.
>
> > If the commons-statistics component is an isolated one then
> > that can be re-used once released.
>
> I don't understand what you mean.
>
> > It will be nice to have a library for plotting graph. Earlier I
> > used jFreeChart (Lesser GNU Public License), which works fine for this
> kind
> > of requirement. Any suggestion regarding this?
>
> If you suggest that a Commons component should depend on
> a plotting library, it's likely "no go".
> Would a GA implementation need this?
> Again, if the purpose is to follow progress of the computation, we
> should define appropriate interfaces to allow data collection in
> real time.  How those are processed (e.g. plotting statistics of the
> current population) is probably out-of-scope.
>
> Regards,
> Gilles
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> >
> > On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski 
> wrote:
> >
> > > Hello.
> > >
> > > Le mar. 27 juil. 2021 à 09:15, Avijit Basak  a
> > > écrit :
> > > >
> > > > Hi All
> > > >
> > > > Please find the proposed changes for the Genetic Algorithm
> > > library in commons.maths.
> > > > Changes in Model:
> > > > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate
> > > commons implementation in an Abstract class AbstractGeneticAlgorithm.
> New
> > > AdaptiveGeneticAlgorithm class has also been introduced.
> > > > 2) Introduced Elitism interface which is implemented by
> > > ElitisticListPopulation.
> > > > 3) Interface Fitness has been removed.
> > > > 4) Interface FitnessCalculator has been introduced.
> > > > 5) Chromosome has been updated with FitnessCalculator interface and
> > > accessor.
> > > > 6) Operations in AbstractChromosome has been updated with
> > > FitnessCalculator as interface.
> > > > 7) New BinaryChromosome class has been added.
> > > > 8) Interface PermutationChromosome has been replaced by
> > > IndirectlyEncodedChromosome as the interface primarily represents
> > > chromosomes with indirect encoding. A more appropriate name can be
> > > suggested.
> > > > 9) RandomKey class operations have been updated with
> FitnessCalculator.
> > > > 10) I would like to include a new class PermutationChromosome as we
> have
> > > corresponding crossover operators like OrderedCrossover.
> > > > 11) crossover method in CrossoverPolicy interface has been updated
> with
> > > additional argument probability to support dynamic probability
> generation.
> > > This would impact all implementation classes.
> > > > 12) mutate method in MutationPolicy has been added another argument
> > > probability to support dynamic probability generation. This would
> impact
> > > all implementation classes.
> > > > 13) Two new evolution StoppingCondition has been added
> > > UnchangedAvgFitness and UnchangedBestFitness.

Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-07-28 Thread Avijit Basak
Hi

I shall try to describe my proposed changes with proper context in
my next communication. Regarding the stats, I need a library that can be
used for any statistical calculation needed. I don't want to use the one
from math3 legacy component as that will include all other legacy
components too. If the commons-statistics component is an isolated one then
that can be re-used once released.
It will be nice to have a library for plotting graph. Earlier I
used jFreeChart (Lesser GNU Public License), which works fine for this kind
of requirement. Any suggestion regarding this?

Thanks & Regards
--Avijit Basak


On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski  wrote:

> Hello.
>
> Le mar. 27 juil. 2021 à 09:15, Avijit Basak  a
> écrit :
> >
> > Hi All
> >
> > Please find the proposed changes for the Genetic Algorithm
> library in commons.maths.
> > Changes in Model:
> > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate
> commons implementation in an Abstract class AbstractGeneticAlgorithm. New
> AdaptiveGeneticAlgorithm class has also been introduced.
> > 2) Introduced Elitism interface which is implemented by
> ElitisticListPopulation.
> > 3) Interface Fitness has been removed.
> > 4) Interface FitnessCalculator has been introduced.
> > 5) Chromosome has been updated with FitnessCalculator interface and
> accessor.
> > 6) Operations in AbstractChromosome has been updated with
> FitnessCalculator as interface.
> > 7) New BinaryChromosome class has been added.
> > 8) Interface PermutationChromosome has been replaced by
> IndirectlyEncodedChromosome as the interface primarily represents
> chromosomes with indirect encoding. A more appropriate name can be
> suggested.
> > 9) RandomKey class operations have been updated with FitnessCalculator.
> > 10) I would like to include a new class PermutationChromosome as we have
> corresponding crossover operators like OrderedCrossover.
> > 11) crossover method in CrossoverPolicy interface has been updated with
> additional argument probability to support dynamic probability generation.
> This would impact all implementation classes.
> > 12) mutate method in MutationPolicy has been added another argument
> probability to support dynamic probability generation. This would impact
> all implementation classes.
> > 13) Two new evolution StoppingCondition has been added
> UnchangedAvgFitness and UnchangedBestFitness.
> > 14) An interface ProbabilityGenerator has been introduced with few
> selective implementations to be used by AdaptiveGeneticAlgorithm class. The
> signature of the probability generation method has been kept generic to
> keep strategies interchangeable.
>
> I'd have a hard time commenting as we mostly miss the context: AFAIK,
> nobody here has ever used CM's GA implementation and nobody knows
> how its design structure should be changed in order to improve its
>  * usability,
>  * performance,
>  * robustness,
>  * extensibility, or
>  * maintenance;
> hence the listing of changes is not very useful without some hint as to why
> things are to be modified, removed or added (e.g. pointing to shortcomings,
> missing features, performance bottlenecks, and so on; and create a JIRA
> report for each of them).
> Actually, I understand that it might be a tedious task, and probably not
> worth
> the modest feedback which you may expect in return.  So the best course of
> action is perhaps to implement the new design as you see fit, and then show
> (through applications in "examples" module) how it solves selected
> problems.
>
> Doing so, you could keep us informed of your progress through commenting
> in the appropriate JIRA report(s) and a link to an up-to-date PR.
>
> >   I have few more queries related to repository structure.
> > 1) Do we need to keep package name as math4 and not math. Using a
> version-independent name would ease version migration for developers for
> future releases.
>
> Commons has a strict policy of backwards compatibility of minor releases.
> Changing the top-level package's name is done in every major release in
> order to avoid JAR hell.
>
> > 2) Can we have the stat module out of legacy component.
>
> Are you on to fix all the reported issues?
>
> > This can be useful to calculate population statistics if required.
>
> You are certainly welcome to refactor the parts of the "o.a.c.m.stat"
> package which would be of interest for that purpose.
> Please note that redesigned statistical functionalities should be ported
> to the "Commons Statistics" component.[1]
>
> Regards,
> Gilles
>
> [1] https://commons.apache.org/proper/commons-statistics/
>
>
>

-- 
Avijit Basak


[MATH][DESIGN] Design Discussion for Genetic Algorithm Library

2021-07-27 Thread Avijit Basak
Hi All

Please find the proposed changes for the Genetic Algorithm library
in commons.maths.
Changes in Model:
1) The GeneticAlgorithm class is broken into a hierarchy to accommodate the
common implementation in an abstract class AbstractGeneticAlgorithm. A new
AdaptiveGeneticAlgorithm class has also been introduced.
2) Introduced Elitism interface which is implemented by
ElitisticListPopulation.
3) Interface Fitness has been removed.
4) Interface FitnessCalculator has been introduced.
5) Chromosome has been updated with FitnessCalculator interface and
accessor.
6) Operations in AbstractChromosome have been updated with FitnessCalculator
as interface.
7) New BinaryChromosome class has been added.
8) Interface PermutationChromosome has been replaced by
IndirectlyEncodedChromosome as the interface primarily represents
chromosomes with indirect encoding. A more appropriate name can be
suggested.
9) RandomKey class operations have been updated with FitnessCalculator.
10) I would like to include a new class PermutationChromosome as we have
corresponding crossover operators like OrderedCrossover.
11) The crossover method in the CrossoverPolicy interface has been updated
with an additional argument, probability, to support dynamic probability
generation. This would impact all implementation classes.
12) The mutate method in MutationPolicy has been given an additional
argument, probability, to support dynamic probability generation. This would
impact all implementation classes.
13) Two new evolution StoppingCondition implementations have been added:
UnchangedAvgFitness and UnchangedBestFitness.
14) An interface ProbabilityGenerator has been introduced with a few
selected implementations to be used by the AdaptiveGeneticAlgorithm class. The
signature of the probability generation method has been kept generic to
keep strategies interchangeable.
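Items 11, 12 and 14 can be pictured with the following minimal sketch. The signatures are assumptions made for illustration and are not the ones in the proposed code: ProbabilityGenerator.generate(generation, maxGenerations), the LinearDecayProbability strategy and the BitFlipMutation operator on boolean[] are all hypothetical, and the random source is injected as a DoubleSupplier returning values in [0, 1).

```java
import java.util.function.DoubleSupplier;

/** Hypothetical strategy: yields the operator probability for a given generation. */
interface ProbabilityGenerator {
    double generate(int generation, int maxGenerations);
}

/** Hypothetical mutation contract with the probability passed per call. */
interface MutationPolicy<C> {
    C mutate(C original, double probability);
}

/** Probability moves linearly from 'start' to 'end' over the run. */
final class LinearDecayProbability implements ProbabilityGenerator {
    private final double start;
    private final double end;

    LinearDecayProbability(double start, double end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public double generate(int generation, int maxGenerations) {
        if (maxGenerations <= 0) {
            throw new IllegalArgumentException("maxGenerations must be positive");
        }
        // Clamp progress to [0, 1] so out-of-range generations stay bounded.
        double t = Math.min(1.0, Math.max(0.0, (double) generation / maxGenerations));
        return start + t * (end - start);
    }
}

/** Flips each gene independently with the supplied probability. */
final class BitFlipMutation implements MutationPolicy<boolean[]> {
    private final DoubleSupplier uniform; // assumed to return values in [0, 1)

    BitFlipMutation(DoubleSupplier uniform) {
        this.uniform = uniform;
    }

    @Override
    public boolean[] mutate(boolean[] original, double probability) {
        boolean[] copy = original.clone(); // the parent is left untouched
        for (int i = 0; i < copy.length; i++) {
            if (uniform.getAsDouble() < probability) {
                copy[i] = !copy[i];
            }
        }
        return copy;
    }
}
```

An adaptive algorithm would call the generator once per generation and pass the result to the operators, so swapping the probability strategy does not require touching the operators themselves.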

  I have a few more queries related to the repository structure.
1) Do we need to keep the package name as *math4* rather than *math*? Using a
version-independent name would ease version migration for developers for
future releases.
2) Can we move the stat module out of the legacy component? This can be useful
to calculate population statistics if required.

Kindly share your thoughts.

Thanks & Regards
--Avijit Basak


Re: The case for a Commons component

2021-05-14 Thread Avijit Basak
Hi All

This has been a long mail thread. It will be really helpful if
anyone can summarize the decisions.
Is the proposal of developing the new machine learning component
approved?
If the team repository is not provided is there any way to go ahead?
Waiting for a response.

Thanks & Regards
--Avijit Basak

On Fri, 7 May 2021 at 02:26, sebb  wrote:

> On Thu, 6 May 2021 at 21:13, Gary Gregory  wrote:
> >
> > It is true that there is much less friction these days to get a repository
> > going with GitHub, GitLab, and BitBucket, but, for now, the Commons
> Sandbox
> > is still available. If we want to do away with the sandbox, then let's
> > talk about that separately.
> >
>
> There is no need for a Sandbox component to use SVN, and it's easy to
> create a new Commons git repo.
>
> A non-ASF code repo would require code to be checked for license
> compliance etc before it could become a Commons component.
> A Sandbox component does not require that.
>
> > Gary
> >
> > On Thu, May 6, 2021, 11:26 Ralph Goers 
> wrote:
> >
> > >
> > >
> > > > On May 6, 2021, at 8:06 AM, Gary Gregory 
> wrote:
> > > >
> > > > What about the Commons Sandbox? Would that be a good place to start?
> > > >
> > >
> > > Emmanuel just sort of proposed doing away with it. As he put it, anyone
> > > can create a
> > > GitHub repo so why does it need to be under the apache user.  He hasn’t
> > > formally
> > > made a proposal for that and I’m not sure how I would vote on it if he
> > > did. He does
> > > have a point. At the same time I’m not sure I’d close off doing
> > > experimental or
> > > early development within the ASF space.
> > >
> > > Ralph
> > >
> > >
> > >
> > >
> > >
>
>
>

-- 
Avijit Basak


Re: [Vote] Create repository for "machine learning" algorithms.

2021-05-03 Thread Avijit Basak
Hi Gilles

   Thanks for voting on my behalf.

Regards
--Avijit Basak

On Mon, 3 May 2021 at 18:14, Gilles Sadowski  wrote:

> Recording a vote in the proper thread on behalf of Avijit Basak (who
> inadequately posted his vote in two other threads).
>
> On Wed, 21 Apr 2021 at 19:05, Gilles Sadowski wrote:
> >
> > [...]
> >
> > Name of component: "Commons Machine Learning"
> > Name of "git" repository: "commons-machinelearning"
> > Top-level package name: "org.apache.commons.machinelearning"
> >
> > [...]
> >
> >
> > Please vote:
>  [X] Yes.
> >   [ ] No, because ...
>
> Gilles (on behalf of Avijit Basak)
>
>
>

-- 
Avijit Basak


Re: The case for a Commons component

2021-05-03 Thread Avijit Basak
Hi

  I would like to vote for *commons-ml*.

Thanks & Regards
--Avijit Basak

On Mon, 3 May 2021 at 04:29, Gilles Sadowski  wrote:

> Hi.
>
> > [... Discussion about GA data-structures...]
>
> I'd suggest that we finalize the [Vote] before getting into the
> details...
>
> Currently, there have been votes by:
>   Emmanuel Bourg (-1)
>   Sebastian Bazley (-0)
>   Ralph Goers (+0)
>   Paul King (+1)
>
> So currently, the discussion should be focused on settling to the
> issues put forward by the opponents to having this new component:
>   * Problem 1: Functionality should go somewhere else (Emmanuel, Sebb)
>   * Problem 2: Who will contribute? (Ralph)
>
> Partial answers have been given.
> We need more opinions (and votes).
>
> Regards,
> Gilles
>
>
>

-- 
Avijit Basak


Re: The case for a Commons component

2021-05-02 Thread Avijit Basak
Hi

>>Note: You cannot easily just use java.util.BitSet as you wish to have
access to the underlying long[] to store the chromosome to enable efficient
crossover.
--Thanks for pointing this out. However, I considered a few constraints
while doing the implementation.
 1) I extended the existing class AbstractListChromosome, which
requires a generic type. This is the reason for using a list of Long.
However, I can extend Chromosome instead and use an array of primitive long.
BitSet also uses a similar data structure.
 2) One problem with BitSet is that it uses the MSB to store bits. As a
result, we won't be able to use the static utility methods of the wrapper
class (Long) for conversion between the primitive type and string. We would
have to write custom code for conversion between string and integral types.
This is the only reason I have used a BLOCKSIZE of 63 instead of 64.
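A hedged aside on point 2): since Java 8, the JDK's Long class can
round-trip a full 64-bit block through a binary string, using
Long.parseUnsignedLong(String, int) and Long.toBinaryString(long), so the
sign bit may not need to be given up. The sketch below only illustrates
that; the class and method names (BitBlockCodec, parseBlock, formatBlock)
are hypothetical and are not part of the implementation under discussion.

```java
final class BitBlockCodec {
    private BitBlockCodec() {}

    /** Parses up to 64 '0'/'1' characters into one long block (bit 63 included). */
    static long parseBlock(String bits) {
        // Unsigned parsing accepts values that set the most significant bit.
        return Long.parseUnsignedLong(bits, 2);
    }

    /** Formats a block as exactly 64 binary digits, left-padded with zeros. */
    static String formatBlock(long block) {
        // toBinaryString treats the argument as unsigned and omits leading zeros.
        String s = Long.toBinaryString(block);
        StringBuilder sb = new StringBuilder(64);
        for (int i = s.length(); i < 64; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        String allOnes = formatBlock(-1L);            // every bit set, MSB included
        System.out.println(allOnes.length());
        System.out.println(parseBlock(allOnes) == -1L);
    }
}
```

Whether a 64-bit block is worth the change depends on how the rest of the
chromosome code treats the sign bit; this only shows that the conversion
itself does not force a 63-bit block.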
>>// This is not actually required...
// int bit = cross & 63; // i.e. cross % 64
--Do you mean that the bit index does not need to be calculated? How can we
handle crossover indexes which are not multiples of 64?
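One possible way to handle crossover points that are not multiples of 64,
sketched under the assumption that the chromosome is stored LSB-first in a
long[] (bit i lives in word i / 64 at position i % 64). The names
LongArrayCrossover and onePoint are illustrative only and are not taken
from the code under review:

```java
final class LongArrayCrossover {
    private LongArrayCrossover() {}

    /**
     * One-point crossover on bit strings packed into long[] words.
     * Child 0 keeps bits [0, cross) of a and takes the rest from b;
     * child 1 is the mirror image.
     */
    static long[][] onePoint(long[] a, long[] b, int cross) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (cross < 0 || cross > a.length * 64) {
            throw new IllegalArgumentException("crossover point out of range");
        }
        long[] c0 = a.clone();
        long[] c1 = b.clone();
        int word = cross >>> 6;   // index of the word containing the crossover point
        int bit = cross & 63;     // offset inside that word, i.e. cross % 64

        // Words entirely above the crossover point are swapped wholesale.
        int firstFull = (bit == 0) ? word : word + 1;
        for (int i = firstFull; i < a.length; i++) {
            c0[i] = b[i];
            c1[i] = a[i];
        }
        // The word that straddles the crossover point is mixed with a mask.
        if (bit != 0) {
            long low = (1L << bit) - 1;   // ones in the positions below the point
            c0[word] = (a[word] & low) | (b[word] & ~low);
            c1[word] = (b[word] & low) | (a[word] & ~low);
        }
        return new long[][] {c0, c1};
    }
}
```

With this layout only one word ever needs masking; every other word is
either kept or copied as a whole, which is where the efficiency over a
List of Long comes from.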
>> Do you think that allele sets other than binary would be useful to
implement? [IIUC your document above, it seems not (?).]
--The document only describes the data structure for the binary
genotype. We already have an implementation of the RandomKey genotype in
Commons. We can think of adding other genotypes gradually.


Thanks & Regards
--Avijit Basak



On Sat, 1 May 2021 at 22:18, Gilles Sadowski  wrote:

> On Fri, 30 Apr 2021 at 17:40, Avijit Basak wrote:
> >
> > Hi
> >
> >  >>lot of spurious references to "Commons Numbers"
> >  --I have only created the basic project structure. Changes
> > need to be made. Can anyone from the existing commons team help in doing
> > this.
>
> Well, you should "search and replace":
>   "Numbers" -> "Machine Learning"
>   commons-numbers -> commons-machinelearning
>
> Other things (repository URL, JIRA project name and URL) require that
> a component be created (vote is pending).
> [As long as those files are not part of a PR, it is not urgent to fix
> them.]
>
> >  >> For sure, populate it with the code extracted from CM's
> > "genetics"
> > package and proceed with the enhancements.
> > At first, I'd suggest to refactor the layout of the package (i.e. create
> > a "subpackage" for each component of a genetic algorithm).
> >   -- I am working on it.
>
> Great!
>
> > Did not commit the code till now.
>
> OK.  When you do, please ask for review on the "dev" ML.
>
> >   >>  Then some examination of the data-structures is required (a
> > binary chromosome is currently stored as a "List").
> >   -- I have recently done some work on this. Could you please
> > check this article and share your thought.
> >   "*https://arxiv.org/abs/2103.04751
> > <https://arxiv.org/abs/2103.04751>*"
>
> Alex already provided a thorough response.
> It's a pity that JDK's BitSet is missing a few methods (e.g. "append")
> for a readily usable implementation of a "binary chromosome".
>
> Do you think that allele sets other than binary would be useful to
> implement? [IIUC your document above, it seems not (?).]
>
> >   Are we thinking to use Spark for our parallelism
>
> No, if the code is to reside in Commons.
>
> > or a simple
> > multi-threading of Java.
>
> Yes, we'd depend only on JDK classes.
>
> > I would prefer to use java multi-threading and
> > avoid any other framework.
> >   In java we don't have any library which can be used for AI/ML
> > programming with a very minimal learning curve. Can we think of
> fulfilling
> > this need?
>
> That would be nice. Don't hesitate to enlist fellow programmers. :-)
>
> Regards,
> Gilles
>
> >   This will be helpful for many java developers to venture into
> > AI/ML without learning a new language like Python.
> >
> >
> >>> [...]
>
>
>

-- 
Avijit Basak


Re: [Vote] Create a "machine learning" component

2021-04-30 Thread Avijit Basak
Hi

 I would like to vote for *commons-ml*.

Thanks & Regards
--Avijit Basak

On Sat, 24 Apr 2021 at 08:12, Paul King  wrote:

> I added some more comments relevant to if the proposed algorithm
> belongs somewhere in the commons "math" area back in the Jira:
>
> https://issues.apache.org/jira/browse/MATH-1563
>
> Cheers, Paul.
>
> On Wed, Apr 21, 2021 at 7:26 PM Gilles Sadowski 
> wrote:
> >
> > On Wed, 21 Apr 2021 at 08:56, Paul King wrote:
> > >
> > > On Wed, Apr 21, 2021 at 4:12 PM Ralph Goers <
> ralph.go...@dslextreme.com> wrote:
> > > >
> > > > Why are y’all having a long discussion on Vote thread?
> >
> > Paul King's comments are interesting information that could
> > bear on people's decision on the proposal (especially the
> > licence issue).
> > As for the question of whether the purported functionality would
> > find a better home elsewhere with the ASF, I'm sure what would
> > be the conclusion (apart from Avijit Basak's plain preference (?) to
> > develop a standalone component, as per Commons' requirement).
> >
> > >
> > > Fair enough. I am +1 (non-binding).
> >
> > So currently, IIRC the tally (on creating a dedicated component) is
> >   Gilles Sadowski +1
> >   Avijit Basak +1
> >   Paul King +1
> > And several -1 on the initially suggested name; but the proposed
> > name has been changed early on to "commons-machinelearning"
> > (in order to comply with Commons' tradition of full words and
> > descriptive names).
> > [Please correct if it doesn't reflect what has been expressed.]
> >
> > Where does that lead us?
> >
> > Regards,
> > Gilles
> >
> > >>> [...]
> >
> >
>
>
>

-- 
Avijit Basak


Re: The case for a Commons component

2021-04-30 Thread Avijit Basak
Hi

 >>lot of spurious references to "Commons Numbers"
 --I have only created the basic project structure. Changes
need to be made. Can anyone from the existing Commons team help with
this?
 >> For sure, populate it with the code extracted from CM's "genetics"
package and proceed with the enhancements.
At first, I'd suggest to refactor the layout of the package (i.e. create
a "subpackage" for each component of a genetic algorithm).
  -- I am working on it. I have not committed the code yet.
  >>  Then some examination of the data-structures is required (a
binary chromosome is currently stored as a "List").
  -- I have recently done some work on this. Could you please
check this article and share your thoughts?
  https://arxiv.org/abs/2103.04751

  Are we thinking of using Spark for our parallelism, or simple Java
multi-threading? I would prefer to use Java multi-threading and
avoid any other framework.
  In Java we don't have any library which can be used for AI/ML
programming with a very minimal learning curve. Can we think of fulfilling
this need?
  This would help many Java developers venture into
AI/ML without learning a new language like Python.


Thanks & Regards
--Avijit Basak

On Wed, 28 Apr 2021 at 18:48, Gilles Sadowski  wrote:

> On Mon, 26 Apr 2021 at 16:18, Avijit Basak wrote:
> >
> > Hi
> >
> > As per previous discussions, I have created a temporary
> repository
> > in GitHub under my personal GitHub Id(avijitbasak). The artifacts have
> been
> > copied from commons-numbers. A preliminary structure has been created for
> > the proposed component.
> > Please let me know if we want to proceed with this format.
>
> There is no source code (and a lot of spurious references to
> "Commons Numbers").
> For sure, populate it with the code extracted from CM's "genetics"
> package and proceed with the enhancements.
> At first, I'd suggest to refactor the layout of the package (i.e. create
> a "subpackage" for each component of a genetic algorithm).
> Then some examination of the data-structures is required (a binary
> chromosome is currently stored as a "List").
> Shouldn't the whole design be revised (based on interfaces and
> streams)?
>
> > We can copy the
> > same to any other team repository if required.
>
> That would be a repository on an ASF server, once the pending vote
> process is completed.  [By the way: You didn't vote...]
>
> Regards,
> Gilles
>
> >> [...]
>
>
>

-- 
Avijit Basak


Re: The case for a Commons component

2021-04-26 Thread Avijit Basak
Hi

As per previous discussions, I have created a temporary repository
on GitHub under my personal GitHub ID (avijitbasak). The artifacts have been
copied from commons-numbers. A preliminary structure has been created for
the proposed component.
Please let me know if we want to proceed with this format. We can copy the
same to any other team repository if required.

Repo URL: https://github.com/avijitbasak/commons-machinelearning

Thanks & Regards
--Avijit Basak

On Mon, 26 Apr 2021 at 04:49, Paul King  wrote:

> On Mon, Apr 26, 2021 at 12:27 AM sebb  wrote:
> >
> > I assume this thread is about the possible ML component.
> >
> > If the code was developed by Commons, I assume it could be used as
> > part of Spark.
> > However Commons does not currently have many developers who are
> > familiar with the field.
> > So it would seem to me better to have development done by a project
> > which does have relevant experience.
> >
> > You say that Spark etc have lots of jars.
> > Surely that allows for it to be implemented as a separate jar which
> > can either be used as part of the Spark platform, or used
> > independently?
>
> The stats I gave were for the current minimal use of those algorithms.
> Most algorithms are written in Scala, use RDD "dataframes" rather than
> say double arrays, and assume you're running on "the platform" which
> handles how you might get your data and return results and do logging
> etc. in a potentially concurrent world. Some of those design choices
> are key to scaling up but don't align with the goal of making the
> algorithms runnable "independently".
>
> > The only other option I see is for Commons to persuade some developers
> > who are familiar with the field to join Commons to assist with the
> > algorithms.
>
> I agree that is the crux of the issue here. The "commons doesn't have
> the bandwidth to absorb another algorithm" part of the discussion
> seems perfectly legit to me. The "and there is an obvious home
> elsewhere" part of the discussion seemed a little more dubious to me,
> though obviously that is something which should be considered.
>
> > Existing Commons developers can help manage the logistics of packaging
> > and releasing the code, as this does not require in depth knowledge of
> > the design.
> > However this only makes sense if the developers skilled in the area are
> > prepared to assist long-term.
> >
> >
> > On Sat, 24 Apr 2021 at 23:32, Paul King 
> wrote:
> > >
> > > Thanks Gilles,
> > >
> > > I can provide the same sort of stats across a clustering example
> > > across commons-math (KMeans) vs Apache Ignite, Apache Spark and
> > > Rheem/Apache Wayang (incubating) if anyone would find that useful. It
> > > would no doubt lead to similar conclusions.
> > >
> > > Cheers, Paul.
> > >
> > > On Sun, Apr 25, 2021 at 8:15 AM Gilles Sadowski 
> wrote:
> > > >
> > > > Hello Paul.
> > > >
> > > > On Sat, 24 Apr 2021 at 04:42, Paul King wrote:
> > > > >
> > > > > I added some more comments relevant to if the proposed algorithm
> > > > > belongs somewhere in the commons "math" area back in the Jira:
> > > > >
> > > > > https://issues.apache.org/jira/browse/MATH-1563
> > > >
> > > > Thanks for a "real" user's testimony.
> > > >
> > > > As the ML is still the official forum for such a discussion, I'm
> > > > quoting part of your post on JIRA:
> > > > ---CUT---
> > > > For linear regression, taking just one example dataset, commons-math
> > > > is a couple of library calls for a single 2M library and solves the
> > > > problem in 240ms. Both Ignite and Spark involve "firing up the
> > > > platform" and the code is more complex for simple scenarios. Spark
> > > > has a 181M footprint across 210 jars and solves the problem in about
> > > > 20s. Ignite has an 87M footprint across 85 jars and solves the problem
> > > > in > 40s. But I can also find more complex scenarios which need to
> > > > scale where Ignite and Spark really come into their own.
> > > > ---CUT---
> > > >
> > > > A similar rationale was behind my developing/using the SOFM
> > > > functionality in the "o.a.c.m.ml.neuralnet" package: I needed a
> > > > proof of concept, and taking the "lightweight" pa

Re: [Vote] Create a "machine learning" component

2021-04-20 Thread Avijit Basak
Hi

  > Did you ask "Spark" people about their opinion about it?
-- Not yet. I am not sure what would be the right option for
this communication. It will be good if you can approach them.
  > where it can be used in real-life (performance-wise)
applications, then you should demonstrate it
-- Do we have any kind of performance benchmark or use case
regarding this? Once that is decided, then I can proceed with this.


Thanks & Regards
--Avijit Basak

On Mon, 19 Apr 2021 at 18:51, Gilles Sadowski  wrote:

> Hello.
>
> On Mon, 19 Apr 2021 at 08:35, Avijit Basak wrote:
> >
> > Hi
> >
> > >Isn't a GA inherently parallel?
> > >If so, why not take advantage of the concurrency tools provided by the
> JDK?
> >   -- Are we planning to implement multi-threading for GA operations even
> as
> > part of a single population
>
> This seems an obvious improvement to our current implementation
> (in case a chromosome's evaluation is not population-dependent).
>
> > or only for multi-population parallel GA.
> >   -- We can implement different types of co-evolution as part of parallel
> > GA. Need to decide on the corresponding strategies we are going to
> > incorporate.
>
> The discussion is still about the "administrative" question of whether
> any of this should be implemented in the "Commons" project...
>
> Did you ask "Spark" people about their opinion about it?
>
> As I said, if you are confident that you can bring our implementation to
> a state where it can be used in real-life (performance-wise) applications,
> then you should demonstrate it (in order to convince other people from
> the Commons PMC that it is worth engaging in long-term maintenance).
> AFAICT, a way to do it would be to create a GitHub project (aimed at
> becoming a new "machine learning" component, or a maven/JPMS
> module within Commons Math).
>
> Best regards,
> Gilles
>
>
>

-- 
Avijit Basak


Re: [Vote] Create a "machine learning" component

2021-04-19 Thread Avijit Basak
Hi

>Isn't a GA inherently parallel?
>If so, why not take advantage of the concurrency tools provided by the JDK?
  -- Are we planning to implement multi-threading for GA operations even
within a single population, or only for a multi-population parallel GA?
  -- We can implement different types of co-evolution as part of a parallel
GA. We need to decide on the corresponding strategies we are going to
incorporate.
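For the single-population case, a minimal sketch of what "taking advantage
of the JDK concurrency tools" could look like: fitness values are computed
with a parallel stream. It assumes that the fitness of one chromosome does
not depend on the rest of the population. The class ParallelFitness and
its evaluate method are hypothetical names, not existing Commons Math API.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

final class ParallelFitness {
    private ParallelFitness() {}

    /**
     * Evaluates every chromosome of the population, possibly on several
     * threads. The returned array follows the order of the input list.
     *
     * @param <C> chromosome type (left generic on purpose)
     */
    static <C> double[] evaluate(List<C> population,
                                 ToDoubleFunction<? super C> fitness) {
        // The fitness function must be thread-safe and side-effect free.
        return population.parallelStream()
                         .mapToDouble(fitness)
                         .toArray();
    }
}
```

Usage would be along the lines of
ParallelFitness.evaluate(chromosomes, c -> decodeAndScore(c)), where
decodeAndScore stands for whatever fitness function the application
supplies. A multi-population (island) GA would need more than this, for
example one task per population plus a migration step.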

Thanks & Regards
--Avijit Basak

On Wed, 14 Apr 2021 at 05:53, Gilles Sadowski  wrote:

> On Tue, 13 Apr 2021 at 18:21, Avijit Basak wrote:
> >
> > Hi
> >
> >   Please find my comments below.
> >
> > >> I don't follow the distinction "prod" vs "non-prod".
> >  -- Actually in Prod we really need a very high performing system. So
> > use of implicit parallelism in Spark would help us to achieve it. But for
> > other types of work like POC or R&D we may not need such performance.
>
> Isn't a GA inherently parallel?
> If so, why not take advantage of the concurrency tools provided by the JDK?
>
> > >> the question was actually whether you are willing to modularize CM
> >  -- I am not much aware of other ml components in commons. I would
> look
> > into it.
>
> I've mentioned them in earlier messages:
>  * Self-organizing feature map (artificial neural net)
>  * Clustering
>
> The former is multi-threaded; the latter should be refactored to
> take advantage of multi-threading.
>
> > >>You did not expand about the usability/performance (e.g. the issue of
> > multi-threading)
> >  -- Are we planning to incorporate parallel GA.
>
> Aren't you?
>
> > Then multi-threading
> > would be a more appropriate option.
>
> IMHO, a necessary one.
>
> > >> So, as a way forward, I would suggest that you create a project on
> > GitHub (copying all the settings from a *Commons modular* component,
> such as
> > "Commons Numbers")
> >  -- Could you kindly share the GitHub repository URL for any Commons
> > modular component.
>
> https://github.com/apache/commons-rng
> https://github.com/apache/commons-numbers
> https://github.com/apache/commons-geometry
> https://github.com/apache/commons-statistics
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> >
> > On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski 
> wrote:
> >
> > > Hello.
> > >
> > > On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> > > >
> > > > Hi
> > > >
> > > >  Sorry for the delayed response. Thanks for your patience.
> Please
> > > > find my comments below:
> > > >
> > > >  (1) Why not Spark?  [At least post over there (?).]
> > > >   --We can move to Spark. But it will be very much useful if the
> > > things
> > > > can also run without Spark. The use of Spark would make more sense
> in a
> > > > production environment. But the portability of the library will be
> more
> > > > useful for the non-prod environment.
> > >
> > > I don't follow the distinction "prod" vs "non-prod".
> > >
> > > > Definitely, we can reach the Spark
> > > > team and query.
> > >
> > > That would be a good idea...
> > >
> > > >  (2) Further develop a monolithic CM?  [Who will do it?]
> > > >--I can help with the upgrade of the existing library related
> to
> > > GA
> > > > functionality.
> > >
> > > Sure, but nobody is currently working on (2).
> > >
> > > >  (3) Modularize CM? [Who will do it?]
> > > >--I can help with the upgrade of the existing library related
> to
> > > GA
> > > > functionality.
> > >
> > > I don't doubt it; but the question was actually whether you are willing
> > > to modularize CM (that is: in addition to, and before, contributing to
> > > the GA functionality).
> > >
> > > >  (4) New component (with another name) with the proposed contents?
> > > >--This is the best option if permitted.
> > >
> > > Currently, only the two of us are in favour of this alternative.
> > >
> > > Nobody, by their action, is really in favour of any of the other
> > > alternatives.
> > > So, as a way forward, I would suggest that you create a project on
> GitHub
> > > (copying all the settings from a Commons modular component, such as
> > > "Commons Numbers"), to be eventually integrated here, once its
> potential
> > > has been demonstrated.
> > >
> > > >   The code which I have written can be reused with minor
> > > modifications.
> > > > So it won't take too much effort for this activity.
> > >
> > > You did not expand about the usability/performance (e.g. the issue of
> > > multi-threading)...
> > >
> > > Regards,
> > > Gilles
> > >
> > > >> [...]
> > >
>
>
>

-- 
Avijit Basak


Re: [Vote] Create a "machine learning" component

2021-04-13 Thread Avijit Basak
Hi

  Please find my comments below.

>> I don't follow the distinction "prod" vs "non-prod".
 -- Actually, in Prod we really need a very high-performing system, so
the implicit parallelism in Spark would help us achieve it. But for
other types of work like POC or R&D we may not need such performance.
>> the question was actually whether you are willing to modularize CM
 -- I am not very familiar with the other ML components in Commons. I will
look into it.
>>You did not expand about the usability/performance (e.g. the issue of
multi-threading)
 -- Are we planning to incorporate a parallel GA? If so, multi-threading
would be a more appropriate option.
>> So, as a way forward, I would suggest that you create a project on
GitHub (copying all the settings from a *Commons modular* component, such as
"Commons Numbers")
 -- Could you kindly share the GitHub repository URL of any Commons
modular component?

Thanks & Regards
--Avijit Basak


On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski  wrote:

> Hello.
>
> On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote:
> >
> > Hi
> >
> >  Sorry for the delayed response. Thanks for your patience. Please
> > find my comments below:
> >
> >  (1) Why not Spark?  [At least post over there (?).]
> >   --We can move to Spark. But it will be very much useful if the
> things
> > can also run without Spark. The use of Spark would make more sense in a
> > production environment. But the portability of the library will be more
> > useful for the non-prod environment.
>
> I don't follow the distinction "prod" vs "non-prod".
>
> > Definitely, we can reach the Spark
> > team and query.
>
> That would be a good idea...
>
> >  (2) Further develop a monolithic CM?  [Who will do it?]
> >--I can help with the upgrade of the existing library related to
> GA
> > functionality.
>
> Sure, but nobody is currently working on (2).
>
> >  (3) Modularize CM? [Who will do it?]
> >--I can help with the upgrade of the existing library related to
> GA
> > functionality.
>
> I don't doubt it; but the question was actually whether you are willing
> to modularize CM (that is: in addition to, and before, contributing to
> the GA functionality).
>
> >  (4) New component (with another name) with the proposed contents?
> >--This is the best option if permitted.
>
> Currently, only the two of us are in favour of this alternative.
>
> Nobody, by their action, is really in favour of any of the other
> alternatives.
> So, as a way forward, I would suggest that you create a project on GitHub
> (copying all the settings from a Commons modular component, such as
> "Commons Numbers"), to be eventually integrated here, once its potential
> has been demonstrated.
>
> >   The code which I have written can be reused with minor
> modifications.
> > So it won't take too much effort for this activity.
>
> You did not expand about the usability/performance (e.g. the issue of
> multi-threading)...
>
> Regards,
> Gilles
>
> >> [...]
>
>
>

-- 
Avijit Basak


Re: [Vote] Create a "machine learning" component

2021-04-12 Thread Avijit Basak
Hi

 Sorry for the delayed response. Thanks for your patience. Please
find my comments below:

 (1) Why not Spark?  [At least post over there (?).]
  --We can move to Spark. But it would be very useful if things
could also run without Spark. The use of Spark would make more sense in a
production environment, but the portability of the library would be more
useful for non-prod environments. We can certainly reach out to the Spark
team and ask.
 (2) Further develop a monolithic CM?  [Who will do it?]
   --I can help with the upgrade of the existing library related to GA
functionality.
 (3) Modularize CM? [Who will do it?]
   --I can help with the upgrade of the existing library related to GA
functionality.
 (4) New component (with another name) with the proposed contents?
   --This is the best option if permitted.

  The code which I have written can be reused with minor modifications.
So it won't take too much effort for this activity.
  Kindly share further thoughts.

Thanks & Regards
--Avijit Basak


On Sun, 14 Feb 2021 at 19:56, Gilles Sadowski  wrote:

> On Sun, 14 Feb 2021 at 09:06, Avijit Basak wrote:
> >
> > Hi
> >
> >I would like to mention a few points here. Genetic Algorithm has a
> > vast range of applications in optimization and search problems. Machine
> > learning is only one of those.
> >If we couple the new GA library with any specific domain like ml
> it
> > would be meaningless for people working in other domains.
>
> Isn't "meaningless" a slight overstatement?
> We might have an issue of terminology: There is no necessary "coupling"
> but maybe "acquaintance" (for lack of a better word), as a set of tools
> that might come in handy for solving certain types of problems.  [For
> example, the Traveling Salesman Problem can be tackled by GA and SOFM,
> both of which are candidates for inclusion in the new component, although
> they don't share any code.]
>
> If the name "machine learning" is not the most appropriate one to convey
> the intended scope, do you have another idea?
> ["AI" would perhaps be more correct if we consider a strict hierarchy, but
> would obviously be far too presumptuous.]
>
> > They have to
> > incorporate the entire ml library
>
> No, they won't.  Given the stated goal of "modularity": the "ga" module
> will be available as a dedicated JAR (possibly with a dependency to
> codes that can be reused in other modules provided by the component).
>
> > which may be completely unrelated to
> > their project. Coupling it with any technology like spark might also
> > limit its usability.
>
> You may be right; I have no idea about the "restrictions" imposed by
> Spark.  [It seems that in this case, one would have to indeed depend
> on Spark's "mllib" (?).  This would be one reason, as I already stated,
> for having something in "Commons".]
>
> Could you elaborate on a concrete use-case where one would be
> starting to develop an application with the specific requirement that
> Spark could not be used?
> In particular, IIRC Spark has multi-threading built in.  Don't you see
> it as a huge problem that CM would not provide such a feature?
>
> >If a separate component is not approved for this change then we
> can
> > incorporate the changes as part of *commons.math* library.
>
> Of course, if somebody wants to do that, he's welcome.
> [That will not be me, for all the reasons which I've explained.  In the
> last 5 years I've been pretty much alone in handling bug reports about CM;
> I'm unwilling to assume implicit support for even more codes.]
>
> Also, with this solution, you'd now be willing to accept what you weren't
> above: Anyone wanting to use the GA functionality would indeed have to
> "incorporate" the whole of "Commons Math" (CM).
> Of course, the latter could be modularized, but this will only mitigate the
> issue, as any release of the GA functionality will potentially be then held
> off by potential issues in other parts of CM (which nobody has been able
> to consistently support for more than 5 years now).
>
> >The same library can be reused in ml or neural network libraries
> as
> > a dependency.
>
> It is the other way around:  The development version of CM currently
> depends on "lower-level" components.
> Furthermore, right now its (embryonic) "machine learning" functionality
> hasn't any substantial dependency on codes outside the "o.a.c.math4.ml"
> package.
>
> >Kindly share further views on this.
>
> In summary, to be cla

Re: [Vote] Create a "machine learning" component

2021-02-14 Thread Avijit Basak
Hi

   I would like to mention a few points here. Genetic Algorithm has a
vast range of applications in optimization and search problems. Machine
learning is only one of those.
   If we couple the new GA library with any specific domain like ML, it
would be meaningless for people working in other domains. They would have to
incorporate the entire ML library, which may be completely unrelated to
their project. Coupling it with any technology like Spark might also limit
its usability.
   If a separate component is not approved for this change then we can
incorporate the changes as part of *commons.math* library.
   The same library can be reused in ml or neural network libraries as
a dependency.
   Kindly share further views on this.

Thanks & Regards
--Avijit Basak

On Wed, 10 Feb 2021 at 19:49, Gilles Sadowski  wrote:

> On Wed, 10 Feb 2021 at 13:19, sebb wrote:
> >
> > Likewise, commons-ml is too cryptic.
> >
> > Also, the Spark project has a machine-learning library:
> >
> > https://spark.apache.org/mllib/
>
> Thanks for the pointer.
>
> >
> > Maybe that would be better home?
>
> On the face of it, probably.
> [For sure, Avijit should comment on the suggestion.]
>
> On the other hand, "Commons" is the place where one can pick "bare
> bone" implementations, and add the functionality to one's application
> without necessarily complying with an overarching framework.
> [I don't mean that framework compliance is bad; quite the contrary, it is
> hopefully the result of a thorough reflection by experts.  But ... cf. the
> numerous "no-dependency" discussions ...]
>
> Actually, concerning Avijit's proposed contribution, didn't I say:[1]
> ---CUT---
> Thus, I think that we must assess whether the "genetic algorithms"
> functionality has a reasonable future within "Apache Commons" (i.e.
> potential users and contributors) while there exist other libraries that
> seem much more advanced for any serious usage.
> ---CUT---
>
> > I'm also a bit concerned as to whether there are sufficient developers
> > here with knowledge of the ML domain to be able to support the code in
> > the future.
>
> An interesting point; by all means not a new one (see e.g. [2]).
>
> Isn't it the same point I've been making about "Commons Math" (CM)?
> There have been no releases because nobody here is able (or willing)
> to support it.
>
> Concerning the support of the purported "machinelearning" component:
> 1. Package org.apache.commons.math4.ml.neuralnet
> * I've written it entirely and I have applications that depend on it
>   (and I cannot assume that I could easily switch to, or port it to,
>   Spark), so I can reasonably ensure that it would be supported.
> 2. Package org.apache.commons.math4.ml.clustering
> * Functionality is mentioned in Spark's "mllib" user guide.
> * When a new feature was last contributed[3], it was noticed[4][5][6]
>   that improvements were needed (but there was no follow-up).
> * I have an application that depends on it (from CM v3.6.1) but I
>   wouldn't support it if shipped in CM v4.0.
> 3. Package org.apache.commons.math4.genetics
> * Part of my "end-of-study" project consisted in a GA implementation.
>   I've never used the CM implementation, and I don't deny that there
>   could be perfectly fine uses of it but, just looking at the code, it
>   seems obvious that it cannot compete feature-wise with other
>   libraries out there.
> * I suggested long ago that, without anyone supporting it actively
>   (and no known user community), it should be dropped from CM.
> * Avijit expressed a willingness to improve the functionality:  Is
>   this enough for the PMC to create a new component?  From the
>   experience with the "clustering" package mentioned above, I'd tend
>   to think (unfortunately) that it isn't.  He should first explore
>   whether the Spark community is interested in having the GA
>   functionality moved over there.
>
> Gilles
>
> [1] https://issues.apache.org/jira/browse/MATH-1563
> [2] https://markmail.org/message/26yxj5vhysdsoety
> [3] https://issues.apache.org/jira/projects/MATH/issues/MATH-1509
> [4] https://issues.apache.org/jira/projects/MATH/issues/MATH-1524
> [5] https://issues.apache.org/jira/projects/MATH/issues/MATH-1528
> [6] https://issues.apache.org/jira/projects/MATH/issues/MATH-1526
>
> >
> > On Wed, 10 Feb 2021 at 08:27, Emmanuel Bourg  wrote:
> > >
> > > -1 for commons-ml for the same reasons.
> > >
> > > What a

Re: [All][Math] New GA component

2021-01-29 Thread Avijit Basak
Hello Gilles

 Thanks for your reply. Actually I am not very comfortable with the
porting process. It would be really nice if I could have an initial repository.

Thanks & Regards
--Avijit Basak

On Wed, 20 Jan 2021 at 17:50, Gilles Sadowski  wrote:

> Hello.
>
> On Wed, 20 Jan 2021 at 11:11, Avijit Basak wrote:
> >
> > Hello Gilles Sadowski
> >
> >  Thanks for your reply. Yes, I intend to contribute to the enhancement
> > of the GA functionality as per the JIRA (MATH-1563) proposal.
>
> My proposal was to first create a new component (and, thus, implement
> the enhancement over there).
> Do you agree to perform the port?  As said in the previous message, this
> should be relatively easy, but will require populating a new "git"
> repository, using a recent and similar project's (e.g. "Commons Numbers")
> files as templates.
>
> > If I find any other suitable changes I would also propose them. Could
> > you kindly look into the approval process for this JIRA?
>
> There is no "process" other than the discussions taking place here, on
> the "dev" ML.
>
> Regards,
> Gilles
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > On Wed, 20 Jan 2021 at 04:11, Gilles Sadowski wrote:
> >
> > > Hi Avijit.
> > >
> > > [I've changed the "Subject:" line.]
> > >
> > > On Tue, 19 Jan 2021 at 08:31, Avijit Basak wrote:
> > > >
> > > > Hello Gilles Sadowski
> > > >
> > > > I have extended the current implementation of the Genetic Algorithm
> > > > in the a.c.m package and made the probability generation process
> > > > adaptive. A significant improvement in performance was observed
> > > > because of this. The current implementation in a.c.m.GA provides a
> > > > simple genetic algorithm, which is not very efficient or useful.
> > > > I have extended the same framework to incorporate the enhancement
> > > > as part of my work. The library can also be extended to
> > > > incorporate other advanced concepts of Genetic Programming.
> > >
> > > Do you intend to do, or otherwise further contribute to the enhancement
> > > of the GA functionality?
> > >
> > > > To compare with other libraries I have chosen a.c.m because of
> > > > its flexible and extensible design.
> > >
> > > That's good news, even though we never had much feedback about that
> > > code base...
> > >
> > > > This is to be decided if we need a new component or extend the
> > > > same component.
> > >
> > > The functionality in package "o.a.c.m.genetics" does not depend on
> > > functionality in other packages (except for exceptions).  Setting up
> > > a new component would thus be very easy.
> > > Doing so will bring the same maintenance advantage as we have
> > > witnessed with the other Commons Math spin-offs.
> > >
> > > Regards,
> > > Gilles
> > >
> > > >> [...]
> > >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

-- 
Avijit Basak


Re: [All][Math] New GA component

2021-01-20 Thread Avijit Basak
Hello Gilles Sadowski

 Thanks for your reply. Yes, I intend to contribute to the enhancement
of the GA functionality as per the JIRA (MATH-1563) proposal. If I find any
other suitable changes I would also propose them. Could you kindly look
into the approval process for this JIRA?

Thanks & Regards
--Avijit Basak

On Wed, 20 Jan 2021 at 04:11, Gilles Sadowski  wrote:

> Hi Avijit.
>
> [I've changed the "Subject:" line.]
>
> On Tue, 19 Jan 2021 at 08:31, Avijit Basak wrote:
> >
> > Hello Gilles Sadowski
> >
> > I have extended the current implementation of the Genetic Algorithm
> > in the a.c.m package and made the probability generation process
> > adaptive. A significant improvement in performance was observed
> > because of this. The current implementation in a.c.m.GA provides a
> > simple genetic algorithm, which is not very efficient or useful.
> > I have extended the same framework to incorporate the enhancement
> > as part of my work. The library can also be extended to
> > incorporate other advanced concepts of Genetic Programming.
>
> Do you intend to do, or otherwise further contribute to the enhancement
> of the GA functionality?
>
> > To compare with other libraries I have chosen a.c.m because of
> > its flexible and extensible design.
>
> That's good news, even though we never had much feedback about that
> code base...
>
> > This is to be decided if we need a new component or extend the
> > same component.
>
> The functionality in package "o.a.c.m.genetics" does not depend on
> functionality in other packages (except for exceptions).  Setting up
> a new component would thus be very easy.
> Doing so will bring the same maintenance advantage as we have
> witnessed with the other Commons Math spin-offs.
>
> Regards,
> Gilles
>
> >> [...]
>

-- 
Avijit Basak


Re: Contributor License Agreement

2021-01-18 Thread Avijit Basak
Hello Gilles Sadowski

 I have extended the current implementation of the Genetic Algorithm in
the a.c.m package and made the probability generation process adaptive. A
significant improvement in performance was observed because of this. The
current implementation in a.c.m.GA provides a simple genetic
algorithm, which is not very efficient or useful. I have extended
the same framework to incorporate the enhancement as part of my work.
The library can also be extended to incorporate other advanced
concepts of Genetic Programming.
 To compare with other libraries I have chosen a.c.m because of
its flexible and extensible design.
 It remains to be decided whether we need a new component or should
extend the same component.
 Kindly share your thoughts on this.

Thanks & Regards
--Avijit Basak


On Mon, 18 Jan 2021 at 23:21, Gilles Sadowski  wrote:

> Hi.
>
> On Mon, 18 Jan 2021 at 17:56, Avijit Basak wrote:
> >
> > Hello
> >
> > I would like to inform you that I am interested in contributing to
> > the Apache Commons Math project. A JIRA (*MATH-1563*) was created with
> > the respective proposal. Kindly grant me the required access for the
> > same. I would like to use my GitHub ID *'avijitbasak'* for this
> > contribution.
> > Kindly let me know if any further information is required.
>
> I tried to get some discussion started on the "dev" ML:
> https://markmail.org/message/p7gkatll4dvdlcdd
>
> Your opinion is certainly welcome...
>
> Regards,
> Gilles
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > ---------- Forwarded message ---------
> > From: Matt Sicker 
> > Date: Mon, 4 Jan 2021 at 21:28
> > Subject: Re: Contributor License Agreement
> > To: Avijit Basak 
> > Cc: 
> >
> >
> > Dear Avijit Basak,
> >
> > This message acknowledges receipt of your ICLA, which has been filed in
> > the Apache Software Foundation records.
> >
> > With this message, the Commons PMC has been notified that your ICLA has
> > been filed.
> >
> > ** Please contact the Apache Commons PMC with any further questions, not
> > the Secretary. Thanks. **
> >
> > If you have been invited as a committer, please provide the Apache
> > Commons PMC (copied) with your preferred Apache id.
> >
> > The id must not already be in use. See
> > https://people.apache.org/committer-index.html
> > Note that some existing ids include '-' and '_'. These characters are no
> > longer permitted in ids.
> >
> > The id must consist of lowercase alphanumeric characters only, starting
> > with an alphabetic character.
> > Minimum length 3 characters. No special characters.
> >
> > Warm Regards,
> >
> > --
> > Matt Sicker
> > Secretary, Apache Software Foundation
> >
> >
> >
> > --
> > Avijit Basak
>


-- 
Avijit Basak


[MATH] A Proposal for Implementation of Adaptive Probability Generation Strategy for Genetic Algorithm

2020-12-18 Thread Avijit Basak
Hi All

I would like to propose incorporating an adaptive probability
generation strategy into the Genetic Algorithm implementation of the Apache
Commons Math library.
Currently the library's API uses a constant probability strategy. I
have done some work on the adaptive approach and published it in this article:
https://www.ijcaonline.org/archives/volume175/number10/basak-2020-ijca-920572.pdf
I have created a JIRA issue, "MATH-1563", to describe the same.
Kindly let me know your views on the same.
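
To make the distinction between a constant and an adaptive probability
concrete, here is a minimal sketch in Java. It is only an illustration of one
common fitness-based scheme and is not necessarily the strategy described in
the article or in MATH-1563. It assumes that the probabilities in question are
operator rates (e.g. mutation), that fitness is maximized, and that the rate
should shrink for individuals close to the best. The class name
AdaptiveProbability and the method adapt are hypothetical and are not part of
the Commons Math API.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the Commons Math API. */
final class AdaptiveProbability {
    private AdaptiveProbability() {}

    /**
     * Returns an operator probability that depends on the individual's fitness.
     * Assumes fitness is maximized. A constant strategy would simply return
     * maxProbability for every individual.
     */
    static double adapt(double fitness, double avgFitness,
                        double maxFitness, double maxProbability) {
        // Below-average individuals (or a population with no spread)
        // keep the full rate, so they are disrupted the most.
        if (fitness < avgFitness || maxFitness <= avgFitness) {
            return maxProbability;
        }
        // At or above average: scale linearly down to 0 at the best fitness.
        return maxProbability * (maxFitness - fitness) / (maxFitness - avgFitness);
    }

    public static void main(String[] args) {
        double[] fitness = {1.0, 2.0, 3.0, 6.0};
        double avg = Arrays.stream(fitness).average().orElse(0.0);
        double max = Arrays.stream(fitness).max().orElse(0.0);
        for (double f : fitness) {
            System.out.printf("fitness=%.1f -> mutation probability=%.3f%n",
                              f, adapt(f, avg, max, 0.1));
        }
    }
}
```

In this sketch the best individual is never mutated, which protects good
solutions, while weaker individuals keep the full rate, which preserves
exploration. Other adaptive schemes (e.g. rank-based) would fit the same
method signature.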

Thanks & Regards
-- Avijit Basak