Hi All Please see my comments below.
>> Please find my comments. >> >> >> (2) >> >> >The "GeneticException" class seems to mostly deal with "illegal" >> >> >arguments; hence it should be a subclass of the JDK's standard >> >> >"IllegalArgumentException" (and be renamed accordingly). >[...] >> > >> -- I shall make the changes. What could be the name for the Exception. >> GeneticIllegalArgumentException would be fine? > >I guess so. > >The fool-proof, preferred approach, is to have package-private >exception classes. > >Another suggestion would be to create a container class (within >a package duly marked as "internal, not part of the public API"): >---CUT--- >public GeneticException { > > public static class IllegalArgument extends IllegalArgumentException { > // ... > } > > // etc. >} >---CUT--- > >The important thing is that the "@throws" Javadoc tags only >document the base classes (i.e. the JDK's standard exception). > --I have modified the exception class name to GeneticIllegalArgumentException. >> >> >> >> >[Exception messages need review for spelling and formatting.] >> >> -- It will be really helpful if you can point out some specific examples. >> > >> >We can fix this when the PR has reached some stability. >> > >> >> >> >> (3) >> >> >IMO Javadoc should avoid redundant phrases like "This class" as >> >> >the first words of a class description. >> >> --Refractored the javadoc comments. Please review and mention if you need >> >> any further changes. >> > >> >I've not looked yet, but thanks for taking it into account. >> >Similarly to the previous point, these clean-ups can happen later. >> > >> >> >A similar remark holds for fields in "GeneticException" class: Since >> >> >the name of the field is self-documenting, duplication in the Javadoc >> >> >is visual noise ("Message template" is concise and clear enough). >> >> --Removal of the javadoc comments produces a checkstyle error. >> > >> >I did not suggest to remove any Javadoc, only to rephrase it as: >> >---CUT--- >> >/** "Message template". */ >> >---CUT--- >> > >> -- Just for confirmation. Do you need the comment to be changed from >> ---CUT--- >> /** Error message for "out of range" condition. */ >> ---CUT--- >> *to* >> ---CUT--- >> /** "out of range" */ >> ---CUT--- > >No, sorry, I just meant that e.g. >---CUT--- >/** Message template. */ >private static final String OUT_OF_RANGE = "Out of range: min={0} max={1}"; >---CUT--- >is sufficient. Since the field is private, the comment will mostly ever >be read while looking at the source code. --I think we have a misunderstanding here. The field is public, not private. Will a hard coded "Message Template" comment be sufficient? >> JDK(*ProviderBuilder.RandomSourceInternal.JDK*), >> ... >> } >> ---CUT--- >> I am not sure how to achieve this for our algorithm classes. > >I'm a bit lost here; I don't get the relationship of this with my >remark above (constant vs adaptive)... Maybe you could file >JIRA report to clarify and further discussion? --Let me try to clarify my original understanding. I think the proposal is to introduce an Enum and declare each type of algorithm classes as instances of enum. So it might follow this template: --CUT-- enum GeneticAlgorithms { SGA([args...]),//Simple genetic algorithm equivalent to GeneticAlgorithm class AGA([args...]) //Adaptive genetic algorithm. } --CUT-- Where [args...] are selection, crossover, mutation operators and probalities. As the AdaptiveGeneticAlgorithm is a generalization of GeneticAlgorithm it would be possible to represent the later by former using constant crossover and mutation rate generators. But the problem lies in a different area. In enumeration the required arguments ([args...]) need to be specified during the declaration using constant values. But for our implementation those arguments need to be provided by the client programs. I don't know how to achieve that using enum. Let me know if you have any different views regarding this. > >> > >> >> >> >> (7) >> >> >The currently available GA implementations are sequential. >> >> >IIUC, the "nextGeneration" methods should provide an option >> >> >that processes the population using multiple threads. >> >> --This needs to be done. However, I would like to address this along >> with >> >> parallel GA i.e. convergence of multiple populations together. >> > >> >The two features (multi-thread vs multiple populations) should >> >be implemented independently: Users that only need the "basic" >> >GA should also be able to take advantage of their machine's >> >multiple CPUs. >> >[This is related to the design issue which I mentioned previously.] >> > >> -- I am thinking to leverage user's multiple CPUs for doing >> multi-population GA. > >OK (sort-of, since "the devil is in the details", and I'm not sure >that we mean the same thing by "multi", see below). > >> It would a global approach where same thread pool >> would be used for both purposes. Another class would be introduced for >> executing parallel genetic algorithm which would accept multiple instances >> of AbstractGeneticAlgorithm class and converge them in parallel. Users who >> does not care for robustness would go for current implementations of the >> algorithm with single population. For a better optimization quality users >> would chose the new class. > >As hinted by my comment is the previous message, I've still to >clarify my own expectations; but I vaguely sense some lost >opportunity for simpler usage simpler and increased performance >through the caller just needs to specify the number of "worker >threads". -- We should have both options. Users can execute the algorithm by specifying only the number of worker threads with a single population as well as optimize multiple populations in a parallel fashion. For the multi-population option the common thread pool would be reused for all populations. > >Do we at least agree that >1. Adding/retrieving a "Chromosome" to/from a "Population" >must be thread-safe (and is not trivial) >2. Fitness computation is where most time is usually spent >(so that multi-threading must be achieved at that granularity) >? --The way I am thinking of designing a task is that it should accept the current population and return an instance of ChromosomePair. The chromosomes within this pair would be added to the population by the caller thread. Population won't be updated by multiple threads. The code snippet below shows the body of the method which will be executed inside the task. --CUT-- //selection ChromosomePair pair = getSelectionPolicy().select(current); // crossover if (randGen.nextDouble() < getCrossoverRate()) { // apply crossover policy to create two offspring pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond()); } // mutation if (randGen.nextDouble() < getMutationRate()) { // apply mutation policy to the chromosomes pair = new ChromosomePair( getMutationPolicy().mutate(pair.getFirst()), getMutationPolicy().mutate(pair.getSecond())); } return pair; --CUT-- > >I'd surmise that "multiple instances of AbstractGeneticAlgorithm" >is an application concern; unless I'm missing something, it's >not what I've in mind when talking about multi-threading. >Actually, I was wondering whether we could implement the >analog of what is in the "commons-math-neuralnet" module, >where >* "Neuron" is the counterpart "Chromosome" >* "Network" is the counterpart of "Population". --"multiple instance of AbstractGeneticAlgorithm" is related to parallel GA with multiple populations not multi-threading. Users can also implement parallel GA in a synchronous manner although that won't be a recommended way. Multi-threading is only a way to improve performance using a user's multi core CPU. The threads in the thread pool would only be used to execute the task as mentioned in the previous comment. I think we have some misunderstanding over here. It is better to do an implementation first and start the discussion. It would be more productive. Thanks & Regards --Avijit Basak On Thu, 17 Feb 2022 at 01:09, Gilles Sadowski <gillese...@gmail.com> wrote: > Hello. > > Le mer. 16 févr. 2022 à 17:31, Avijit Basak <avijit.ba...@gmail.com> a > écrit : > > > > Hi All > > > > Please find my comments. > > > > >> (2) > > >> >The "GeneticException" class seems to mostly deal with "illegal" > > >> >arguments; hence it should be a subclass of the JDK's standard > > >> >"IllegalArgumentException" (and be renamed accordingly). > > >> >If other condition types are needed, then another internal class > > >> >should be defined with the corresponding standard semantics. > > >> --IMHO if we think of a single exception class we should extend it > only > > >> from RuntimeException. > > > > >"single exception class" is not a requirement (it cannot be since > > >we agreed some time ago that it was better to align with the JDK's > > >delineation of error conditions (IAE, NPE, ILSE, AE, ...). > > > > >> If we think of multiple exception classes in one > > >> module we may need to think of a base exception class. Other classes > > would > > >> extend the same. > > > > > >Please no. We'd taken that approach in "Commons Math" (cf. > > >base class now in module "commons-math-legacy-exception"), > > >as I've mentioned already IIRC: It was a failed experiment IMO. > > >[For more details, please refer to the archive of the "dev" ML.] > > > > > >> The approach mentioned above would mix up these two. > > >> Please share your opinion regarding this. > > > > > >Eventually, all new components ([RNG], [Number], [Geometry], ...) > > >adopted the simple approach of non-public API (ideally private > > >or package-private) exception classes only for the developer's > > >use (and the purpose of which is limited to avoiding duplication). > > > > > -- I shall make the changes. What could be the name for the Exception. > > GeneticIllegalArgumentException would be fine? > > I guess so. > > The fool-proof, preferred approach, is to have package-private > exception classes. > > Another suggestion would be to create a container class (within > a package duly marked as "internal, not part of the public API"): > ---CUT--- > public GeneticException { > > public static class IllegalArgument extends IllegalArgumentException { > // ... > } > > // etc. > } > ---CUT--- > > The important thing is that the "@throws" Javadoc tags only > document the base classes (i.e. the JDK's standard exception). > > > >> > > >> >[Exception messages need review for spelling and formatting.] > > >> -- It will be really helpful if you can point out some specific > examples. > > > > > >We can fix this when the PR has reached some stability. > > > > > >> > > >> (3) > > >> >IMO Javadoc should avoid redundant phrases like "This class" as > > >> >the first words of a class description. > > >> --Refractored the javadoc comments. Please review and mention if you > need > > >> any further changes. > > > > > >I've not looked yet, but thanks for taking it into account. > > >Similarly to the previous point, these clean-ups can happen later. > > > > > >> >A similar remark holds for fields in "GeneticException" class: Since > > >> >the name of the field is self-documenting, duplication in the Javadoc > > >> >is visual noise ("Message template" is concise and clear enough). > > >> --Removal of the javadoc comments produces a checkstyle error. > > > > > >I did not suggest to remove any Javadoc, only to rephrase it as: > > >---CUT--- > > >/** "Message template". */ > > >---CUT--- > > > > > -- Just for confirmation. Do you need the comment to be changed from > > ---CUT--- > > /** Error message for "out of range" condition. */ > > ---CUT--- > > *to* > > ---CUT--- > > /** "out of range" */ > > ---CUT--- > > No, sorry, I just meant that e.g. > ---CUT--- > /** Message template. */ > private static final String OUT_OF_RANGE = "Out of range: min={0} max={1}"; > ---CUT--- > is sufficient. Since the field is private, the comment will mostly ever > be read while looking at the source code. > > > >> (6) > > >> >Why support inheritance for "AbstractGeneticAlgorithm"? > > >> >Why would users need their own subclass, rather than call those > > >> >implemented within the library (currently, "GeneticAlgorithm" and > > >> >"AdaptiveGeneticAlgorithm")? > > >> >Couldn't we encapsulate the choice of algorithm in an "enum", > > >> >similar to "RandomSource" in [RNG]. > > >> >Do I understand correctly that the (only?) difference between the > > >> >two classes is the ability to adapt crossover and mutation rates? > > >> -- The difference between GeneticAlgorithm and > AdaptiveGeneticAlgorithm > > is > > >> the ability to adapt crossover and mutation probability. However, as > per > > >> my understanding enum encapsulation is appropriate with the same set > and > > >> type of constructor arguments, where the arguments can be provided > during > > >> enum declaration. In our case the arguments would be provided by the > > client > > >> program and cannot be pre-initialized as part of an enum declaration. > > > > > >Why not? > > >A constant rate seems like a (trivial) type of adaptive rate. > > > > -- Our algorithm classes accept various user provided arguments like > > crossover, mutation and selection operators. Enum declaration requires > all > > of the arguments to be provided at the declaration time in the class > itself > > like > > ---CUT--- > > public enum RandomSource { > > JDK(*ProviderBuilder.RandomSourceInternal.JDK*), > > ... > > } > > ---CUT--- > > I am not sure how to achieve this for our algorithm classes. > > I'm a bit lost here; I don't get the relationship of this with my > remark above (constant vs adaptive)... Maybe you could file > JIRA report to clarify and further discussion? > > > > > > >> > > >> (7) > > >> >The currently available GA implementations are sequential. > > >> >IIUC, the "nextGeneration" methods should provide an option > > >> >that processes the population using multiple threads. > > >> --This needs to be done. However, I would like to address this along > > with > > >> parallel GA i.e. convergence of multiple populations together. > > > > > >The two features (multi-thread vs multiple populations) should > > >be implemented independently: Users that only need the "basic" > > >GA should also be able to take advantage of their machine's > > >multiple CPUs. > > >[This is related to the design issue which I mentioned previously.] > > > > > -- I am thinking to leverage user's multiple CPUs for doing > > multi-population GA. > > OK (sort-of, since "the devil is in the details", and I'm not sure > that we mean the same thing by "multi", see below). > > > It would a global approach where same thread pool > > would be used for both purposes. Another class would be introduced for > > executing parallel genetic algorithm which would accept multiple > instances > > of AbstractGeneticAlgorithm class and converge them in parallel. Users > who > > does not care for robustness would go for current implementations of the > > algorithm with single population. For a better optimization quality users > > would chose the new class. > > As hinted by my comment is the previous message, I've still to > clarify my own expectations; but I vaguely sense some lost > opportunity for simpler usage simpler and increased performance > through the caller just needs to specify the number of "worker > threads". > > Do we at least agree that > 1. Adding/retrieving a "Chromosome" to/from a "Population" > must be thread-safe (and is not trivial) > 2. Fitness computation is where most time is usually spent > (so that multi-threading must be achieved at that granularity) > ? > > I'd surmise that "multiple instances of AbstractGeneticAlgorithm" > is an application concern; unless I'm missing something, it's > not what I've in mind when talking about multi-threading. > Actually, I was wondering whether we could implement the > analog of what is in the "commons-math-neuralnet" module, > where > * "Neuron" is the counterpart "Chromosome" > * "Network" is the counterpart of "Population". > > Regards, > Gilles > > >>> [...] > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >