Hi All,

Please see my comments below.

>> >Several problems with this approach (raised in previous messages IIRC):
>> >1. Potential performance loss in sharing the same RNG instance.
>> -- As per my understanding ThreadLocalRandomSource creates separate
>> instances of UniformRandomProvider for each thread. So I am not sure how a
>> UniformRandomProvider instance is being shared. Please correct me if I am
>> wrong.

>Within a given thread there will be *one* RNG instance; that's what I meant
>by "shared".
>Of course you are right that that instance is not shared by multiple threads
>(which would be a bug).
>The performance loss is because it will be necessary to call
>    ThreadLocalRandomSource.current(RandomSource source)
>for each access to the RNG (since it would be a bug to store the returned
>value in e.g. an operator instance that would be shared among threads (as
>you suggest below)).

-- I ran a small test of this; the code and the results are given below.
Output times are in milliseconds. As I understand it, the performance loss
comes mostly from creating the per-thread instance of UniformRandomProvider:
in this run the first call took 363 ms, while one billion further calls took
about 2.4 s in total.

--*CUT*--
import org.apache.commons.rng.simple.RandomSource;
import org.apache.commons.rng.simple.ThreadLocalRandomSource;
// JUnit 5 is assumed here (the test method is package-private).
import org.junit.jupiter.api.Test;

class ThreadLocalRandomSourceTimingTest {

    /** Times repeated calls to ThreadLocalRandomSource.current(...) for growing call counts. */
    @Test
    void test() {
        // The first run (a single call) includes the one-time creation of the
        // per-thread UniformRandomProvider instance.
        final int[] limits = {
            1, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000
        };
        for (final int limit : limits) {
            final long start = System.currentTimeMillis();
            for (int i = 0; i < limit; i++) {
                ThreadLocalRandomSource.current(RandomSource.JDK);
            }
            // Elapsed time in milliseconds for "limit" calls.
            System.out.println(System.currentTimeMillis() - start);
        }
    }
}
--*CUT*--

--*output*--
363
1
2
4
6
28
244
2423
--*output*--

>> >2. Less/no flexibility (no user's choice of random source).
>> -- Agreed.

-- Do we really need this much flexibility here?

>> >3. Error-prone (user can access/reuse the "UniformRandomProvider"
>> >instances).
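
To make point 1 above more concrete, here is a minimal, JDK-only sketch of the two access patterns under discussion: looking up the per-thread generator on every access, versus looking it up once and keeping it in a local variable (never in a field of an object shared among threads). `ThreadLocal<SplittableRandom>` is only a stand-in for `ThreadLocalRandomSource`, chosen to keep the example dependency-free; the class and method names are hypothetical and not part of any library.

```java
import java.util.SplittableRandom;

final class RngAccessSketch {
    // Stand-in (assumption) for ThreadLocalRandomSource: one generator per
    // thread, created lazily on first access from that thread.
    private static final ThreadLocal<SplittableRandom> PER_THREAD =
        ThreadLocal.withInitial(SplittableRandom::new);

    private RngAccessSketch() {}

    /** Pattern 1: look up the per-thread generator on every access. */
    static int headsWithLookupPerAccess(int flips) {
        int heads = 0;
        for (int i = 0; i < flips; i++) {
            if (PER_THREAD.get().nextBoolean()) {
                heads++;
            }
        }
        return heads;
    }

    /** Pattern 2: look it up once; the reference stays in a local variable. */
    static int headsWithSingleLookup(int flips) {
        final SplittableRandom rng = PER_THREAD.get();
        int heads = 0;
        for (int i = 0; i < flips; i++) {
            if (rng.nextBoolean()) {
                heads++;
            }
        }
        return heads;
    }

    /** Within one thread, every lookup returns the same instance. */
    static boolean sameInstanceWithinThread() {
        return PER_THREAD.get() == PER_THREAD.get();
    }

    /** Another thread gets its own, different instance. */
    static boolean differentInstanceInOtherThread() {
        final SplittableRandom mine = PER_THREAD.get();
        final SplittableRandom[] other = new SplittableRandom[1];
        final Thread t = new Thread(() -> other[0] = PER_THREAD.get());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return other[0] != null && other[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(headsWithLookupPerAccess(1_000_000));
        System.out.println(headsWithSingleLookup(1_000_000));
        System.out.println(sameInstanceWithinThread());
        System.out.println(differentInstanceInOtherThread());
    }
}
```

Pattern 2 is only safe while the reference does not escape the thread that obtained it, which is the pitfall mentioned above for operators shared among threads.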

>> >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but
>> >"light" usage of random number generation in a multi-threaded application;
>> >GAs make "heavy" use of RNG, thus it does not seem outlandish that all the
>> >RNG "clients" (e.g. every "operator") create their own instances.

>> >IMHO, a more important discussion would be about the expectations in a
>> >multithreaded context: E.g. should an operator be shareable by different
>> >threads? And if not, how does the API help application developers to avoid
>> >such pitfalls?
>> -- Once we implement multi-threading in GA, same crossover and mutation
>> operators will be re-used across multiple threads.

>I would be wary to go on that path; better consider making (deep) copies.
>We can have multiple instances of an operator, all being configured in the
>same way but being different instances with no risk of a multithreading bug.

-- I don't think this would be a good design choice just to support
customization of RNG functionality. It will lead to too many instances of
the same operators, resulting in a lot of unnecessary memory consumption. I
think we might face memory issues for higher-dimensional problems. As the
required population size also grows with the dimension, this could become a
major issue and needs some thought. So I think we have a design trade-off
here: performance vs. memory consumption. I am more worried about memory, as
that might restrict use of this library beyond a certain number of
dimensions in some areas. However, creating deep copies would only be
possible if we strictly restrict extension of operators, which I want to
avoid.

>> So even if we provide
>> the customization at the operator level we cannot avoid sharing.

>We can, and we should.
>What we probably can't avoid sharing is the instance that represents the
>population of chromosomes.
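
As a rough illustration of the "multiple instances, all configured in the same way" idea quoted above, the following sketch shows an operator with a private constructor, a factory method, and a per-thread copy that shares the configuration but owns an independent RNG. Everything here is hypothetical: `BitFlipMutation`, `create`, `copyForThread` and `mutate` are illustrative names, not the library's API, and `SplittableRandom` stands in for `UniformRandomProvider`.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

final class BitFlipMutation {
    private final double rate;          // configuration, identical in all copies
    private final SplittableRandom rng; // state, owned by exactly one instance

    private BitFlipMutation(double rate, SplittableRandom rng) {
        this.rate = rate;
        this.rng = rng;
    }

    /** Factory method; the constructor stays private. */
    static BitFlipMutation create(double rate, long seed) {
        if (rate < 0 || rate > 1) {
            throw new IllegalArgumentException("rate must be in [0, 1]: " + rate);
        }
        return new BitFlipMutation(rate, new SplittableRandom(seed));
    }

    /**
     * Returns a new instance with the same configuration and its own RNG.
     * Call this from the thread that owns this instance, once per worker.
     */
    BitFlipMutation copyForThread() {
        return new BitFlipMutation(rate, rng.split());
    }

    /** Returns a mutated copy; the input array is left untouched. */
    boolean[] mutate(boolean[] genes) {
        final boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }

    double rate() {
        return rate;
    }

    public static void main(String[] args) throws InterruptedException {
        final BitFlipMutation prototype = create(0.25, 42L);
        final boolean[] genes = new boolean[16];
        final Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            // One copy per worker, made here in the owning thread.
            final BitFlipMutation own = prototype.copyForThread();
            workers[i] = new Thread(() ->
                System.out.println(Arrays.toString(own.mutate(genes))));
            workers[i].start();
        }
        for (final Thread t : workers) {
            t.join();
        }
    }
}
```

In this sketch each copy holds only a `double` and one generator, so the per-copy cost depends on the number of worker threads rather than on the population size; whether that holds for real operators depends on what state they carry.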

-- In a multi-threaded optimization the chromosome instances are shared as
well, in case the selection process chooses the same chromosome for
crossover. I missed this point earlier.

...

>> > Mine is against using "ThreadLocalRandomSource"...
>> -- What is the way out other than that? Please suggest.

>I think I did.

-- The factory-based approach would be useful only when we can have separate
copies of operators for each set of operations.

>Maybe it's time to create a dedicated branch for the GA functionality
>so that we can try out the different approaches.

> >> I think first we need to decide on whether we really need this
> >> customization and if yes then why. Then we can decide on alternate
> >> implementation options.
> >
> >> >As per the recent updates of the math-related code bases, the
> >> >public API should provide factory methods (constructors should
> >> >be private).
> >> -- private constructors will make public API classes non-extensible. This
> >> will severely restrict the extensibility of this framework which I want to
> >> avoid. I am not sure why we need to remove public constructors. It would be
> >> helpful if you could refer me to any relevant discussion thread.
> >
> > Allowing extensibility is a huge burden on library maintainers. The
> > library must have been designed to support it; hence, you should
> > first describe what kind(s) of extensions (with usage examples) you
> > have in mind.
> --The library should be extensible to support customization. Users should
> be able to customise or provide their own implementation of genetic
> operators for crossover and mutation. The chromosome classes should also be
> open for extension.

>I don't get why we should support extensions outside this library.

-- I think we should not block the extension.

>Initially we discussed about having a light-weight library, for easier usage
>than alternative existing framework(s).

-- We can always think of making the framework lightweight, but it should
not cost extensibility.

>> E.g. any developer should be able to extend the
>> IntegralChromosome class and define a child class which explicitly
>> specifies the range of integers to be used.

>It does not look like this would need an extension, only configuration
>of the range.

-- I agree. But the question is: should we block the extension?

>> I have initially implemented
>> the Binary chromosome and the corresponding binary mutation following the
>> same pattern. However, restricting extension of concrete classes by private
>> constructor does not prevent users from extending the abstract parent
>> classes.

>We should aim at coding the GA logic through (Java) interfaces, and not
>expose the "abstract" classes.

-- One of the primary reasons for me to contribute to Apache's GA library is
its simplicity and extensibility. I would like to have a framework that can
always be extended to any problem domain with minor changes. The primary
reason is that the application domains of GA are too diverse: it is not
possible to implement everything in a library, and we don't know all
possible domain areas either. If we remove the extensibility from the
framework, it would be useless in lots of areas.

>Extending the functionality, if necessary, should be contributed back here

-- Sometimes the GA operators are very specific to the domain and hard to
generalise. In those scenarios, contributing back to the library might not
be possible. However, if a library cannot be extended for a new domain by
users, it becomes underutilised over time, if not useless.

Thanks & Regards
--Avijit Basak

On Tue, 21 Dec 2021 at 22:05, Gilles Sadowski <gillese...@gmail.com> wrote:

> Hello.
>
> On Tue, 21 Dec 2021 at 16:21, Avijit Basak <avijit.ba...@gmail.com>
> wrote:
> >
> > Hi All
> >
> > Please see my comments. Sorry for the delayed response.
> >
> > >Several problems with this approach (raised in previous messages IIRC):
> > >1. Potential performance loss in sharing the same RNG instance.
> > -- As per my understanding ThreadLocalRandomSource creates separate
> > instances of UniformRandomProvider for each thread. So I am not sure how a
> > UniformRandomProvider instance is being shared. Please correct me if I am
> > wrong.
>
> Within a given thread there will be *one* RNG instance; that's what I meant
> by "shared".
> Of course you are right that that instance is not shared by multiple threads
> (which would be a bug).
> The performance loss is because it will be necessary to call
>     ThreadLocalRandomSource.current(RandomSource source)
> for each access to the RNG (since it would be a bug to store the returned
> value in e.g. an operator instance that would be shared among threads (as
> you suggest below)).
>
> > >2. Less/no flexibility (no user's choice of random source).
> > -- Agreed.
> > >3. Error-prone (user can access/reuse the "UniformRandomProvider"
> > >instances).
> >
> > >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but
> > >"light" usage of random number generation in a multi-threaded application;
> > >GAs make "heavy" use of RNG, thus it does not seem outlandish that all the
> > >RNG "clients" (e.g. every "operator") create their own instances.
> >
> > >IMHO, a more important discussion would be about the expectations in a
> > >multithreaded context: E.g. should an operator be shareable by different
> > >threads? And if not, how does the API help application developers to avoid
> > >such pitfalls?
> > -- Once we implement multi-threading in GA, same crossover and mutation
> > operators will be re-used across multiple threads.
>
> I would be wary to go on that path; better consider making (deep) copies.
> We can have multiple instances of an operator, all being configured in the
> same way but being different instances with no risk of a multithreading
> bug.
>
> > So even if we provide
> > the customization at the operator level we cannot avoid sharing.
>
> We can, and we should.
> What we probably can't avoid sharing is the instance that represents the
> population of chromosomes.
>
> > >> My original implementation did not allow any customization of
> > >> RandomSource instances. There was a thought in review for customization
> > >> of RandomSource, so these options were considered. I don't think this
> > >> would make any difference to algorithm functionality.
> > >
> > > Quite right. But the customization can come at zero cost for the users
> > > who don't need it. Admittedly it's a little more work on the part of the
> > > developer(s) but it's a one off cost (and I'm fine working on that part of
> > > the library once other, more important, things have been settled).
> > >
> > >> Even earlier I used Math.random()
> > >> which worked equally well. So my *vote* should be *against* this
> > >> customization.
> > >
> > > Mine is against using "ThreadLocalRandomSource"...
> > -- What is the way out other than that? Please suggest.
>
> I think I did.
> Maybe it's time to create a dedicated branch for the GA functionality
> so that we can try out the different approaches.
>
> > >> I think first we need to decide on whether we really need this
> > >> customization and if yes then why. Then we can decide on alternate
> > >> implementation options.
> > >
> > >> >As per the recent updates of the math-related code bases, the
> > >> >public API should provide factory methods (constructors should
> > >> >be private).
> > >> -- private constructors will make public API classes non-extensible. This
> > >> will severely restrict the extensibility of this framework which I want to
> > >> avoid.
> > >> I am not sure why we need to remove public constructors. It would be
> > >> helpful if you could refer me to any relevant discussion thread.
> > >
> > > Allowing extensibility is a huge burden on library maintainers. The
> > > library must have been designed to support it; hence, you should
> > > first describe what kind(s) of extensions (with usage examples) you
> > > have in mind.
> > --The library should be extensible to support customization. Users should
> > be able to customise or provide their own implementation of genetic
> > operators for crossover and mutation. The chromosome classes should also be
> > open for extension.
>
> I don't get why we should support extensions outside this library.
> Initially we discussed about having a light-weight library, for easier usage
> than alternative existing framework(s).
>
> > E.g. any developer should be able to extend the
> > IntegralChromosome class and define a child class which explicitly
> > specifies the range of integers to be used.
>
> It does not look like this would need an extension, only configuration
> of the range.
>
> > I have initially implemented
> > the Binary chromosome and the corresponding binary mutation following the
> > same pattern. However, restricting extension of concrete classes by private
> > constructor does not prevent users from extending the abstract parent
> > classes.
>
> We should aim at coding the GA logic through (Java) interfaces, and not
> expose the "abstract" classes.
> Extending the functionality, if necessary, should be contributed back here.
>
> Regards,
> Gilles
>
> >>> [...]

--
Avijit Basak
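
For reference, the "configuration of the range" and "code through interfaces" suggestions from the thread above could be combined roughly as in the sketch below. It is only an illustration under stated assumptions: the `Chromosome` interface, the `random` factory method and the accessor names are hypothetical and do not describe the actual library classes, and `SplittableRandom` stands in for `UniformRandomProvider`. The concrete class is final with a private constructor, while a user-defined type can still be supplied by implementing the interface.

```java
import java.util.SplittableRandom;

/** Hypothetical interface; the GA logic would be written against this type. */
interface Chromosome {
    int length();

    int gene(int index);
}

final class IntegralChromosome implements Chromosome {
    private final int[] genes;
    private final int min; // inclusive
    private final int max; // exclusive

    private IntegralChromosome(int[] genes, int min, int max) {
        this.genes = genes;
        this.min = min;
        this.max = max;
    }

    /** Factory method: the allowed range [min, max) is plain configuration. */
    static IntegralChromosome random(int length, int min, int max, SplittableRandom rng) {
        if (length < 0 || min >= max) {
            throw new IllegalArgumentException("invalid length or range");
        }
        final int[] genes = new int[length];
        for (int i = 0; i < length; i++) {
            genes[i] = rng.nextInt(min, max);
        }
        return new IntegralChromosome(genes, min, max);
    }

    @Override
    public int length() {
        return genes.length;
    }

    @Override
    public int gene(int index) {
        return genes[index];
    }

    int min() {
        return min;
    }

    int max() {
        return max;
    }
}

/** A user-side implementation: possible via the interface, without subclassing. */
final class ConstantChromosome implements Chromosome {
    private final int length;
    private final int value;

    ConstantChromosome(int length, int value) {
        this.length = length;
        this.value = value;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public int gene(int index) {
        return value;
    }
}
```

A caller would obtain a ranged chromosome with, for example, `IntegralChromosome.random(50, 10, 20, new SplittableRandom(7L))`, and code that only depends on `Chromosome` would accept `ConstantChromosome` equally well.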