On Sun, 14 Feb 2021 at 09:06, Avijit Basak <avijit.ba...@gmail.com> wrote:
>
> Hi
>
>        I would like to mention a few points here. Genetic Algorithm has a
> vast range of applications in optimization and search problems. Machine
> learning is only one of those.
>        If we couple the new GA library with any specific domain like ml it
> would be meaningless for people working in other domains.

Isn't "meaningless" a slight overstatement?
We might have an issue of terminology: There is no necessary "coupling"
but maybe "acquaintance" (for lack of a better word), as a set of tools that
might come in handy for solving certain types of problems.  [For example,
the Traveling Salesman Problem can be tackled by GA and SOFM, both
of which are candidates for inclusion in the new component, although they
don't share any code.]

If the name "machine learning" is not the most appropriate one to convey
the intended scope, do you have another idea?
["AI" would perhaps be more correct if we consider a strict hierarchy, but
would obviously be far too presumptuous.]

> They have to
> incorporate the entire ml library

No, they won't.  Given the stated goal of "modularity", the "ga" module
will be available as a dedicated JAR (possibly with a dependency on
code that can be reused in other modules provided by the component).

> which may be completely unrelated to
> their project. Coupling it with any technology like spark might also limit
> its usability.

You may be right; I have no idea about the "restrictions" imposed by
Spark.  [It seems that in this case, one would have to indeed depend
on Spark's "mllib" (?).  This would be one reason, as I already stated,
for having something in "Commons".]

Could you elaborate on a concrete use-case where one would be
starting to develop an application with the specific requirement that
Spark could not be used?
In particular, IIRC Spark has multi-threading built in.  Don't you see
it as a huge problem that CM would not provide such a feature?

>        If a separate component is not approved for this change then we can
> incorporate the changes as part of *commons.math* library.

Of course, if somebody wants to do that, they are welcome.
[That will not be me, for all the reasons which I've explained.  In the last
5 years I've been pretty much alone in handling bug reports about CM;
I'm unwilling to assume implicit support for even more code.]

Also, with this solution, you'd now be willing to accept what you weren't
above: Anyone wanting to use the GA functionality would indeed have to
"incorporate" the whole of "Commons Math" (CM).
Of course, the latter could be modularized, but this would only mitigate the
issue, as any release of the GA functionality could then be held up by
problems in other parts of CM (which nobody has been able to consistently
support for more than 5 years now).

>        The same library can be reused in ml or neural network libraries as
> a dependency.

It is the other way around:  The development version of CM currently
depends on "lower-level" components.
Furthermore, right now its (embryonic) "machine learning" functionality
has no substantial dependency on code outside the "o.a.c.math4.ml"
package.

>        Kindly share further views on this.

In summary, to be clarified:
 (1) Why not Spark?  [At least post over there (?).]
 (2) Further develop a monolithic CM?  [Who will do it?]
 (3) Modularize CM? [Who will do it?]
 (4) New component (with another name) with the proposed contents?

To make things clear from my side:  As a *user*, I currently have some
stake in having a clean, independent "ml" component or an independent
"sofm" module.  So I could do (4).  Or help with (3), on the condition that
*other* people get things moving.

Regards,
Gilles

>
> Thanks & Regards
> --Avijit Basak
>
> On Wed, 10 Feb 2021 at 19:49, Gilles Sadowski <gillese...@gmail.com> wrote:
>
> > On Wed, 10 Feb 2021 at 13:19, sebb <seb...@gmail.com> wrote:
> > >
> > > Likewise, commons-ml is too cryptic.
> > >
> > > Also, the Spark project has a machine-learning library:
> > >
> > > https://spark.apache.org/mllib/
> >
> > Thanks for the pointer.
> >
> > >
> > > Maybe that would be better home?
> >
> > On the face of it, probably.
> > [For sure, Avijit should comment on the suggestion.]
> >
> > On the other hand, "Commons" is the place where one can pick "bare
> > bone" implementations, and add the functionality to one's application
> > without necessarily complying with an overarching framework.
> > [I don't mean that framework compliance is bad; quite the contrary, it is
> > hopefully the result of a thorough reflection by experts.  But ... cf. the
> > numerous "no-dependency" discussions ...]
> >
> > Actually, concerning Avijit's proposed contribution, didn't I say:[1]
> > ---CUT---
> > Thus, I think that we must assess whether the "genetic algorithms"
> > functionality has a reasonable future within "Apache Commons" (i.e.
> > potential users and contributors) while there exist other libraries that
> > seem much more advanced for any serious usage.
> > ---CUT---
> >
> > > I'm also a bit concerned as to whether there are sufficient developers
> > > here with knowledge of the ML domain to be able to support the code in
> > > the future.
> >
> > An interesting point; by no means a new one (see e.g. [2]).
> >
> > Isn't it the same point I've been making about "Commons Math" (CM)?
> > There have been no releases because nobody here is able (or willing)
> > to support it.
> >
> > Concerning the support of the purported "machinelearning" component:
> > 1. Package
> >         org.apache.commons.math4.ml.neuralnet
> >     * I've written it entirely and I have applications that depend on
> >       it (and I cannot assume that I could easily switch to, or port
> >       it to, Spark), so I can reasonably ensure that it would be
> >       supported.
> > 2. Package
> >         org.apache.commons.math4.ml.clustering
> >     * Functionality is mentioned in Spark's "mllib" user guide.
> >     * When a new feature was last contributed[3], it was noticed[4][5][6]
> >       that improvements were needed (but there was no follow-up).
> >     * I have an application that depends on it (from CM v3.6.1) but I
> >       wouldn't support it if shipped in CM v4.0.
> > 3. Package
> >         org.apache.commons.math4.genetics
> >     * Part of my "end-of-study" project consisted in a GA implementation.
> >       I've never used the CM implementation, and I don't deny that there
> >       could be perfectly fine uses of it but, just looking at the code,
> >       it seems obvious that it cannot compete feature-wise with other
> >       libraries out there.
> >     * I suggested long ago that, without anyone supporting it actively
> >       (and no known user community), it should be dropped from CM.
> >     * Avijit expressed a willingness to improve the functionality:  Is
> >       this enough for the PMC to create a new component?  From the
> >       experience with the "clustering" package mentioned above, I'd tend
> >       to think (unfortunately) that it isn't.  He should first explore
> >       whether the Spark community is interested in having the GA
> >       functionality moved over there.
> >
> > Gilles
> >
> > [1] https://issues.apache.org/jira/browse/MATH-1563
> > [2] https://markmail.org/message/26yxj5vhysdsoety
> > [3] https://issues.apache.org/jira/projects/MATH/issues/MATH-1509
> > [4] https://issues.apache.org/jira/projects/MATH/issues/MATH-1524
> > [5] https://issues.apache.org/jira/projects/MATH/issues/MATH-1528
> > [6] https://issues.apache.org/jira/projects/MATH/issues/MATH-1526
> >
> > >
> > > On Wed, 10 Feb 2021 at 08:27, Emmanuel Bourg <ebo...@apache.org> wrote:
> > > >
> > > > -1 for commons-ml for the same reasons.
> > > >
> > > > What about commons-machine-learning or commons-math-learning? The
> > > > latter is as long as commons-configuration.
> > > >
> > > > Emmanuel Bourg
> > > >
> > > >
> > > > On 2021-02-10 03:27, Ralph Goers wrote:
> > > > > -1 on commons-ml as the name. My first thought is such a repo would
> > > > > hold stuff related to mailing lists. Then again maybe it contains
> > > > > stuff relating to markup languages. Maybe it is Apache’s version of
> > > > > the ML Programming Language [1].
> > > > >
> > > > > However, I wouldn’t be -1 on commons-math-ml, although at best I
> > > > > would be +0 since it is still not obvious what it would contain.
> > > > >
> > > > > Ralph
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> --
> Avijit Basak
