Andrew,
Sebastian and I were talking yesterday and guessing that you would be
interested in this soon. Glad to know the world is as expected.
Yes. This needs to happen, at least at a very conceptual level. For
instance, for classifiers, I think that we need to have something like:

- progressively train against a batch of data
  Questions: Should this do multiple epochs? Throw an exception if
  on-line training is not supported? Throw an exception if too little
  data is provided?
- classify a batch of data
- serialize a model
- de-serialize a model

Note that a batch as listed above can be either a bunch of observations or
just one.
Question: does this handle the following cases:
- naive bayes
- SGD trained on continuous data
- batch trained <mumble> classifiers
- downpour type classifier training
?
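
To make the four operations above concrete, here is a rough sketch of what
such a trait might look like. Every name in it (Observation, Classifier,
ModelIO, NearestCentroid) is hypothetical and none of it is existing Mahout
API. The nearest-centroid model is only a toy stand-in that happens to
support progressive training, and plain Java serialization is an assumption
made to keep the sketch short. The open questions (epochs, rejecting on-line
training, too little data) are deliberately left unanswered.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable

// Hypothetical types for illustration only; not part of Mahout.
final case class Observation(features: Vector[Double], label: Int)

trait Classifier extends java.io.Serializable {
  // Progressively train against a batch; a batch may hold a single observation.
  def train(batch: Seq[Observation]): Unit

  // Classify a batch; returns one label per input row, in order.
  def classify(batch: Seq[Vector[Double]]): Seq[Int]
}

// Assumption: Java serialization is good enough for a sketch.
object ModelIO {
  def serialize(model: Classifier): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(model) finally out.close()
    bytes.toByteArray
  }

  def deserialize(data: Array[Byte]): Classifier = {
    val in = new ObjectInputStream(new ByteArrayInputStream(data))
    try in.readObject().asInstanceOf[Classifier] finally in.close()
  }
}

// Toy model: keeps a running per-label feature sum and count, so each call
// to train simply adds to what earlier batches contributed.
class NearestCentroid(dim: Int) extends Classifier {
  private val sums = mutable.Map.empty[Int, Array[Double]]
  private val counts = mutable.Map.empty[Int, Long]

  def train(batch: Seq[Observation]): Unit = {
    for (obs <- batch) {
      require(obs.features.length == dim, "wrong feature dimension")
      val s = sums.getOrElseUpdate(obs.label, new Array[Double](dim))
      for (i <- 0 until dim) s(i) += obs.features(i)
      counts(obs.label) = counts.getOrElse(obs.label, 0L) + 1L
    }
  }

  def classify(batch: Seq[Vector[Double]]): Seq[Int] = {
    if (sums.isEmpty) throw new IllegalStateException("model has not been trained")
    batch.map { x =>
      // Pick the label whose centroid has the smallest squared distance to x.
      sums.keys.minBy { label =>
        val n = counts(label).toDouble
        val s = sums(label)
        (0 until dim).map { i => val d = x(i) - s(i) / n; d * d }.sum
      }
    }
  }
}
```

Whether something like downpour-style training fits behind the same train
signature is exactly the question above; the sketch does not settle it.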
On Wed, May 28, 2014 at 6:25 PM, Andrew Palumbo <[email protected]> wrote:
> This may be somewhat tangential to this thread, but would now be a good
> time to start laying out some scala traits for
> Classifiers/Clusterers/Recommenders? I am totally scala-naive, but have
> been trying to keep up with the discussions.
>
> I don't know if this is premature but it seems that now that the DSL data
> structures have been at least sketched out if not fully implemented, it
> would be useful to have these in place before people start porting too much
> over. It might be helpful in bringing in new contributions as well.
>
> It could also help regarding people's questions of integrating a future
> wrapper layer.
>
>
>
> > From: [email protected]
> > Date: Wed, 28 May 2014 17:10:43 -0700
> > Subject: Re: do we really need scala still
> > To: [email protected]
> >
> > +1
> >
> > Let's use a successful scala model as a suggestion about where to go. It
> > seems plausible that Java could emulate the building of a lazy DSL
> logical
> > plan and then poke it in plausible ways with the addition of a wrapper
> > layer. But that only helps if the Scala layer succeeds.
> >
> >
> >
> > On Tue, May 27, 2014 at 10:56 AM, Dmitriy Lyubimov <[email protected]
> >wrote:
> >
> > > Also, i think that this is leaning towards false dilemma fallacy.
> Scala and
> > > java models could happily exist at the same time and hopefully, minimal
> > > fragmentation of the project if done with precision and care.
> > >
> > >
> > > On Tue, May 27, 2014 at 10:46 AM, Dmitriy Lyubimov <[email protected]
> > > >wrote:
> > >
> > > >
> > > > not sure there's much sense in taking user survey if we can't act on
> > > this.
> > > > In our situation, unfortunately, we don't have that many ideas to
> choose
> > > > from, so there's not much wiggle room imo. It is more like
> reinforcement
> > > > > learning -- stuff that doesn't get used or supported just dies. That's
> > > > > it.
> > > > Scala bindings, though thumb up'd internally, are yet to earn this
> status
> > > > externally. In that sense we always have been watching for
> use/support,
> > > > that's why we culled out tons of stuff. Nothing changes going
> forward (at
> > > > least at this point). If we have tons of new ideas/contributions,
> then it
> > > > may be different. What is weak, dies on its own pretty evidently
> without
> > > > much extra effort.
> > > >
> > > >
> > > > On Tue, May 27, 2014 at 10:32 AM, Pat Ferrel <[email protected]>
> > > wrote:
> > > >
> > > >> We are asking that anyone using Mahout as a lib or in the DSL-shell
> to
> > > > >> learn Scala. While I still think it’s the right idea, users may
> > > disagree.
> > > >> We should probably either solicit comments or at least keep an eye
> on
> > > >> reactions to this. Spark took this route when the question was even
> > > more in
> > > >> doubt and so is at least partially supporting multiple bindings.
> > > >>
> > > >> Not sure how far we want to carry this but we could supply Java
> bindings
> > > >> to the CLI-type things pretty easily.
> > > >>
> > > >>
> > > >> On May 26, 2014, at 2:43 PM, Dmitriy Lyubimov <[email protected]>
> > > wrote:
> > > >>
> > > >> Well, first, functional programming in java8 is about 2-3 years
> late to
> > > >> the
> > > >> scene. So the reasoning along the lines, hey, we already are using
> tool
> > > A,
> > > >> and now tool B is available which is almost as good as A, so let's
> > > migrate
> > > >> to B, is fallible. Tool B must demonstrate not just matching
> > > capabilities,
> > > > >> but far superior ones, to justify the cost of such a migration.
> > > >>
> > > > >> Second, as others pointed out, java 8 doesn't really match scala, not yet
> > > >> anyway. One important feature of scala bindings work is proper
> operator
> > > >> overload (R-like DSL). That would not be possible to do in java 8,
> as it
> > > > >> stands. Yes, as others pointed out, it makes things concise, but most
> > > >> importantly, it also makes things operation-centric and eliminates
> > > nested
> > > >> calls pile-up.
> > > >>
> > > > >> Third, as it stands today, it would also present a problem from the
> > > Spark
> > > >> integration point of view. Spark does have java bindings, but first,
> > > they
> > > >> are underdefined (you can check spark list for tons of postings
> about
> > > >> missing equivalent capability), and they are certainly not
> > > java-8-vetted.
> > > >> So java api in Spark for java 8 purposes, as it stands, is a moot
> point.
> > > >>
> > > > >> There are also a number of other goodies and clashes that exist -- use
> of
> > > >> scala collections vs. Java collections, clean functional type
> syntax,
> > > >> magic
> > > >> methods, partially defined functions, case class matchers,
> implicits,
> > > view
> > > >> and context bounds etc. Etc., all that sh$tload of acrobatics that
> comes
> > > > >> actually very handy in existing implementations and has no
> substitute in
> > > >> Java 8.
> > > >> On May 25, 2014 12:48 PM, "bandi shankar" <[email protected]>
> > > wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > > >> > I was just thinking: do we still need Scala, since Java 8 has
> > > > >> > (probably) all the kinds of features that Scala provides?
> > > > >> > Since I am new to the group, I am just wondering why not move Mahout
> > > > >> > away from Scala. Is there any specific reason to adopt Scala?
> > > >> >
> > > >> > Bandi
> > > >> >
> > > >>
> > > >>
> > > >
> > >
>
>