Okay, it seems that methodology is a bit too advanced for me. I would go
with framework/engineering tasks. So should I start with fixing the mahout
spark shell?

On Tue, Jun 16, 2015 at 11:20 AM, Dmitriy Lyubimov <[email protected]>
wrote:

> As I said, in methodology you can pick _anything_ that you think has merit
> and is not yet on the roadmap or done.
>
> For example, do you feel like you might research PSVM or interior point
> SVM? Actually, any flavor of non-linear SVM that is different from a simple
> hinge loss?
> Do you think you can fit it in our algebraic engine?
>
> I think we also need a fair amount of porting of MR methods, like seq2sparse
> and cvb0 lda.
>
> I would still look at framework performance tasks; they are badly needed.
> Just today I listened to a talk about a flyby matrix multiplication approach
> for Spark for medium-sized matrices, which probably beats ours, since even
> though we do not use cartesian (god forbid), our implementation is somewhat
> closer to what the speaker described as a "massively mapside join". That,
> according to him, is eventually supposed to gain over flyby multiply, but
> there is a fair number of tasks where it does not.
>
> similarly bolting on hardware libraries for in-core operations is still a
> big undecided issue.
>
> unfortunately a lot of known outstanding issues are still about
> engineering.
>
>
> On Mon, Jun 15, 2015 at 10:27 PM, Rohit Shinde <
> [email protected]>
> wrote:
>
> > I would prefer some methodology work if it falls within my capabilities.
> > If it doesn't then your suggestion is a good one and I'll take it up.
> > Substantial according to me means a task where I can get quite familiar
> > with as much of the code base as possible.
> >
> > On Tue, Jun 16, 2015 at 10:49 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > I gave you 3 types of problems. Define substantial.
> > >
> > > Say, does fixing mahout spark shell sound substantial enough?
> > >
> > > On Mon, Jun 15, 2015 at 10:11 PM, Rohit Shinde <
> > > [email protected]>
> > > wrote:
> > >
> > > > So do you have any suggestions for getting started? I would like to
> > > > contribute to something substantial that is going on, after getting
> > > > familiar with the required part of the codebase.
> > > >
> > > > > On Mon, Jun 15, 2015 at 11:39 PM, Dmitriy Lyubimov <[email protected]>
> > > > > wrote:
> > > >
> > > > > i don't think there's a formal list published anywhere.
> > > > >
> > > > > There is an informal roadmap.
> > > > >
> > > > > > The contributions, the way I see it, can mainly be in 3 areas: (1)
> > > > > > project support issues, for example fixing shell compatibility with
> > > > > > Spark 1.3; (2) framework support problems, for example performance
> > > > > > and integrating 3rd-party hardware-accelerated linalg libraries; (3)
> > > > > > methodology work.
> > > > > >
> > > > > > We have some pending items for (1) and (2), I think, but for
> > > > > > methodology items (3) we simply can't compile a list of everything
> > > > > > that can possibly be done and contributed. We just don't have that
> > > > > > much expertise, combined. No one has [1]. The way it usually works is
> > > > > > that people come up with pieces that they were missing on their own
> > > > > > for some reason, and they need to propose a methodology, a
> > > > > > parallelization strategy, maybe even a code sketch. That will all be
> > > > > > fine.
> > > > >
> > > > > [1] http://matt.might.net/articles/phd-school-in-pictures/
> > > > >
> > > > > On Sun, Jun 14, 2015 at 11:49 PM, Rohit Shinde <
> > > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > > But is there a list of projects that new people could take up? I am
> > > > > > > also a student interested in contributing to the machine learning
> > > > > > > and data mining parts of Apache Mahout.
> > > > > >
> > > > > > I am familiar with Scala and Java, Python and C++.
> > > > > >
> > > > > > What can I contribute to?
> > > > > >
> > > > > > > On Mon, Jun 15, 2015 at 10:24 AM, Dmitriy Lyubimov <[email protected]>
> > > > > > > wrote:
> > > > > >
> > > > > > > > Well, we are predominantly a Scala shop now. Being fluent in Scala
> > > > > > > > seems like one prerequisite.
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Jun 13, 2015 at 1:17 AM, Sreenivas Raghavan <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > Hello everyone,
> > > > > > > >                   I am interested in contributing to mahout
> > > > project.
> > > > > I
> > > > > > am
> > > > > > > > interested in algorithms, machine learning and linear
> algebra.
> > > > Please
> > > > > > > give
> > > > > > > > me some idea as where to start and how to start. I know
> python
> > > and
> > > > > some
> > > > > > > > parts of Java, so please tell me is this knowledge of
> languages
> > > > > enough
> > > > > > > for
> > > > > > > > writing and optimizing codes
> > > > > > > > --
> > > > > > > >
> > > > > > > > *With Regards,*
> > > > > > > > *K.S.Sreenivasa Raghavan*
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
