If you want to contribute to Mahout, you obviously want to speak to the
Mahout dev audience. Spark is not yet officially integrated into Mahout, but
we are actively contemplating it, and I have been doing some work off SVN, e.g.
https://issues.apache.org/jira/browse/MAHOUT-1346,
https://issues.apache.org/jira/browse/MAHOUT-1365 and some other algorithm
ports.


On Tue, Jan 7, 2014 at 1:30 PM, Oleksandr Olgashko <[email protected]> wrote:

> I haven't worked with Spark before (just read their overview page).
> Should I ask any questions that arise here, or should I switch to Spark's
> mailing lists?
>
>
> 2014/1/7 Sebastian Schelter <[email protected]>
>
> > IIRC that paper talks about MapReduce on a shared-memory system, not on
> > a shared-nothing system such as the Hadoop implementation.
> >
> > As a rule of thumb, iterations in Hadoop are about 10x slower than in
> > systems such as Giraph, Spark or Stratosphere.
> >
> > --sebastian
> >
> > On 07.01.2014 22:01, Oleksandr Olgashko wrote:
> > > What can you say about
> > > http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf ?
> > >
> > >
> > > 2014/1/7 Dmitriy Lyubimov <[email protected]>
> > >
> > >> Yes. Create working notes on how exactly to do that. (Or, what I am
> > >> pushing you towards a bit: Spark, since MR is not really an
> > >> iteration-friendly platform and it looks like iterations are needed in
> > >> fastICA.)
> > >>
> > >>
> > >> On Tue, Jan 7, 2014 at 12:38 PM, Oleksandr Olgashko <
> > >> [email protected]> wrote:
> > >>
> > >>> So the problem is to adapt ICA for MR, am I right?
> > >>>
> > >>>
> > >>>
> > >>> 2014/1/7 Dmitriy Lyubimov <[email protected]>
> > >>>
> > >>>> I already looked at fastICA. While it claims to be parallel, this work
> > >>>> doesn't exactly map it onto the MapReduce (or Spark) paradigm and, from
> > >>>> what I can recollect, still implies outer iterations for fitting
> > >>>> principal component vectors one by one. That means it is probably
> > >>>> already MR-unfriendly by construction; Spark may show far better
> > >>>> promise here, but a working notes document is still required to show
> > >>>> how exactly. That's what I mean.
> > >>>>
> > >>>>
> > >>>> On Tue, Jan 7, 2014 at 1:35 AM, Oleksandr Olgashko <[email protected]> wrote:
> > >>>>
> > >>>>> Could you please take a look at this article?
> > >>>>> http://cran.r-project.org/web/packages/fastICA/fastICA.pdf
> > >>>>> I have learned that re-inventing the wheel is wrong for most problems,
> > >>>>> and that a better solution usually exists. However, it often needs
> > >>>>> some "grinding", so I may research those ways, in case of approval.
> > >>>>>
> > >>>>> About Scala: unfortunately, I have never worked with this language
> > >>>>> before, but wanted to. I'd like to fill that gap in my skills, but I
> > >>>>> don't know exactly where to start.
> > >>>>>
> > >>>>>
> > >>>>> 2014/1/7 Dmitriy Lyubimov <[email protected]>
> > >>>>>
> > >>>>>> ICA is a very useful technique for dimensionality reduction. I believe
> > >>>>>> Mahout would benefit from it; however, the challenges are fairly
> > >>>>>> significant in terms of a proven parallelization technique and
> > >>>>>> acceptable efficacy, which makes it hard to just "implement" (I am not
> > >>>>>> familiar at this point with any concrete work on parallel ICA). So,
> > >>>>>> like I said before, I am not very hopeful. However, if one never
> > >>>>>> tries, then nothing will ever get done. Who knows.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Mon, Jan 6, 2014 at 2:18 PM, Isabel Drost-Fromm <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> On Mon, Jan 06, 2014 at 10:40:45PM +0200, Oleksandr Olgashko wrote:
> > >>>>>>>> Returning to the question about a topic to work on, asked 2 months
> > >>>>>>>> ago. What algorithm should I implement?
> > >>>>>>>
> > >>>>>>> To be quite frank with you: None. Personally I'd rather see
> > >>>>>>> improvements (in terms of documentation, integration, stabilisation,
> > >>>>>>> performance optimisation) of the existing Mahout source.
> > >>>>>>>
> > >>>>>>> Feel free to take a closer look at the thread concerning "getting
> > >>>>>>> involved" that we had around Christmas last year for inspiration.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Isabel
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>
