Re: Mahout on the cloud

Dmitriy Lyubimov Thu, 23 Jul 2015 14:51:39 -0700

Mahout is moving toward being backend-agnostic: the same code is supported
on Spark or H2O.


(Disclaimer: some code is only quasi-agnostic, such as the Spark shell, and I
think some co-occurrence drivers also favor Spark over anything else. I may
be wrong.)


On Thu, Jul 23, 2015 at 2:41 PM, Ankit Goel <[email protected]> wrote:

> Thanks a lot guys.
> @Pat, is Mahout only going to support Scala in the near future? And will all
> the ML libraries come only from Spark? I did read somewhere that Mahout was
> heading in a direction where it's more of a framework that supports
> multiple ML libraries. Am I right in my understanding?
>
> On Thu, Jul 23, 2015 at 10:03 PM, Pat Ferrel <[email protected]>
> wrote:
>
> > Just to be clear, Mahout runs on AWS just fine. Dmitriy is talking about
> > support and continuance of “MapReduce”, which means Hadoop MapReduce. We
> > have been accepting only more modern engine code for more than a year,
> > so most of the modern Mahout is in Scala and runs on Spark. The
> > MapReduce paradigm is certainly supported there, but it runs on Spark, so
> > any EMR instances you create should have Spark installed.
> >
> > Amazon now supports Spark on EMR:
> > https://aws.amazon.com/blogs/aws/new-apache-spark-on-amazon-emr/
> >
> > Make sure you use the correct version of Spark with Mahout. Mahout 0.10.0
> > supports Spark 1.1.1 or earlier, Mahout 0.10.1 supports Spark 1.2.1 or
> > earlier, and the current master snapshot supports Spark 1.3 and runs on
> > Spark 1.4.
> >
> > On Jul 23, 2015, at 7:28 AM, Ankit Goel <[email protected]> wrote:
> >
> > Thanks for the heads up, Dmitriy. That's exactly the kind of warning I was
> > looking for. I don't have any experience implementing MR yet (I understand
> > the algorithm perfectly), so this is a great heads up. Any advice or
> > warnings on Hadoop installations and versions?
> >
> > On Thu, Jul 23, 2015 at 6:34 AM, Dmitriy Lyubimov <[email protected]>
> > wrote:
> >
> > > MapReduce things are entering de facto end-of-life. It's not that we
> > > specifically don't want to support them; de facto, nobody bothers to
> > > support them, and the risks are especially high with new versions of
> > > Hadoop and EMR.
> > >
> > > That said, we'd be grateful for any guide about doing this in EMR.
> > >
> > > On Wed, Jul 22, 2015 at 5:53 PM, Ankit Goel <[email protected]>
> > > wrote:
> > >
> > >> Hi,
> > >> After my runs on my laptop, I'm ready to port my work to the cloud.
> > >> I'm planning to use Amazon. One thing I noticed when I started with
> > >> Mahout was that a lot of things were left unsaid on the site/wiki, and
> > >> they took me a lot of time to figure out. Pitfalls, if I may call them
> > >> that. I will primarily be using clustering on the cloud, so the code to
> > >> accept new data and run it is what I have for now.
> > >>
> > >> So before I port to the cloud, are there any things I should beware of
> > >> or look out for? Is AWS fine with Mahout? Are there any configurations
> > >> I should remember? Any advice on implementation to ease my transition
> > >> and run Mahout 24 hours a day? Thanks
> > >>
> > >> --
> > >> Regards,
> > >> Ankit Goel
> > >> http://about.me/ankitgoel
> > >>
