Hmmm.... I don't think that the array formats used by Spark are compatible with the formats used by numpy.
I could be wrong, but even if there isn't outright incompatibility, there is
likely to be significant overhead in format conversion. (A quick sketch of
where that conversion would happen is appended below the quoted thread.)

On Tue, Oct 21, 2014 at 6:12 PM, Vibhanshu Prasad <[email protected]> wrote:

> Actually, Spark is also available in Python, so Spark users have an upper
> hand over traditional Mahout users. This applies to all of Python's
> libraries (including numpy).
>
> On Wed, Oct 22, 2014 at 3:54 AM, Ted Dunning <[email protected]> wrote:
>
> > On Tue, Oct 21, 2014 at 3:04 PM, Mahesh Balija <[email protected]> wrote:
> >
> > > I am trying to differentiate between Mahout and Spark; here is a small
> > > list:
> > >
> > > Feature                    Mahout   Spark
> > > Clustering                 Y        Y
> > > Classification             Y        Y
> > > Regression                 Y        Y
> > > Dimensionality Reduction   Y        Y
> > > Java                       Y        Y
> > > Scala                      N        Y
> > > Python                     N        Y
> > > Numpy                      N        Y
> > > Hadoop                     Y        Y
> > > Text Mining                Y        N
> > > Scala/Spark Bindings       Y        N/A
> > > Scalability                Y        Y
> >
> > Mahout doesn't actually have strong features for clustering,
> > classification, and regression. Mahout is very strong in recommendations
> > (which you don't mention) and dimensionality reduction.
> >
> > Mahout does support Scala in the development version.
> >
> > What do you mean by support for Numpy?
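For illustration, here is a minimal sketch of where such a conversion would
happen, assuming PySpark's MLlib vector types (the data and app name below
are made up for the example):

    # Sketch: moving rows from a PySpark RDD of MLlib vectors into one local
    # numpy matrix. collect() ships every partition back to the driver, and
    # np.array() copies the rows into a single ndarray; that serialization
    # plus copying is the kind of conversion overhead I mean above.
    import numpy as np
    from pyspark import SparkContext
    from pyspark.mllib.linalg import Vectors

    sc = SparkContext(appName="spark-numpy-conversion")

    rdd = sc.parallelize([Vectors.dense([1.0, 2.0, 3.0]),
                          Vectors.dense([4.0, 5.0, 6.0])])

    # DenseVector.toArray() exposes each row as a numpy array; np.array()
    # then copies them into one contiguous matrix on the driver.
    local_matrix = np.array([v.toArray() for v in rdd.collect()])
    print(local_matrix.shape)  # (2, 3)

    sc.stop()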
