Hmmm.... I don't think that the array formats used by Spark are compatible with the formats used by numpy.
I could be wrong, but even if there isn't outright incompatibility, there is
likely to be significant overhead in format conversion. (A quick sketch of
where that conversion would happen is appended below the quoted thread.)

On Tue, Oct 21, 2014 at 6:12 PM, Vibhanshu Prasad <[email protected]> wrote:

> Actually, Spark is also available in Python, so Spark users have an upper
> hand over traditional Mahout users. This applies to all of Python's
> libraries (including numpy).
>
> On Wed, Oct 22, 2014 at 3:54 AM, Ted Dunning <[email protected]> wrote:
>
> > On Tue, Oct 21, 2014 at 3:04 PM, Mahesh Balija <[email protected]> wrote:
> >
> > > I am trying to differentiate between Mahout and Spark; here is a small
> > > list:
> > >
> > > Feature                    Mahout   Spark
> > > Clustering                 Y        Y
> > > Classification             Y        Y
> > > Regression                 Y        Y
> > > Dimensionality Reduction   Y        Y
> > > Java                       Y        Y
> > > Scala                      N        Y
> > > Python                     N        Y
> > > Numpy                      N        Y
> > > Hadoop                     Y        Y
> > > Text Mining                Y        N
> > > Scala/Spark Bindings       Y        N/A
> > > Scalability                Y        Y
> >
> > Mahout doesn't actually have strong features for clustering,
> > classification, and regression. Mahout is very strong in recommendations
> > (which you don't mention) and dimensionality reduction.
> >
> > Mahout does support Scala in the development version.
> >
> > What do you mean by support for Numpy?
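For illustration, here is a minimal sketch of where such a conversion would
happen, assuming PySpark's MLlib vector types (the data and app name below
are made up for the example):

    # Sketch: moving rows from a PySpark RDD of MLlib vectors into one local
    # numpy matrix. collect() ships every partition back to the driver, and
    # np.array() copies the rows into a single ndarray; that serialization
    # plus copying is the kind of conversion overhead I mean above.
    import numpy as np
    from pyspark import SparkContext
    from pyspark.mllib.linalg import Vectors

    sc = SparkContext(appName="spark-numpy-conversion")

    rdd = sc.parallelize([Vectors.dense([1.0, 2.0, 3.0]),
                          Vectors.dense([4.0, 5.0, 6.0])])

    # DenseVector.toArray() exposes each row as a numpy array; np.array()
    # then copies them into one contiguous matrix on the driver.
    local_matrix = np.array([v.toArray() for v in rdd.collect()])
    print(local_matrix.shape)  # (2, 3)

    sc.stop()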
