Re: Mahout Vs Spark

Brian Dolan Wed, 22 Oct 2014 05:10:04 -0700

Sing it, brother!  I miss FP Growth as well.  Once the Scala bindings are in, 
I'm hoping to work up some time series methods.


On Oct 21, 2014, at 8:00 PM, Lee S <[email protected]> wrote:

> As a developer choosing between Mahout and MLlib, I have some thoughts
> below.
> Mahout has no decision tree algorithm, but MLlib has the components for
> building one, such as the Gini index and information gain. I also think
> Mahout could add frequent pattern mining algorithms, which are very
> important for feature selection and statistical analysis.
> MLlib has no frequent pattern mining algorithms.
> P.S. Why was the FP-Growth algorithm removed in version 0.9?
> 
> 2014-10-22 9:12 GMT+08:00 Vibhanshu Prasad <[email protected]>:
> 
>> Actually, Spark is also available in Python, so Spark users have the
>> upper hand over traditional Mahout users. This applies to all Python
>> libraries (including NumPy).
>> 
>> On Wed, Oct 22, 2014 at 3:54 AM, Ted Dunning <[email protected]>
>> wrote:
>> 
>>> On Tue, Oct 21, 2014 at 3:04 PM, Mahesh Balija <[email protected]>
>>> wrote:
>>> 
>>>> I am trying to differentiate between Mahout and Spark, here is the
>>>> small list,
>>>>
>>>>  Features                  Mahout  Spark
>>>>  Clustering                Y       Y
>>>>  Classification            Y       Y
>>>>  Regression                Y       Y
>>>>  Dimensionality Reduction  Y       Y
>>>>  Java                      Y       Y
>>>>  Scala                     N       Y
>>>>  Python                    N       Y
>>>>  Numpy                     N       Y
>>>>  Hadoop                    Y       Y
>>>>  Text Mining               Y       N
>>>>  Scala/Spark Bindings      Y       N/A
>>>>  Scalability               Y       Y
>>>> 
>>> 
>>> Mahout doesn't actually have strong features for clustering,
>>> classification and regression. Mahout is very strong in recommendations
>>> (which you don't mention) and dimensionality reduction.
>>> 
>>> Mahout does support Scala in the development version.
>>> 
>>> What do you mean by support for Numpy?
>>> 
>>
