There's quite a good description of WEKA and its capabilities on the course page for a module I took this year: http://www.inf.ed.ac.uk/teaching/courses/dme/html/software2.html

It's a general suite of data-mining tools rather than a tool aimed at one specific task, as Taste is (plus it isn't implemented for parallel processing, which could be a problem when scaling up). From the link above:

   * *Advantages*: The obvious advantage of a package like Weka is that
     *a whole range of data preparation, feature selection and data
     mining algorithms are integrated*. This means that only one data
     format is needed, and trying out and comparing different
     approaches becomes really easy. The package also comes with *a
     GUI*, which should make it easier to use.

   * *Disadvantages*: Probably the most important disadvantage of data
     mining suites like this is that *they do not implement the newest
     techniques*. For example the MLP implemented has a very basic
     training algorithm (backprop with momentum), and the SVM only uses
     polynomial kernels, and does not support numeric estimation. ...
     *A third possible problem is scaling*. For difficult tasks on
     large datasets, the running time can become quite long, and java
     sometimes gives an OutOfMemory error. This problem can be reduced
     by using the '-mx/x/' option when calling java, where /x/ is
     memory size (eg '50m'). For large datasets it will always be
     necessary to reduce the size to be able to work within reasonable
     time limits. A fourth problem is that *the GUI does not implement
     all the possible options*. Things that could be very useful, like
     scoring of a test set, are not provided in the GUI, but can be
     called from the command line interface. So sometimes it will be
     necessary to switch between GUI and command line. Finally, *the
     data preparation and visualisation techniques offered might not be
     enough*. Most of them are very useful, but I think in most data
     mining tasks you will need more to get to know the data well and
     to get it in the right format.
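
On the OutOfMemory point in the quote: on the JVMs I've used, the heap option is spelled -Xmx (e.g. -Xmx512m), so treat the '-mx' spelling above as the older form. If you want to confirm that the setting actually took effect before starting a long run, a tiny check like the one below will do. This is only a sketch: the class name HeapCheck is made up for illustration, and the only API it relies on is the standard Runtime.maxMemory().

```java
public class HeapCheck {
    // Maximum heap the JVM will attempt to use, in whole megabytes.
    // HeapCheck is a hypothetical helper name, not part of WEKA or Taste.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Report the limit so it can be compared with the -Xmx value passed in.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it as, say, java -Xmx512m HeapCheck. The reported figure is usually close to the requested size rather than exactly equal to it, since the JVM may reserve part of the heap for itself.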


Hope that's helpful :-)


Satish Dandu wrote:
Hi,
   Recently I started using Taste. It's easy to set up and it really
looks good in terms of picking recommendations (demo using the GroupLens
dataset for Netflix data).  I also went through WEKA. Now my question:
is there any difference between WEKA and Taste (as both are open source
machine learning software)? What advantages can we get by using Taste
(in addition to Hadoop integration)?

Thanks


