Hi all,
I am a co-developer of RapidMiner (formerly Yale,
http://www.rapidminer.com). Isa Drost pointed the Mahout project out to
me. We are currently working on a distributed version of RapidMiner that
will run in a cluster. Our focus, however, is very different from
Mahout's. We are focusing on distributed databases and highly
complex tasks, such as evolutionary computing for feature selection or
parameter optimization. These tasks require many training and
evaluation cycles on variants of the same data mining task (e.g. using
different feature sets). Therefore, we decided not to use map/reduce for
this kind of application, but rather more traditional methods of
distributed data mining, task distribution and scheduling.
Creating a map/reduce-based data mining library/system is certainly highly
relevant. I'm looking forward to the first results of Mahout! If you
need any assistance, let me know.
Cheers,
Michael
- Distributed RapidMiner Michael Wurst