Algorithms (MAHOUT) edited by David Hall
Page: http://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
Changes:
http://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=119837&originalVersion=1&revisedVersion=2
Comment:
---------------------------------------------------------------------
Entry for LDA
Change summary:
---------------------------------------------------------------------
Entry for LDA
Content:
---------------------------------------------------------------------
h2. Algorithms
This section contains links to information, examples, use cases, etc. for the
various algorithms we intend to implement. Click the individual links to learn
more. The initial algorithm descriptions have been copied here from the
original project proposal. The algorithms are grouped by the application
setting they can be used for. Where an algorithm has multiple applications, the
version presented in the paper was chosen; versions as implemented in our
project will be added as soon as we are working on them.
Original Paper: [Map Reduce for Machine Learning on
Multicore|http://www.cs.stanford.edu/people/ang//papers/nips06-mapreducemulticore.pdf]
Papers related to Map Reduce:
* [Evaluating MapReduce for Multi-core and Multiprocessor
Systems|http://csl.stanford.edu/~christos/publications/2007.cmp_mapreduce.hpca.pdf]
* [Map Reduce: Distributed Computing for Machine
Learning|http://www.icsi.berkeley.edu/~arlo/publications/gillick_cs262a_proj.pdf]
For papers, videos, and books related to machine learning in general, see
[Machine Learning Resources].
Every algorithm is marked with its status. Algorithms marked _integrated_ have
an implementation integrated into the development version of Mahout. Algorithms
that are currently being developed are annotated with a link to the JIRA issue
that deals with the specific implementation. Usually these issues already
contain patches, which are more or less extensive depending on how much work
has been spent on the issue so far. Algorithms that have not been touched so
far are marked as _open_.
[What, When, Where, Why (but not How or Who)] \- Community tips, tricks, etc.
for when to use which algorithm in what situations, what to watch out for in
terms of errors. That is, practical advice on using Mahout for your problems.
h3. Classification
A general introduction to the most common text classification algorithms can be
found at Google Answers:
[http://answers.google.com/answers/main?cmd=threadview&id=225316]. For
information on the algorithms implemented in Mahout (or scheduled for
implementation) please visit the following pages.
[Logistic Regression] (open)
[Bayesian]
[Support Vector Machines] (SVM) (open:
[MAHOUT-14|http://issues.apache.org/jira/browse/MAHOUT-14])
[Perceptron and Winnow] (open:
[MAHOUT-85|http://issues.apache.org/jira/browse/MAHOUT-85])
[Neural Network] (open)
[Random Forests] (open)
h3. Clustering
[Reference Reading]
[Canopy Clustering] (integrated)
[k-Means] (integrated)
[Fuzzy K-Means] ([MAHOUT-74|https://issues.apache.org/jira/browse/MAHOUT-74])
(integrated)
[Expectation Maximization] (EM)
([MAHOUT-28|http://issues.apache.org/jira/browse/MAHOUT-28])
[Mean Shift] (integrated)
[Hierarchical Clustering]
([MAHOUT-19|http://issues.apache.org/jira/browse/MAHOUT-19])
[Dirichlet Process Clustering]
([MAHOUT-30|http://issues.apache.org/jira/browse/MAHOUT-30])
[Latent Dirichlet Allocation]
([MAHOUT-123|http://issues.apache.org/jira/browse/MAHOUT-123])
h3. Regression
[Locally Weighted Linear Regression] (open)
h3. Dimension reduction
[Principal Components Analysis] (PCA) (open)
[Independent Component Analysis] (open)
[Gaussian Discriminative Analysis] (GDA) (open)
h3. Evolutionary Algorithms
See also: [MAHOUT-56
(integrated)|http://issues.apache.org/jira/browse/MAHOUT-56]
Here you will find information, examples, use cases, etc. related to
Evolutionary Algorithms.
Introductions and Tutorials:
* [Evolutionary Algorithms
Introduction|http://www.geatbx.com/docu/algindex.html]
* [How to distribute the fitness evaluation using Mahout.GA|Mahout.GA.Tutorial]
Examples:
* [Traveling Salesman]
* [Class Discovery]
h3. Non map reduce algorithms
Some algorithms and applications that appeared on the mailing list have not
been published in map reduce form so far. As we do not restrict ourselves to
Hadoop-only versions, these proposals are listed here.
[Hidden Markov Models] (HMM) (open)
[Recommendation Learning] (integrated)