Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: Books Tutorials and Talks
(https://cwiki.apache.org/confluence/display/MAHOUT/Books+Tutorials+and+Talks)
Edited by Grant Ingersoll:
---------------------------------------------------------------------
{toc:style=disc|indent=20px}
h1. Intro
This page is a place to put links to info about talks (past and upcoming),
tutorials, articles, books, slides, PDFs, discussions, etc. about Mahout,
Machine Learning and related technologies. No endorsements are implied or
given. Please keep all listings in alphabetical order within each section.
h1. Background Material
* [Reference Reading]
h1. Books
* [Mahout in Action|http://www.manning.com/owen/] \- Book by Sean Owen and
Robin Anil, published by Manning Publications.
* [Taming Text|http://www.manning.com/ingersoll/] \- By Grant Ingersoll and Tom
Morton, published by Manning Publications. Will have some Mahout coverage, but
by no means as complete as Mahout in Action.
* [Data Mining: Practical Machine Learning Tools and
Techniques|http://www.cs.waikato.ac.nz/~ml/weka/book.html]
* [Programming Collective
Intelligence|http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325/ref=pd_bbs_sr_1/104-1017533-9408723?ie=UTF8&s=books&qid=1214593516&sr=1-1]
* [Collective Intelligence in
Action|http://www.amazon.com/Collective-Intelligence-Action-Satnam-Alag/dp/1933988312/ref=pd_bbs_sr_3?ie=UTF8&s=books&qid=1214545249&sr=1-3]
* [Machine
Learning|http://www.amazon.com/Machine-Learning-Mcgraw-Hill-International-Edit/dp/0071154671/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1214593709&sr=8-1]
* [Pattern Recognition and Machine Learning (Information Science and
Statistics)
|http://www.amazon.com/Pattern-Recognition-Learning-Information-Statistics/dp/0387310738/ref=pd_bbs_sr_2?ie=UTF8&s=books&qid=1214593709&sr=8-2]
* [Introduction to Information
Retrieval|http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html]
* [Information Theory, Inference, and Learning Algorithms by David MacKay|
http://www.inference.phy.cam.ac.uk/itprnn/book.html]
* [Text Mining Application Programming |
http://www.amazon.com/Text-Mining-Application-Programming/dp/1584504609]
* [Algorithms of the Intelligent
Web|http://www.amazon.com/Algorithms-Intelligent-Web-Haralambos-Marmanis/dp/1933988665/ref=sr_1_1?s=books&ie=UTF8&qid=1298005918&sr=1-1]
h1. News, Articles and Tutorials
* [Apache Mahout: Scalable Machine Learning for
Everyone|http://www.ibm.com/developerworks/java/library/j-mahout-scaling/]
* [Scaling up Cassandra and Mahout with
Hadoop|http://www.acunu.com/blogs/sean-owen/scaling-cassandra-and-mahout-hadoop/]
* [Recommending (from)
Cassandra|http://www.acunu.com/blogs/sean-owen/recommending-cassandra/]
* [How to build a spam filter server with
Mahout|http://emmaespina.wordpress.com/2011/04/26/ham-spam-and-elephants-or-how-to-build-a-spam-filter-server-with-mahout/]
\- Applying classification on a live server - April 2011
* [Deploying a massively scalable recommender system with Apache
Mahout|http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/]
\- Blog post by Sebastian Schelter, April 2011
* [Apache Mahout & the commoditization of machine learning
|http://www.redmonk.com/cote/2010/11/04/makeall013/] \- Podcast interview with
Grant Ingersoll at ApacheCon 2010
* [Apache Mahout 0.4 mit neuen
Algorithmen|http://isabel-drost.de/hadoop/slides/devoxx.pdf] \- published after
the 0.4 release by heise Open/ Developer, November 2010
* [Mahout on InfoQ|http://www.infoq.com/news/2009/04/mahout] \- Interview with
Grant Ingersoll on InfoQ
* [Mahout in the Cloudera
weblog|http://www.cloudera.com/blog/2009/04/21/hadoop-uk-user-group-meeting/]
\- published after the Hadoop User Group UK meeting.
* [Mahout in the Drools
weblog|http://blog.athico.com/2008/08/machine-learning-and-apache-mahout.html]
\- Michael Neale published an article on Mahout in the Drools weblog.
* [Introducing Apache
Mahout|https://www.ibm.com/developerworks/java/library/j-mahout/index.html] \-
Grant Ingersoll - Intro to Apache Mahout focused on clustering, classification
and collaborative filtering.
Japanese translation available at:
[http://www.ibm.com/developerworks/jp/java/library/j-mahout/]
* [Flexible Collaborative Filtering In Java With Mahout
Taste|http://philippeadjiman.com/blog/2009/11/11/flexible-collaborative-filtering-in-java-with-mahout-taste/]
\- Philippe Adjiman - Quick starting guide on how to use the collaborative
filtering package of Mahout (called Taste) to quickly and flexibly create, test
and compare tailored recommendation engines.
* [Integrating Mahout with Lucene and
Solr|http://www.lucidimagination.com/blog/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/]
\- Three-part series on ways to integrate Mahout with Lucene and Solr.
h1. Links
* [Collection of links to presentations on learning
algorithms|http://www.inma.ucl.ac.be/~francois/blog/entries/entry_757.php]
h1. Coursework/Lectures
* [http://videolectures.net/mlss05us_chicago/]
* [http://videolectures.net/mlas06_pittsburgh/]
* [http://wiki.ailab.wsu.edu/ml/index.php/Main_Page]
* [Stanford Lectures on Machine Learning by Andrew
Ng|http://see.stanford.edu/see/lecturelist.aspx?coll=348ca38a-3a6d-4052-937d-cb017338d7b1]
h1. Talks
*Let's keep these in reverse chronological order, so that most recent talks are
at the top*
* [Composing Mahout clustering
jobs|http://berlinbuzzwords.de/sites/berlinbuzzwords.de/files/composing-mahout-clustering-jobs.pdf]
\- Slides from Frank Scholten at Berlin Buzzwords on June 7, 2011.
* Introduction to Collaborative Filtering using Mahout (updated) \- Talk by
Sean Owen at the London Hadoop User Group on April 14, 2011.
* [Cool Tricks with
Classifiers|http://www.meetup.com/LA-HUG/pages/Video_from_March_16th_LA-HUG_Ted_Dunning_Mahout]
\- Talk by Ted Dunning on Mahout classifiers at the Los Angeles HUG,
March 16, 2011.
* [Mahout
Hackathon|http://blog.isabel-drost.de/index.php/archives/325/apache-mahout-hackathon-berlin-2]
\- Event write-up of the first Mahout Hackathon, Berlin, March 2011.
* [Mahout
meetup|http://blog.jteam.nl/2011/01/13/announcement-lucene-nl-mahout-meetup-with-isabel-drost-feb-7/]
\- There were two talks at the Apache Mahout meetup at JTeam in Amsterdam,
February 2011. ([Intro slides|http://isabel-drost.de/hadoop/slides/jteam.pdf])
* [Mahout clustering |
http://www.fosdem.org/2011/schedule/event/mahoutclustering] \- Talk on Mahout
clustering at data dev room FOSDEM, February 2011.
* [Scaling Data Analysis with Apache Mahout |
http://strataconf.com/strata2011/public/schedule/detail/16827] \- talk on
Mahout at O'Reilly Strata, February 2011.
* [Practical Machine
Learning|http://www.slideshare.net/jaganadhg/mahout-tutorial-fossmeet-nitc] \-
Slides from Biju B and Jaganadh G, FOSSMEET-NITC, Calicut, India, February 2011.
* [Mahout at AlphaCSP's The Edge 2010
(pdf)|http://www.javaedge.com/jedge/pdf/Mahout.pdf] -
[(slideshare)|http://www.slideshare.net/arikogan/mahouts-presentation-at-alphacsps-the-edge-2010]
\- Slides from [Ariel Kogan|http://il.linkedin.com/in/arielkogan], AlphaCSP's
The Edge, December 2010.
* [Intelligent data analysis with Apache
Mahout|http://isabel-drost.de/hadoop/slides/devoxx.pdf] \- Slides from Isabel
Drost, Devoxx Antwerp, November 2010.
* [Apache Mahout
introduction|http://isabel-drost.de/hadoop/slides/codebits.pdf] \- Slides from
Isabel Drost, codebits Lisbon, November 2010.
* [Apache Mahout - Making Data Analysis
Easy|http://isabel-drost.de/hadoop/slides/apachecon_2010.pdf] \- Slides from
Isabel Drost, Apache Con US Atlanta, November 2010.
* [Practical Machine Learning|http://www.slideshare.net/jaganadhg/bck9] \-
Slides from Jaganadh G, BarCamp Kerala 9, November 2010.
* [Mahout and its new classification
framework|http://www.slideshare.net/tdunning/sdforum-11042010] \- Slides from
Ted Dunning, SDForum, November 2010.
* [Distributed Itembased Collaborative Filtering with Apache
Mahout|http://www.slideshare.net/sscdotopen/mahoutcf] \- Slides from Sebastian
Schelter, Hadoop Get Together Berlin, October 2010.
* [Hidden Markov Models for
Mahout|http://isabel-drost.de/hadoop/slides/HMM.pdf] \- Slides from Max Heimel,
Hadoop Get Together Berlin, October 2010.
* [Apache Mahout Mammoth Scale Machine Learning
|http://www.slideshare.net/robinanil/oscon-apache-mahout-mammoth-scale-machine-learning]
\- Slides from Robin Anil, OSCON 2010.
* [Intro to Apache Mahout|http://slidesha.re/9LxOIu] \- Slides from Grant
Ingersoll, RTP Semantic Web Group.
* [Case study: Biometric Databases and Hadoop
|http://www.slideshare.net/ydn/3-biometric-hadoopsummit2010] \- Slides from
Jason Trost, Hadoop Summit 2010.
* [Spam Fighting at
Yahoo|http://www.slideshare.net/hadoopusergroup/mail-antispam?from=ss_embed]
* [Web Mining with Ken
Krugler|http://www.slideshare.net/hadoopusergroup/bixo-hug-talk?from=ss_embed]
* [Keynote on intelligent
search|http://berlinbuzzwords.wikidot.com/local--files/links-to-slides/ingersoll_bbuzz2010.pdf]
\- Slides from Grant Ingersoll, Berlin Buzzwords, June 2010.
* [Simple co-occurrence-based recommendation on
Hadoop|http://berlinbuzzwords.wikidot.com/local--files/links-to-slides/owen_bbuzz2010.pdf]
\- Slides from Sean Owen, Berlin Buzzwords, June, 2010.
* [Introduction to Collaborative Filtering using
Mahout|http://berlinbuzzwords.wikidot.com/local--files/links-to-slides/scholten_bbuzz2010.odp]
\- Slides from Frank Scholten, Berlin Buzzwords, June, 2010.
* [Introduction to Scalable Machine
Learning|http://lucene.grantingersoll.com/2010/02/16/trijug-intro-to-mahout-slides-and-demo-examples/]
\- Slides and demos from Grant Ingersoll, March, 2010.
* [Mahout @ India Hadoop
Summit|http://www.scribd.com/doc/27637351/Mahout-India-Hadoop-Summit] \- Slides
from a 1 hour talk on Mahout at the India Hadoop Summit by Robin Anil, February
2010.
* [Mahout in 10
minutes|http://www.isabel-drost.de/hadoop/slides/opensourceexpo09.pdf] \-
Slides from a 10 min intro to Mahout at the Map Reduce tutorial by David Zülke
at Open Source Expo in Karlsruhe, Isabel Drost, November 2009.
* [Mahout at Apache Con US
|http://www.isabel-drost.de/hadoop/slides/apacheconus2009.pdf] \- Slides from a
talk on "Going from raw data to information" (with Mahout) at Apache Con US in
Oakland, Isabel Drost, November 2009.
* [Mahout at FrOSCon|http://www.isabel-drost.de/hadoop/slides/froscon2009.pdf]
\- Slides from a talk on Mahout at FrOSCon in Sankt Augustin, Isabel Drost,
August 2009.
* [Mahout at DAI group TU
Berlin|http://www.isabel-drost.de/hadoop/slides/dai.pdf] \- Slides from a talk
on Mahout at the DAI Laboratories TU Berlin, Isabel Drost, July 2009.
* [Machine Learning course at HPI
Potsdam|http://www.hpi.uni-potsdam.de/naumann/lehre/ss_09/mapreduce_algorithms_on_hadoop.html]
that relies on Hadoop for efficient implementation. ([Some
slides|http://www.isabel-drost.de/hadoop/slides/ewen.pdf] that try to explain
why students taking this course should have a look at Mahout and participate
in it.)
* [Mahout at Machine Learning Group TU
Berlin|http://www.isabel-drost.de/hadoop/slides/ulf.pdf] \- Slides from a talk
on Hadoop with some detour to Mahout at the Machine Learning Group of Prof. Dr.
Klaus-Robert Müller at TU Berlin, Isabel Drost, June 2009.
* [Mahout at DIMA TU
Berlin|http://www.isabel-drost.de/hadoop/slides/dima.pdf] \- Slides from
the research colloquium at DIMA (Fachgebiet Datenbanksysteme und
Informationsmanagement, Prof. Dr. rer. nat. Volker Markl) TU Berlin, Isabel
Drost, May 2009.
* [Mahout at Google Zürich|http://www.isabel-drost.de/hadoop/slides/google.pdf]
\- Slides from a Google tech-talk on the past, present and future of Mahout,
Isabel Drost, May 2009.
* [Hadoop user group
UK|http://static.last.fm/johan/huguk-20090414/isabel_drost-introducing_apache_mahout.pdf]
\- Slides from a talk on April 14, 2009 at the Hadoop User Group UK in London,
Isabel Drost, April 2009.
* [BI Over Petabytes: Meet Apache
Mahout|http://cwiki.apache.org/confluence/download/attachments/88410/SDForum.pdf]
\- Slides from a talk by Jeff Eastman on April 21, 2009 at the Bay Area SD
Forum Business Intelligence SIG meeting at SAP in Palo Alto, CA.
* Lucene Meetup and Apache Barcamp in Amsterdam, March 2009.
* [BarCampRDU|http://barcamp.org/BarCampRDU] \- No guarantee it will be
scheduled, but Grant Ingersoll will be at BarCampRDU (Raleigh) on Aug. 2, 2008
and would like to talk with people interested in Mahout and Hadoop.
* [Introducing Mahout: Apache Machine
Learning|http://www.us.apachecon.com/us2008] \- Committer Grant Ingersoll will
be giving a gentle introduction to Mahout and Machine Learning at ApacheCon in
November (3rd through 7th) in New Orleans, USA. Schedule TBD.
* [Mahout: Scaling Machine Learning|http://www.froscon.org/] \- Introduction to
Mahout and machine learning at FrOSCon in Sankt Augustin/Germany, Isabel Drost,
August 2008.
([Slides|http://cwiki.apache.org/confluence/download/attachments/88410/froscon.pdf])
* [Mahout: Scalable Machine Learning|http://upcoming.yahoo.com/event/807782/]
\- An introduction to Mahout and machine learning at the first German Hadoop
gathering at newthinking store, Berlin, Isabel Drost, July 2008.
* Apache Mahout: Industrial Strength Machine Learning - Committer Jeff Eastman
gave an introduction to Mahout at Yahoo\!, May 2008
* [Apache Lucene - Mach's wie
Google|http://people.apache.org/~berndf/openexpode08-lucene-talk.pdf] \- Bernd
Fondermann presented an overview of the Apache Lucene project, including Mahout
at Open Source Expo 2008 in Karlsruhe, May 2008.
* Apache Mahout: Bringing Machine Learning to Industrial Strength - Committer
Isabel Drost gave a Fast Feather introduction to the new project Mahout at
Apache Con EU, April 2008.