Proposal for a FastFeather talk on the new Lucene subproject Mahout.

=Title=

Mahout - Bringing Machine Learning to Industrial Strength


=Short description=

In recent years the amount of information available in digital form has
increased tremendously. Most of this information is unstructured, such as
plain or HTML-formatted text.

There is a growing need for tools that can extract information from these
texts, rank them according to user queries, and separate them into topics.
Many valuable algorithms for extracting information from text have been
published.

The Mahout project, which has been a Lucene subproject since January 2008,
aims to provide a commercially friendly, stable, and scalable suite of machine
learning tools. The framework will be designed for high throughput and will be
capable of handling massive datasets. We focus on scalability and intend to
provide parallelised machine learning algorithm implementations based on the
Hadoop framework.

This talk will present the very young Mahout project. It will give a brief
overview of the project's history, the people behind Mahout, and our initial
as well as long-term goals.

=Bio of speaker=

After studying computer science, Isabel Drost worked as a research assistant
at the "Humboldt-Universität zu Berlin". While there, Isabel worked on
link-based clustering of documents and the identification of search engine
spam. In 2005/6 she interned at Google for six months. Currently Isabel is
employed at neofonie GmbH, a company building specialised search engines.


-- 
The superfluous is very necessary.              -- Voltaire
  |\      _,,,---,,_       Web:   <http://www.isabel-drost.de>
  /,`.-'`'    -.  ;-;;,_
 |,4-  ) )-,_..;\ (  `'-'
'---''(_/--'  `-'\_) (fL)  IM:  <xmpp://[EMAIL PROTECTED]>
