Proposal for a FastFeather Talk for the new Lucene subproject Mahout.

=Title=

Mahout - Bringing Machine Learning to Industrial Strength

=Short description=

In recent years the amount of information available in digital form has increased tremendously. Most of it is available in unstructured form, such as plain or HTML-formatted text. There is a growing need for tools that can extract information from these texts, rank them according to user queries, and separate them into topics. Many valuable algorithms for extracting information from text have been published.

The Mahout project, a Lucene subproject since January 2008, aims to provide a commercially friendly, stable, and scalable suite of machine learning tools. The framework will be designed for high throughput and be capable of handling massive datasets. We focus on scalability and intend to provide parallelised machine learning algorithm implementations based on the Hadoop framework.

This talk will present the very young Mahout project. The presentation will give a brief overview of the project's history, the people behind Mahout, and our initial as well as long-term goals.

=Bio of speaker=

After studying computer science, Isabel Drost worked as a research assistant at the "Humboldt-Universität zu Berlin". While there, Isabel worked on link-based clustering of documents and identification of search engine spam. In 2005/6 she interned at Google for six months. Currently Isabel is employed at neofonie GmbH, a company building specialised search engines.

--
The superfluous is very necessary. -- Voltaire
Web: <http://www.isabel-drost.de>
IM: <xmpp://[EMAIL PROTECTED]>
