Re: Text categorization / classification

[email protected] Wed, 27 Oct 2010 19:13:52 -0700

Thanks a lot!
I was reading about Mahout today.
I'll try that out.
Thanks again
Maria


On Oct 27, 2010, at 20:59, Lance Norskog <[email protected]> wrote:

> There are tools for this in the Mahout project. These are oriented
> toward large-scale work.
> 
> http://mahout.apache.org
> 
> The learning curve is steep, and you will need to learn some Hadoop.
> 
> The book 'Programming Collective Intelligence' includes a suite of Python
> tools for small-scale experiments.
> 
> On Wed, Oct 27, 2010 at 1:12 PM, Maria Vazquez <[email protected]> wrote:
>> I need to auto-categorize a large number of documents. They are basically 
>> news articles from major news sources (nytimes, npr, abcnews, etc).
>> I'd like to categorize them automatically. Any suggestions?
>> Lucene in Action suggests using a set of documents to build category vectors 
>> and then comparing each document to each of those vectors to find the 
>> closest one.
>> The approach seems pretty simple (from other papers I read on text 
>> categorization) but maybe you guys know of something out there that already 
>> does this using Lucene/Solr.
>> Thanks!
>> Maria
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>> 
>> 
> 
> 
> 
> -- 
> Lance Norskog
> [email protected]
> 
> 
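
For reference, the category-vector approach described in the original question can be sketched in a few lines of plain Python. This is only a rough illustration under stated assumptions, not the Lucene in Action code and not anything from Mahout: it uses raw term counts rather than Lucene term vectors or TF-IDF weights, the function names (tokenize, build_category_vectors, cosine, classify) are hypothetical, and the training snippets are made up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_category_vectors(labeled_docs):
    """Sum the term counts of every training document in each category."""
    vectors = defaultdict(Counter)
    for category, text in labeled_docs:
        vectors[category].update(tokenize(text))
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, category_vectors):
    """Return the category whose vector is most similar to the document."""
    doc = Counter(tokenize(text))
    return max(category_vectors, key=lambda c: cosine(doc, category_vectors[c]))


# Made-up training snippets, for illustration only.
training = [
    ("sports", "the team won the match after a late goal"),
    ("sports", "the coach praised the players after the game"),
    ("politics", "the senate passed the budget bill after a long debate"),
    ("politics", "the president signed the new election law"),
]
category_vectors = build_category_vectors(training)
print(classify("players scored a goal in the final game", category_vectors))
```

On real news articles, raw counts let common words such as "the" dominate the similarity score, so stop-word removal or TF-IDF weighting would likely be needed before a sketch like this gives useful results.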

