Have you looked at 'Taming Text' (by Grant S. Ingersoll, Thomas S. Morton, 
and Andrew L. Farris)?  Some sections in it might be relevant to your 
issue.


R

________________________________
 From: Neil Chaudhuri <[email protected]>
To: "[email protected]" <[email protected]> 
Sent: Friday, 2 December 2011, 3:08
Subject: Word and Phrase Clustering
 
I have a need to cluster a collection of words and phrases by syntactic 
similarity over a distributed environment, and I came upon Mahout as a possible 
solution. After studying the documentation, though, I find all of it 
tailored to working with entire documents rather than words and phrases. I 
simply want to know whether you believe Mahout is the right tool for this job. 
I suppose I could try to treat each word or phrase as an individual tiny 
document, but that feels like I am forcing it.

Any insight is appreciated.

Thanks.
