Re: tf(idf)-ing the cluster output

Jeff Eastman Tue, 07 Feb 2012 08:32:48 -0800

Sure, I'd love to hear more about your use case and pipeline. Can you describe the steps you are performing and how the results get used?


Jeff


On 2/7/12 9:28 AM, Viktor Gal wrote:

Hi,

I'm using Mahout for computer vision, so my pipeline is a bit different
from the text processing pipeline: after I've acquired the feature vectors,
I cluster them, and once I have the cluster centers and have assigned the
original feature vectors to them, I calculate the TF(IDF) vectors. This is
quite a standard thing nowadays in computer vision...
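
For readers unfamiliar with the approach, the steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not Mahout code and not the patch mentioned in this message. The function names `tf_vectors` and `tfidf_vectors`, the input layout (one list of cluster ids per image), and the plain log(N / df) IDF weighting are all assumptions made for the example; the weighting Mahout actually applies may differ.

```python
import math
from collections import Counter


def tf_vectors(assignments, num_clusters):
    """Build one term-frequency vector per image.

    assignments: one list per image, holding the cluster id that each of
    the image's feature vectors was assigned to (hypothetical layout).
    """
    vectors = []
    for cluster_ids in assignments:
        counts = Counter(cluster_ids)
        # Each cluster center acts as one "word" of the vocabulary.
        vectors.append([counts.get(c, 0) for c in range(num_clusters)])
    return vectors


def tfidf_vectors(tf, num_clusters):
    """Reweight TF vectors with a plain log(N / df) IDF (an assumption)."""
    num_docs = len(tf)
    # Document frequency: number of images in which each cluster occurs.
    df = [sum(1 for v in tf if v[c] > 0) for c in range(num_clusters)]
    idf = [math.log(num_docs / df[c]) if df[c] > 0 else 0.0
           for c in range(num_clusters)]
    return [[v[c] * idf[c] for c in range(num_clusters)] for v in tf]
```

Calling `tf_vectors([[0, 0, 1], [1, 2]], 3)` on two toy images gives one count vector per image; passing that result to `tfidf_vectors` zeroes out cluster 1, which occurs in both images and so carries no discriminative weight under this IDF.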

So I've implemented the part that creates TF(IDF) vectors from the cluster
output, based on the DocumentVectorizer class. If anybody thinks it would be
good to have this tool in Mahout, let me know and I'll create a JIRA issue for
it and upload my patches there.

cheers,
viktor
