That's a really good question. Mahout does not have an "explain" feature. However, you can use the ClusterDumper to print the cluster centers and the vectors assigned to each cluster. The output is pretty verbose and, because large text vectors are truncated, might not be that useful, so you might need to write something yourself to do this. Look at the cluster evaluator tests for some hints.
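
As a rough starting point for "writing something", here is a minimal sketch in plain Java that lists which terms two documents share and how much each one contributes to their cosine similarity. It does not use any Mahout API: ClusterExplain and its methods are made-up names for illustration. It also assumes raw term counts, naive tokenization, and cosine similarity, which may not match the weighting (for example TF-IDF) or the distance measure your clustering job actually used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: shows which shared terms
// pull two documents together under cosine similarity.
class ClusterExplain {

    // Lower-case the text, split on anything that is not a-z or 0-9,
    // and count raw term frequencies (an assumption; no TF-IDF here).
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> tf) {
        double sumOfSquares = 0.0;
        for (int count : tf.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // For every term present in both texts, return tfA * tfB / (|A| * |B|).
    // These per-term values add up to the cosine similarity of the two texts.
    static Map<String, Double> contributions(String a, String b) {
        Map<String, Integer> tfA = termFrequencies(a);
        Map<String, Integer> tfB = termFrequencies(b);
        Map<String, Double> result = new TreeMap<>();
        double denominator = norm(tfA) * norm(tfB);
        if (denominator == 0.0) {
            return result; // at least one text has no tokens
        }
        for (Map.Entry<String, Integer> entry : tfA.entrySet()) {
            Integer countInB = tfB.get(entry.getKey());
            if (countInB != null) {
                result.put(entry.getKey(), entry.getValue() * countInB / denominator);
            }
        }
        return result;
    }

    // Cosine similarity, computed as the sum of the per-term contributions.
    static double cosine(String a, String b) {
        double sum = 0.0;
        for (double value : contributions(a, b).values()) {
            sum += value;
        }
        return sum;
    }
}
```

Calling ClusterExplain.contributions(text1, text2) on the full text of two documents from the same cluster gives a map from each shared term to its share of the similarity. If a handful of very common words dominate that map, stop-word handling or term weighting would be the first thing to check.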

Which algorithm were you using?

On 2/4/13 1:57 PM, Chris Harrington wrote:
I was wondering if there was an explain feature in Mahout, something that gives 
the reason why it did what it did, shows the values of the various features it 
used to evaluate and choose the result, etc.

I ask because I have some wildly different text data being clustered together. 
For example, it clustered these 2 together, and I'd like to be able to figure 
out why:

Text 1: "Iron Butterfly Bassist Lee Dorman Dies at 70"

Text 2: "The BEST Memes Of 2012 2012 was a landmark year for memes -- and we could 
say that due to the Ikea Monkey alone -- but it's not always easy…"

