Hi all, I'm running k-means to cluster some text documents. Some documents that seem unrelated to any cluster (i.e. noise) are still being assigned to clusters, and I would like to leave them unclustered.
I thought the clusterClassificationThreshold parameter would do this for me. From the Javadoc:

clusterClassificationThreshold: Is a clustering strictness / outlier removal parameter. Its value should be between 0 and 1. Vectors having pdf below this value will not be clustered.

However, whenever I change this value, no clustered points get written, and there doesn't seem to be any change in the clusters no matter what value I set (I tried 0.00001 and 0.99999). Did I misunderstand what this parameter does, or am I missing something here?
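
To make my expectation concrete, here is a minimal sketch of the behaviour I assumed from the Javadoc. This is not Mahout's code: the class ThresholdSketch, the methods pdf and assign, and the 1 / (1 + distance) formula are all hypothetical stand-ins that I made up for illustration. A vector is assigned to the cluster with the highest pdf, unless that pdf falls below the threshold, in which case it is left unclustered.

```java
public class ThresholdSketch {

    // Hypothetical stand-in for a cluster's pdf: maps the Euclidean
    // distance between a point and a centre to a value in (0, 1].
    public static double pdf(double[] point, double[] center) {
        double sum = 0.0;
        for (int i = 0; i < point.length; i++) {
            double d = point[i] - center[i];
            sum += d * d;
        }
        return 1.0 / (1.0 + Math.sqrt(sum));
    }

    // Returns the index of the cluster with the highest pdf, or -1 when
    // even the best pdf is below the threshold (vector stays unclustered).
    public static int assign(double[] point, double[][] centers, double threshold) {
        int best = -1;
        double bestPdf = -1.0;
        for (int c = 0; c < centers.length; c++) {
            double p = pdf(point, centers[c]);
            if (p > bestPdf) {
                bestPdf = p;
                best = c;
            }
        }
        return bestPdf >= threshold ? best : -1;
    }

    public static void main(String[] args) {
        double[][] centers = {{0.0, 0.0}, {10.0, 10.0}};
        double[] near = {0.5, 0.0};   // close to the first centre
        double[] noise = {5.0, 5.0};  // far from both centres

        System.out.println(assign(near, centers, 0.5));   // prints 0
        System.out.println(assign(noise, centers, 0.5));  // prints -1
    }
}
```

With behaviour like this, I would expect a threshold of 0.99999 to leave almost everything unclustered and 0.00001 to cluster almost everything, which is not what I am seeing.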
