Suppose you have a two-class problem in which class 0 is much larger than class 1.
Is it possible that the centroid initially chosen for class 0 overlaps the one chosen for class 1, so that the false negative rate ends up very high? I have found situations where this happens, and the explanation above is the only one I could think of. I don't think a max_iter that is too small would cause it. In fact, if class 0 is much larger than class 1, both centroids should remain inside class 0, and the false positive rate should stay small regardless. If that is the case, why doesn't the underlying k-means implementation take this into account?

Thanks,
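P.S. To make the scenario concrete, here is a minimal sketch of what I mean. It is a plain-Python Lloyd iteration on made-up 1-D data, not scikit-learn's actual implementation; the helper name `lloyd_1d`, the class sizes (1000 vs. 20), and the value ranges are all assumptions chosen for illustration. The centroids are deliberately started at the true class means, one per class, to check whether they stay there.

```python
def lloyd_1d(points, centroids, max_iter=100):
    """Plain Lloyd iteration for 1-D data (illustrative helper, not sklearn)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(x - centroids[k]))
            for x in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == k]
            new_centroids.append(sum(members) / len(members) if members else centroids[k])
        if new_centroids == centroids:  # assignments no longer change
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical imbalanced data: class 0 spread evenly over [-2, 2],
# class 1 a small tight group near 3.
class0 = [-2 + 4 * i / 999 for i in range(1000)]
class1 = [2.9 + 0.01 * j for j in range(20)]
points = class0 + class1

# Start from the true class means, i.e. one centroid per class.
init = [sum(class0) / len(class0), sum(class1) / len(class1)]
centroids, labels = lloyd_1d(points, init)

# Which cluster did the minority class end up in, and how many
# class 0 points were pulled into that same cluster?
minority_cluster = labels[-1]
absorbed = sum(1 for lab in labels[:1000] if lab == minority_cluster)

print("final centroids:", centroids)
print("class 0 points sharing a cluster with class 1:", absorbed)
```

On this toy data both centroids end up inside the range of class 0 even though the start was one per class, and the minority class shares its cluster with a large part of class 0. So the effect I am describing may come from the k-means objective itself rather than only from the initial centroids or from max_iter.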
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general