> I want to impose an additional constraint. When 2 clusters are combined and
> the cost of combination is equal for multiple cluster pairs, I want to choose
> the pair for which the combined cluster has the least size.
> What is the cleanest and easiest way of achieving this?

I don't think that the public API enables you to do that. So I think that
you are going to have to modify the code, and modify the cost heapq so that
each entry is a tuple of "(distance, size)". Ties on distance are then
broken by size, because tuples compare element by element.

Unfortunately, when doing this, you'll be on your own, as we cannot provide
support for modified code.

Cheers,

Gaël
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
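
The "(distance, size)" idea can be tried outside the library with Python's standard heapq module. What follows is only a minimal sketch and not the library's own code: the function name merge_order and the entry layout (distance, combined_size, i, j) are hypothetical, chosen for illustration. It relies on the fact that Python compares tuples element by element, so when two distances are equal the smaller combined size is popped first.

```python
import heapq


def merge_order(candidates):
    """Return cluster pairs in the order a min-heap would pop them.

    `candidates` is an iterable of (distance, combined_size, i, j) tuples.
    This layout is an assumption made for the example, not the layout
    used inside any library.
    """
    heap = list(candidates)
    heapq.heapify(heap)  # min-heap ordered by tuple comparison
    order = []
    while heap:
        # Smallest distance first; equal distances fall back to the
        # smaller combined size, then to the cluster indices.
        distance, combined_size, i, j = heapq.heappop(heap)
        order.append((i, j))
    return order


candidates = [
    (1.0, 5, 0, 1),  # distance ties with the next entry, larger merged size
    (1.0, 3, 2, 3),  # same distance, smaller merged size
    (0.5, 8, 4, 5),  # smallest distance, popped first regardless of size
]
print(merge_order(candidates))
```

With these toy values the pair (4, 5) comes out first because its distance is smallest, and (2, 3) comes out before (0, 1) because the two tie on distance and (2, 3) yields the smaller combined cluster.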