If you are looking for fewer than 5 clusters in a one-dimensional dataset with more than 200k points, it is very likely that the cluster structure is preserved in much smaller random subsamples, for instance 1000 points.
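
Here is a minimal sketch of what I mean, not a definitive recipe. It assumes KMeans as the clustering algorithm (you may be using something else) and uses synthetic data with three well-separated groups in place of your real dataset. The helper name `subsample_centers` and its default parameters are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans


def subsample_centers(data, n_clusters=3, sample_size=1000, seed=0):
    """Fit KMeans on a random subsample of 1-D data and return the sorted centers.

    Hypothetical helper for illustration; KMeans is an assumed choice of algorithm.
    """
    rng = np.random.RandomState(seed)
    # Draw sample_size distinct points from the full dataset.
    idx = rng.choice(len(data), size=sample_size, replace=False)
    # scikit-learn estimators expect a 2-D array of shape (n_samples, n_features).
    sample = data[idx].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(sample)
    return np.sort(km.cluster_centers_.ravel())


# Synthetic stand-in for the real data: 300k points around 0, 10 and 20.
data_rng = np.random.RandomState(42)
data = np.concatenate(
    [data_rng.normal(loc, 0.5, 100_000) for loc in (0.0, 10.0, 20.0)]
)

# One row of sorted cluster centers per subsampling seed.
centers_by_seed = np.array([subsample_centers(data, seed=s) for s in range(5)])
print(centers_by_seed)
```

If the rows of `centers_by_seed` are close to one another, the subsample is probably large enough; if they vary a lot, try a larger `sample_size`.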
You can run the experiment several times with different random number generator seeds to check that the result is not sensitive to a specific random subsampling.

Which version of scikit-learn are you using?

-- 
Olivier

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general