If you are looking for fewer than 5 clusters in a one-dimensional
dataset with more than 200k points, it's very likely that the cluster
structure is preserved on much smaller random subsamples, for
instance 1000 points.

You can run the experiment several times with different random number
generator seeds to check that the result is not sensitive to a
specific random subsampling.
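
Here is a minimal sketch of that check. It assumes KMeans as the
clustering algorithm and uses synthetic data with three well-separated
groups as a stand-in for your dataset; the helper name
subsample_centers and all parameter values are only illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans


def subsample_centers(x, n_clusters=3, sample_size=1000, seeds=(0, 1, 2, 3, 4)):
    """Fit KMeans on one random subsample per seed.

    Returns an array of shape (len(seeds), n_clusters) holding the
    sorted cluster centers found on each subsample.
    """
    # scikit-learn expects a 2-D array of shape (n_samples, n_features)
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    results = []
    for seed in seeds:
        rng = np.random.RandomState(seed)
        # Draw the subsample without replacement
        idx = rng.choice(x.shape[0], size=min(sample_size, x.shape[0]),
                         replace=False)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        km.fit(x[idx])
        # Sort the centers so that runs can be compared column by column
        results.append(np.sort(km.cluster_centers_.ravel()))
    return np.array(results)


# Synthetic stand-in data (assumption): 300k points around -10, 0 and 10
data_rng = np.random.RandomState(42)
data = np.concatenate(
    [data_rng.normal(loc, 0.5, 100000) for loc in (-10.0, 0.0, 10.0)]
)

centers = subsample_centers(data)
# Range of each center across seeds; small values mean a stable result
spread = centers.max(axis=0) - centers.min(axis=0)
print(centers)
print(spread)
```

If the spread of each center across seeds is small compared to the
distance between centers, the subsample size is probably large enough.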

Which version of scikit-learn are you using?

-- 
Olivier

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
