I don't think that data is sufficiently clusterable to expect a unique solution.
Mean squared error would be a better measure of quality.

On Mon, Jan 5, 2015 at 10:07 PM, Lee S <sle...@gmail.com> wrote:

> Data in this link:
> http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
> I converted it to a sequence file with InputDriver.
>
> 2015-01-06 14:04 GMT+08:00 Ted Dunning <ted.dunn...@gmail.com>:
>
> > What kind of synthetic data did you use?
> >
> > On Mon, Jan 5, 2015 at 8:29 PM, Lee S <sle...@gmail.com> wrote:
> >
> > > Hi, I used the synthetic data to test the kmeans method.
> > > I wrote my own code to convert the center points to sequence files.
> > > Then I ran kmeans with the parameters
> > > ( -i input -o output -c center -x 3 -cd 1 -cl ).
> > > I compared the dumped clusteredPoints with the scikit-learn kmeans
> > > result, and it's totally different. I'm very confused.
> > >
> > > Has anybody ever run kmeans with center points provided and compared
> > > the result with another ML library?
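
To illustrate the mean squared error comparison suggested at the top of this reply, here is a minimal sketch in plain Python. It is not Mahout's or scikit-learn's implementation; the helper names (sq_dist, assign, lloyd, mse), the toy points, and the max_iter value are all made up for the example. It runs a basic Lloyd-style k-means from two different sets of starting centers and shows that the cluster labels can disagree completely while the mean squared error is identical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


def lloyd(points, centers, max_iter=10):
    """Basic k-means iterations starting from the given centers (illustrative only)."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New center is the coordinate-wise mean of the members.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def mse(points, centers, labels):
    """Mean squared distance from each point to its assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)


# Toy data (hypothetical): two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

# Two different sets of starting centers, listed in opposite order.
centers_a, labels_a = lloyd(points, [(0, 0), (10, 10)])
centers_b, labels_b = lloyd(points, [(11, 11), (1, 1)])

mse_a = mse(points, centers_a, labels_a)
mse_b = mse(points, centers_b, labels_b)

print("labels A:", labels_a, "MSE:", mse_a)
print("labels B:", labels_b, "MSE:", mse_b)
```

On this toy data the two runs number the clusters in opposite order, so a point-by-point comparison of the assignments looks totally different even though both runs found the same partition with the same error. On data that does not cluster cleanly, the partitions themselves can also differ between runs, which is why comparing the error is more informative than comparing the assignments.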