I don't think that data is sufficiently clusterable to expect a unique
solution.

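Mean squared error (the average squared distance from each point to its
nearest center) would be a better measure of quality than comparing the
cluster assignments directly.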
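
A minimal sketch of that comparison in plain Python, assuming each library's final centers have been exported as lists of coordinates. The function names and the toy points below are made up for illustration and are not part of Mahout or scikit-learn.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_squared_error(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Average squared distance from each point to its nearest center."""
    if not points or not centers:
        raise ValueError("points and centers must both be non-empty")
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


# Toy data (hypothetical): two tight groups far apart on the x axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Two candidate sets of centers, standing in for the output of two libraries.
centers_a = [(0.0, 1.0), (10.0, 1.0)]
centers_b = [(5.0, 0.0), (5.0, 2.0)]

mse_a = mean_squared_error(points, centers_a)
mse_b = mean_squared_error(points, centers_b)
print(mse_a, mse_b)
```

Two runs can assign points to differently numbered clusters and still be equally good. Comparing the two error values sidesteps that labeling issue, and the lower value indicates the tighter clustering.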



On Mon, Jan 5, 2015 at 10:07 PM, Lee S <sle...@gmail.com> wrote:

> The data is at this link:
>
> http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
>
> I converted it to a SequenceFile with InputDriver.
>
> 2015-01-06 14:04 GMT+08:00 Ted Dunning <ted.dunn...@gmail.com>:
>
> > What kind of synthetic data did you use?
> >
> >
> >
> > On Mon, Jan 5, 2015 at 8:29 PM, Lee S <sle...@gmail.com> wrote:
> >
> > > Hi, I used the synthetic data to test the kmeans method.
> > > I wrote my own code to convert the center points to SequenceFiles.
> > > Then I ran kmeans with the parameters (-i input -o output -c center
> > > -x 3 -cd 1 -cl).
> > > I compared the dumped clusteredPoints with the scikit-learn kmeans
> > > result, and they are totally different. I'm very confused.
> > >
> > > Has anybody ever run kmeans with center points provided and compared
> > > the result with another ML library?
> > >
> >
>
