Re: kmeans vectors

Ted Dunning Fri, 01 Oct 2010 10:21:27 -0700

No, there isn't.  Your other option is to use kmeans directly and set k (as
you seem to do now).

t1 and t2 can also be quite delicate parameters.
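
To make that sensitivity concrete, here is a minimal sketch of a simplified
canopy pass. It is not Mahout's implementation; the function name `canopies`
and the toy data are assumptions made for illustration. In this simplified
version the number of canopies is driven by t2, while t1 only decides which
points count as members, which is why no closed-form rule maps (t1, t2) to k.

```python
import math
import random


def canopies(points, t1, t2):
    """Simplified canopy pass (illustrative, not Mahout's code).

    t1 is the loose threshold, t2 the tight one, with 0 <= t2 < t1.
    Returns a list of (center, members) pairs.
    """
    if not 0 <= t2 < t1:
        raise ValueError("expected 0 <= t2 < t1")
    remaining = list(points)
    result = []
    while remaining:
        center = remaining[0]
        # Every point within t1 of the center is a member of this canopy.
        members = [p for p in points if math.dist(center, p) <= t1]
        result.append((center, members))
        # Points within t2 of the center may not start a canopy of their own.
        remaining = [p for p in remaining if math.dist(center, p) > t2]
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy data: three Gaussian blobs in the plane.
    blobs = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
    data = [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in blobs for _ in range(100)]
    for t2 in (0.5, 1.0, 2.0, 4.0, 8.0):
        print("t2 =", t2, "canopies =", len(canopies(data, 2 * t2, t2)))
```

Running the loop at the bottom on your own data shows how quickly the canopy
count moves as t2 changes.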

My own tendency is to try to use a good initialization scheme such as
kmeans++ (which we don't yet have) and just specify the number of
clusters.  If none of the clusters are small, I increase k.  If some
have just a very few points, I decrease k.  Then I look for temporal
stability of cluster size.  At that point, the clusters are the clusters
and I rarely change them.
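
The procedure above can be sketched roughly as follows. This is plain Python
rather than Mahout code, and the names `kmeans_pp_init`, `kmeans`, and
`suggest_k`, together with the `tiny` and `small` size thresholds, are
hypothetical choices for illustration; sensible thresholds depend on the data.

```python
import math
import random


def kmeans_pp_init(points, k, rng):
    """kmeans++ seeding: pick each new center with probability proportional
    to its squared distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform choice.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Lloyd iterations starting from kmeans++ seeds."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assign(points, centers)


def cluster_sizes(labels, k):
    sizes = [0] * k
    for lab in labels:
        sizes[lab] += 1
    return sizes


def suggest_k(sizes, tiny=5, small=20):
    """Heuristic from the text; `tiny` and `small` are assumed thresholds."""
    k = len(sizes)
    if min(sizes) <= tiny:
        return k - 1   # some cluster has very few points: decrease k
    if min(sizes) > small:
        return k + 1   # no cluster is small: increase k
    return k
```

A typical loop would call `kmeans`, pass `cluster_sizes(labels, k)` to
`suggest_k`, and repeat until k stops changing, then rerun on later data to
check that the cluster sizes stay stable.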

My justification for this is that clustering for me is just a way to do
recoding of input variables along the lines of a volume quantization
(coding each point with just the cluster) or near diagonalization
(coding each point with distance to all clusters).
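
As a small illustration of the two recodings, assuming the cluster centers
have already been computed (the function names here are hypothetical):

```python
import math


def encode_nearest(point, centers):
    """Quantization-style code: the index of the nearest cluster center."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def encode_distances(point, centers):
    """Distance-style code: one feature per cluster, the distance to its center."""
    return [math.dist(point, c) for c in centers]


if __name__ == "__main__":
    centers = [(0.0, 0.0), (10.0, 0.0)]   # assumed, e.g. from a kmeans run
    point = (3.0, 4.0)
    print(encode_nearest(point, centers))
    print(encode_distances(point, centers))
```

The first form replaces each point with a single categorical value; the
second turns it into a k-dimensional numeric vector that a downstream model
can use directly.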

On Fri, Oct 1, 2010 at 7:22 AM, Matt Tanquary <[email protected]> wrote:

> I played around with the t1 and t2 until I got a k that I expected
> with my small set, but if I want to ensure say 3 clusters on a large
> set of data, then how do I use t1 and t2 to set k? Is there a formula
> for that?
>
> On Thu, Sep 30, 2010 at 8:24 PM, Lahiru Samarakoon <[email protected]>
> wrote:
> > Hi Matt,
> >
> > As Jeff has mentioned earlier, you have to choose t1 and t2 to get the k
> > when you are using the *syntheticcontrol.kmeans.Job* program. So what you
> > have experienced is correct.
> >
> > Thanks,
> > Lahiru
> >
>
>
>
> --
> Have you thanked a teacher today? ---> http://www.liftateacher.org
>
