No, there isn't. Your other option is to use kmeans directly and set k (as you seem to do now).
t1 and t2 can also be quite delicate parameters. My own tendency is to try to use a good initialization scheme such as kmeans++ (which we don't yet have) and just specify the number of clusters.

If none of the clusters are small, I increase k. If some have just a very few points, I decrease k. Then I look for temporal stability of cluster size. At that point, the clusters are the clusters and I rarely change them.

My justification for this is that clustering for me is just a way to do recoding of input variables along the lines of a volume quantization (coding each point with just the cluster) or near diagonalization (coding each point with distance to all clusters).

On Fri, Oct 1, 2010 at 7:22 AM, Matt Tanquary <[email protected]> wrote:

> I played around with the t1 and t2 until I got a k that I expected
> with my small set, but if I want to ensure say 3 clusters on a large
> set of data, then how to I use t1 and t2 to set k? Is there a formula
> for that?
>
> On Thu, Sep 30, 2010 at 8:24 PM, Lahiru Samarakoon <[email protected]> wrote:
> > Hi Matt,
> >
> > As Jeff has mentioned earlier, you have to choose t1 and t2 to get the k
> > when you are using *syntheticcontrol.kmeans.Job* program. So what you have
> > experienced is correct.
> >
> > Thanks,
> > Lahiru
>
> --
> Have you thanked a teacher today? ---> http://www.liftateacher.org
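To make the workflow described above a bit more concrete, here is a rough sketch in plain Python (standard library only). It is not Mahout code and does not use any Mahout API. All function names (kmeans_pp_init, kmeans, cluster_sizes, encode_as_cluster, encode_as_distances) are made up for illustration, and it assumes points are tuples of floats with at least k distinct values. It seeds with a kmeans++ style scheme, runs a fixed number of iterations with k given directly, reports cluster sizes so you can judge whether to raise or lower k, and shows the two recodings (cluster id only, or distance to every cluster).

```python
import math
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """kmeans++ style seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest center so far.
    Assumes the data contains at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd iterations with k specified directly (no t1/t2)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its points; keep the old
        # center if a cluster happens to be empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def cluster_sizes(points, centers):
    """Number of points assigned to each center, in center order."""
    counts = Counter(nearest(p, centers) for p in points)
    return [counts.get(i, 0) for i in range(len(centers))]


def encode_as_cluster(point, centers):
    """Quantization-style recoding: the point becomes its cluster id."""
    return nearest(point, centers)


def encode_as_distances(point, centers):
    """Recoding as the vector of distances to every cluster center."""
    return [math.sqrt(sq_dist(point, c)) for c in centers]


if __name__ == "__main__":
    # Synthetic data: three noisy blobs (purely illustrative).
    rng = random.Random(42)
    blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    data = [
        (cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
        for cx, cy in blobs
        for _ in range(50)
    ]
    centers = kmeans(data, k=3)
    print("cluster sizes:", cluster_sizes(data, centers))
    print("cluster id of first point:", encode_as_cluster(data[0], centers))
    print("distances of first point:", encode_as_distances(data[0], centers))
```

The sizes printed by cluster_sizes are what I would look at when deciding whether to change k, and rerunning on data from different time windows gives a crude check on whether the sizes are stable.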
