That sounds like 3-dimensional thinking. High-dimensional problems abound, and they behave very differently.
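To make that concrete, here is a rough sketch (plain Python, not Mahout code) of one thing that changes as the dimension grows. T1 and T2 are distance thresholds, so what matters is how pairwise distances are spread out, and for uniformly random points that spread shrinks relative to the mean distance as dimensions are added. The function names, the sample size, and the choice of a unit cube with uniform data are all assumptions of mine for illustration; real vectorized data may look quite different.

```python
import math
import random


def distance_spread(dim, n_points=100, seed=0):
    """Return (min, mean, max) pairwise Euclidean distance for
    n_points drawn uniformly from the unit cube in `dim` dimensions.

    Hypothetical helper for illustration only; uniform data is an assumption.
    """
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # All unordered pairs (i < j).
    dists = [math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]
    return min(dists), sum(dists) / len(dists), max(dists)


def relative_contrast(dim, n_points=100, seed=0):
    """(max - min) / mean of the pairwise distances: a crude measure of how
    much room there is to place two distinct thresholds such as T1 > T2."""
    lo, mean, hi = distance_spread(dim, n_points, seed)
    return (hi - lo) / mean


if __name__ == "__main__":
    for dim in (3, 30, 300):
        lo, mean, hi = distance_spread(dim)
        print(f"dim={dim:4d}  min={lo:.3f}  mean={mean:.3f}  max={hi:.3f}  "
              f"contrast={relative_contrast(dim):.3f}")
```

In 3 dimensions the distances range from near zero to well above the mean, so a box-volume argument has something to work with. In a few hundred dimensions nearly every pair sits at about the same distance, which leaves very little room between a T2 that merges nothing and a T1 that merges everything. Looking at the distance distribution of a sample of your own vectors is probably a more useful starting point than dividing up a volume.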

On Wed, Apr 27, 2011 at 1:55 PM, Paul Mahon <[email protected]> wrote:

> No, I mean the area. If all the vectors fit in a AxBxC sized box, and you
> expect about 10 clusters, you could make an initial guess that the clusters
> will be (A/10)xBxC in size and you could try T1=(A/10)*B*C. I've no idea how
> well this would work in practice... probably not very well.
>
> On 04/27/2011 01:50 PM, Camilo Lopez wrote:
>
>> By area of the space you mean just the total number of vectors I'm using?
>>
>> On 2011-04-27, at 4:46 PM, Paul Mahon wrote:
>>
>>> If you have a guess at how many clusters you want you could take the
>>> total area of the space and divide by the number of clusters to get an
>>> initial guess of T2 or T1. That might work to get you started, depending on
>>> the distribution.
>>>
>>> On 04/27/2011 12:39 PM, Camilo Lopez wrote:
>>>
>>>> I'm using Canopy as first step for K-means clustering, is there any
>>>> algorithmic, or even a good heuristic to estimate good T1 and T2 from the
>>>> vectorized data?
