That sounds like 3-dimensional thinking.

High-dimensional problems abound and have very different properties.
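
A minimal Python sketch of one such property, for illustration only: as the dimension grows, pairwise distances between random points bunch together, so a threshold guessed from box volume says little about which points are "close". The helper name `distance_spread` and the uniform random data in the unit cube are assumptions made for this example; nothing here comes from Mahout.

```python
import itertools
import math
import random


def distance_spread(dim, n_points=50, seed=0):
    """Return min/max pairwise Euclidean distance for uniform random points.

    Hypothetical helper for illustration only; points are drawn uniformly
    from the unit cube [0, 1]^dim, which is an assumption about the data.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return min(dists) / max(dists)


if __name__ == "__main__":
    # Compare a 3-dimensional space with progressively higher dimensions.
    for dim in (3, 30, 300):
        print(f"dim={dim:4d}  min/max distance = {distance_spread(dim):.3f}")
```

A ratio near 1 means almost every pair of points is about equally far apart. In that regime, looking at a sample of pairwise distances from the actual vectors may be a more useful starting point for T1 and T2 than the size of the bounding box.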

On Wed, Apr 27, 2011 at 1:55 PM, Paul Mahon <[email protected]> wrote:

> No, I mean the area. If all the vectors fit in an AxBxC-sized box and you
> expect about 10 clusters, you could make an initial guess that the clusters
> will be (A/10)xBxC in size, and you could try T1=(A/10)*B*C. I've no idea how
> well this would work in practice... probably not very well.
>
> On 04/27/2011 01:50 PM, Camilo Lopez wrote:
>
>> By area of the space, do you mean just the total number of vectors I'm using?
>>
>> On 2011-04-27, at 4:46 PM, Paul Mahon wrote:
>>
>>> If you have a guess at how many clusters you want, you could take the
>>> total area of the space and divide by the number of clusters to get an
>>> initial guess of T2 or T1. That might work to get you started, depending on
>>> the distribution.
>>>
>>> On 04/27/2011 12:39 PM, Camilo Lopez wrote:
>>>
>>>> I'm using Canopy as the first step for K-means clustering. Is there any
>>>> algorithm, or even a good heuristic, to estimate good T1 and T2 values from
>>>> the vectorized data?
>>>>
>>>
