Power law size scaling.

On Sun, Jul 8, 2012 at 11:39 PM, Ted Dunning <[email protected]> wrote:
> What do you mean by self similarity?  Power law size scaling?  Or that two 
> successive clusterings get nearly the same answer?
>
> Sent from my iPhone
>
> On Jul 8, 2012, at 8:40 PM, Lance Norskog <[email protected]> wrote:
>
>> Are there any measures of self-similarity?
>>
>> On Sun, Jul 8, 2012 at 6:07 PM, Ted Dunning <[email protected]> wrote:
>>
>>> I can't comment on the existing evaluators, but for me the only real
>>> measure that I care about is average distance to nearest cluster for new or
>>> held-out data.  I will be building something of this sort for the
>>> clustering part of the knn code I have been working on.
>>>
>>>
>>> On Sun, Jul 8, 2012 at 5:44 PM, Pat Ferrel <[email protected]> wrote:
>>>
>>>> To use something like kmeans on any large and changing data set it seems
>>>> a requirement that there be some means of evaluating the quality of
>>>> clusters at different scales. The usual eyeballing breaks down quickly.
>>>>
>>>> Trying to use the cluster evaluators in Mahout with kmeans as the
>>>> clustering method and cosine as the distance measure has proven
>>>> problematic. The method is to iterate through the data using different ks
>>>> and perform the evaluation at each point. What I find is that certain
>>>> values are almost always in error. The Intra-cluster density from
>>>> ClusterEvaluator is almost always NaN. The CDbw inter-cluster density is
>>>> almost always 0. I have also seen several cases where CDbw fails to return
>>>> any results but have not tracked down why yet.
>>>>
>>>> Given that the data for either evaluator is usually incomplete, these
>>>> methods are not very useful. Is Mahout dropping the evaluators? Is the
>>>> general wisdom that they are not particularly useful? Should a newer method
>>>> be pursued? This seems a fairly important question to me. Am I missing
>>>> something?
>>>>
>>>> Raw data for a sample crawl is given below:
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Lance Norskog
>> [email protected]



-- 
Lance Norskog
[email protected]
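
For reference, the held-out measure Ted describes in the quoted thread
(average distance from each new point to its nearest cluster) can be
sketched roughly as below. This is only an illustration in plain Python,
not Mahout code and not Ted's knn implementation. The function names and
toy data are made up, and returning 1.0 for a zero-length vector is an
assumption of this sketch, chosen so that the result is never NaN.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption of this sketch: a zero vector is treated as
        # maximally distant instead of dividing by zero.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def mean_nearest_centroid_distance(points, centroids, distance=cosine_distance):
    """Average, over all points, of the distance to the closest centroid."""
    if not points or not centroids:
        raise ValueError("need at least one point and one centroid")
    total = sum(min(distance(p, c) for c in centroids) for p in points)
    return total / len(points)


# Toy example: two centroids and three held-out points (hypothetical data).
centroids = [[1.0, 0.0], [0.0, 1.0]]
held_out = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
score = mean_nearest_centroid_distance(held_out, centroids)
```

To compare different values of k as Pat describes, one would cluster the
training data once per k, compute this score on the same held-out set each
time, and look at how the score changes as k grows.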
