On Mon, Jul 7, 2008 at 1:39 AM, Abram Demski <[EMAIL PROTECTED]> wrote:
>
> As I understand it, clustering is different than correlation. The
> difference seems especially important when there are many variables
> involved. Correlations are statistical patterns between observable
> variables only, while clustering introduces the extra (hidden)
> variable "what cluster does this case belong to?".
>
> So correlations do not automatically give clusterings, because
> correlations between a large number of variables might just be a large
> number of pairwise relationships... on the other hand, a clustering
> might be a fair approximation of the correlations in the data, even if
> no hidden variables are actually involved.
>

When you decide whether to include a point in a cluster, you are
using some kind of external signal (an algorithm) to make that call.
You can guide the cluster by the properties of the probability
distribution, trying to capture one local maximum in it while leaving
the other maxima outside. Or you may want to find a cluster "over
there", but in that case you'd need some kind of (probably implicit)
"over there"-detector, which is another feature to include in the system.
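
To make the "capture a local maximum" idea concrete, here is a rough
sketch, not a definitive method. It assumes 1-D data, a Gaussian kernel
density estimate, and mean-shift style hill climbing. The function
names, the bandwidth, and the sample data are all made up for
illustration. The seed argument plays the role of the external signal:
it picks which maximum the cluster is built around.

```python
import math

def mean_shift_step(x, data, bandwidth):
    """One mean-shift update: move x to the kernel-weighted mean of the data."""
    weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in data]
    return sum(w * p for w, p in zip(weights, data)) / sum(weights)

def climb_to_mode(x, data, bandwidth, tol=1e-9, max_iter=1000):
    """Iterate mean-shift updates from x until it settles on a local density maximum."""
    for _ in range(max_iter):
        nxt = mean_shift_step(x, data, bandwidth)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x

def cluster_around(seed, data, bandwidth=0.5, merge_tol=1e-3):
    """Return the points whose hill climb ends at the same maximum as the seed's.

    Points that climb to any other maximum are left outside the cluster.
    """
    target = climb_to_mode(seed, data, bandwidth)
    return [p for p in data
            if abs(climb_to_mode(p, data, bandwidth) - target) < merge_tol]

# Hypothetical sample: two well-separated clumps, so the density has two maxima.
data = [0.9, 1.0, 1.1, 1.2, 4.8, 5.0, 5.1, 5.3]

left = cluster_around(1.0, data)    # seed near the first maximum
right = cluster_around(5.0, data)   # seed near the second maximum
print(left)
print(right)
```

Nothing in the data itself says which maximum to keep. That choice
comes in through the seed (and, less visibly, through the bandwidth),
which is the kind of extra signal described above.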

-- 
Vladimir Nesov
[EMAIL PROTECTED]
http://causalityrelay.wordpress.com/

