Vladimir Nesov,

As I understand it, clustering is different from correlation, and the
difference seems especially important when many variables are
involved. Correlations are statistical patterns between observable
variables only, while clustering introduces an extra (hidden)
variable: "what cluster does this case belong to?".

So correlations do not automatically give clusterings: correlations
among a large number of variables might amount to nothing more than a
large number of pairwise relationships. On the other hand, a
clustering might be a fair approximation of the correlations in the
data even when no hidden variables are actually involved.
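
To make the distinction concrete, here is a minimal pure-Python sketch (toy data and a hand-rolled k-means rather than a library routine; all names are illustrative): the Pearson correlation between the two observed coordinates is a single pairwise statistic, while clustering assigns each case a value of the hidden variable, its cluster label.

```python
import math
import random

random.seed(0)

# Toy data: two hidden clusters in 2-D. The latent variable is
# "which cluster did this case come from?" -- it is never observed.
data = [(random.gauss(0, 0.5), random.gauss(0, 0.5)) for _ in range(50)] + \
       [(random.gauss(5, 0.5), random.gauss(5, 0.5)) for _ in range(50)]

def pearson(xs, ys):
    """Pairwise correlation between two observed variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A correlation relates observed variables only...
r = pearson([p[0] for p in data], [p[1] for p in data])

def kmeans(points, k, iters=20):
    """Crude k-means: each point gets a hidden cluster label."""
    centers = random.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: (p[0] - centers[c][0]) ** 2 +
                                    (p[1] - centers[c][1]) ** 2)
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels

# ...while the clustering posits an unobserved label per case.
labels = kmeans(data, 2)
```

Note that the single number r summarizes the whole pairwise relationship, whereas the clustering introduces one hidden value per case.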


Steve Richfield,

The SPI paper does make that constraint, but it also allows for
multiple clusterings; so within one clustering the clusters are
mutually exclusive, but this does not really restrict things. Perhaps
it would be simpler to drop the constraint, making the multiple
clusterings unnecessary. In fact it would be "simpler" still to drop
all constraints on the hidden entities and perform a general search
for the best hidden structure to explain the data... but, like the
constraint you mentioned, some things that look like restrictions are
not really restrictions. For example, we could require the hidden
entities to form some particular Turing-complete language; examining
the constraints, this would at first look like a very harsh
restriction, but once one realized that the language could express any
computable pattern, it would be no more harsh than the original
restriction to first-order logic.
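
A tiny sketch of what I mean by multiple clusterings (the items and attributes here are hypothetical): each clustering is a mutually exclusive partition, yet several clusterings can coexist over the same items, so the exclusivity constraint does not really restrict what can be expressed.

```python
# Hypothetical items described by two attributes.
items = {
    "a": {"color": "red", "shape": "square"},
    "b": {"color": "red", "shape": "circle"},
    "c": {"color": "blue", "shape": "square"},
    "d": {"color": "blue", "shape": "circle"},
}

def partition(items, attr):
    """One clustering: mutually exclusive clusters keyed by one attribute."""
    clusters = {}
    for name, feats in items.items():
        clusters.setdefault(feats[attr], set()).add(name)
    return clusters

by_color = partition(items, "color")
by_shape = partition(items, "shape")

# Within one clustering the clusters are exclusive...
assert not (by_color["red"] & by_color["blue"])
# ...but across clusterings an item belongs to one cluster per clustering.
assert "a" in by_color["red"] and "a" in by_shape["square"]
```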

Anyway, I do not have a clear picture of your dimensionality concern.
There are ways of clustering in domains where Euclidean distance is
not relevant (particularly binary domains), but I do not understand
what you mean when you say that dimensions have unknown sizes.
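
For instance, here is a minimal sketch of clustering binary cases by Hamming distance instead of Euclidean distance (the data are toy examples, and the prototypes are fixed by hand here, standing in for learned medoids):

```python
def hamming(u, v):
    """Number of positions where two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))

# Toy binary cases: the first three share a pattern, as do the last three.
cases = [
    (1, 1, 1, 0, 0),
    (1, 1, 0, 0, 0),
    (1, 1, 1, 1, 0),
    (0, 0, 0, 1, 1),
    (0, 0, 1, 1, 1),
    (0, 1, 0, 1, 1),
]

# Assign each case to the nearer of two prototypes (hand-picked here,
# where a k-medoids-style algorithm would learn them).
prototypes = [(1, 1, 1, 0, 0), (0, 0, 0, 1, 1)]
labels = [min(range(2), key=lambda k: hamming(c, prototypes[k]))
          for c in cases]
```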


Richard Loosemore,

I wondered if you would join in on this particular conversation. The
truth is that my question assumes the approach to AI that you reject,
loosely speaking... that is, I am asking if clustering is a good local
mechanism for creating the global behavior we want.

I am not exactly tying myself down to one formalism when I say
"clustering", but I am restricting the space. The concept can
generally be applied in "soft" systems, but that is quite broad
(anywhere from neural nets to fuzzy logic); I am especially interested
in formal probabilistic systems, within which the concept of
clustering is especially well defined.


Kingma,

K-means clustering is one of the forms of clustering I am referring
to, but I am also including things like the statistical predicate
invention in the paper I referred to, as well as Hopfield networks,
Sparse Distributed Memory, and other forms of associative memory. I
could rephrase my question: "Can we generalize clustering enough for
it to be useful/essential for AGI, without losing the basic idea?"
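
As a sketch of the associative-memory end of that spectrum, here is a toy Hopfield network (Hebbian weights, bipolar units; the stored patterns are arbitrary examples): a corrupted cue settles to the nearest stored pattern, much as a case falls into its nearest cluster.

```python
# Two arbitrary bipolar patterns to store.
patterns = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
n = len(patterns[0])

# Hebbian weight matrix with zero diagonal.
W = [[0 if i == j else sum(p[i] * p[j] for p in patterns)
      for j in range(n)] for i in range(n)]

def recall(state, sweeps=5):
    """Deterministic sequential updates; settles to a stored attractor."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(W[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state

# A corrupted cue (pattern 0 with one bit flipped) is restored.
cue = [1, 1, 1, -1, -1, -1, -1, -1]
restored = recall(cue)
```

The basins of attraction play the role that cluster membership plays in k-means, which is why I count associative memories as a generalization of clustering.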

On Sun, Jul 6, 2008 at 4:26 PM, Kingma, D.P. <[EMAIL PROTECTED]> wrote:
> On Sun, Jul 6, 2008 at 4:22 AM, Abram Demski <[EMAIL PROTECTED]> wrote:
> ...
>>
>> So the
>> question is: is clustering in general powerful enough for AGI? Is it
>> fundamental to how minds can and should work?
>
> You seem to be referring to k-means clustering, which assumes a special form
> of mixture model, which is a class of generative models. In any mixture
> model, we have to make some strong assumptions like
> 1) the particular types of distributions we're mixing;
> 2) the number of clusters is fixed.
>
> Because we make such assumptions, we can solve the problem very fast by the
> EM algorithm. Some assumptions can be lifted by adding some parameter
> optimization (local search), obviously adding considerable computing time
> and not leading to optimal solutions.
>
> So that answers your question: mixture models place strong assumptions on the
> distribution we're drawing samples from, so they're NOT very powerful in
> isolation.
>
> If you're searching for a generative model that is general and can
> approximate any distribution, you should look for example at Restricted
> Boltzmann Machines (Geoffrey Hinton):
> http://www.scholarpedia.org/article/Boltzmann_machine#Restricted_Boltzmann_machines
>
> Sadly, the best training methods are still computationally expensive, but
> they're becoming practical.
>
> Regards,
> Durk Kingma
>
> ________________________________
> agi | Archives | Modify Your Subscription

