Abram (and Vladimir)

Thanks for the link to the article. I am also interested in clustering for
my own efforts. Here is my problem, and I suspect yours as well:

Much real-world clustering seems to take place in n-dimensional space, where
n is the number of variables among which you are looking for clusters. When
the variables lie along different dimensions whose scales are unknown,
computing the distance between "points" by the usual square root of the sum
of the squares of the differences doesn't make much sense at all. Instead,
it seems that a "cluster" is simply a set of nearly-identical "binary" (with
probabilities) variables.
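
To make the scale problem concrete, here is a minimal Python sketch. The
data (heights and weights of three individuals) and the helper names
euclidean, nearest, and rescale are made up for illustration and have
nothing to do with the SPI paper. It only shows that the nearest neighbour
of a point under Euclidean distance can change when one variable is
expressed in different units.

```python
import math

def euclidean(a, b):
    """Square root of the sum of the squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: euclidean(points[i], points[j]))

def rescale(points, factors):
    """Multiply each coordinate by a per-dimension factor (a unit change)."""
    return [tuple(x * f for x, f in zip(p, factors)) for p in points]

# Hypothetical data: (height in metres, weight in kilograms).
points_m_kg = [(1.50, 70.0), (1.95, 71.0), (1.52, 74.0)]

# The same individuals with height expressed in millimetres instead.
points_mm_kg = rescale(points_m_kg, (1000.0, 1.0))

# With metres, weight dominates the distance; with millimetres, height does.
print(nearest(points_m_kg, 0))   # neighbour of point 0, height in metres
print(nearest(points_mm_kg, 0))  # neighbour of point 0, height in millimetres
```

Nothing about the individuals changed between the two calls, yet the
"closest" one did, which is why I doubt any raw distance between variables
of unknown scale.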

Some of the SPI paper seems rather arbitrary and unexplained, e.g. that a
point can only occur in one cluster (3rd page, 2nd column, ~2/3 of the way
down). This seems to presume that the method is only finding the LAST
REMAINING relationship, and ignores the possibility of noisy data...

The SPI paper is somewhat opaque in its terminology and heavy with external
references, so it is rather hard to determine exactly how the issue of a
high-dimensional space whose dimensions have unknown scales would fit into
its discussion.

Perhaps I have simply missed the point and/or am looking at this thing all
wrong. Can anyone here help?

Steve Richfield


