Re: [math] Re: commons math

John Gant Mon, 15 Aug 2005 19:15:07 -0700

IP stuff:
I will send out a link to the PDF that describes KMotif. The
cross-correlation comes from
http://mathworld.wolfram.com/CorrelationCoefficient.html, with an
implementation that correlates column-wise. Both the Euclidean and
city-block distance measures come from basic data mining textbooks (my
textbook is Data Mining by Mehmed Kantardzic) or
http://www.statsoft.com/textbook/stcluan.html. Please let me know if
this is sufficient, or if I need more references.


Distance measures are basically a numeric way of characterizing the
relationship between two numerical or categorical datasets. They are
usually used in conjunction with k-means, hierarchical clustering, or
some other type of clustering algorithm.
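To make the two measures mentioned above concrete, here is a minimal
sketch of Euclidean and city-block distance over plain double arrays.
The class name Distances and its method names are assumptions made for
illustration only; they are not an existing Commons Math API.

```java
// Hypothetical helper class for illustration; not part of Commons Math.
final class Distances {
    private Distances() {
        // static helpers only
    }

    /** Euclidean distance: square root of the sum of squared differences. */
    static double euclidean(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** City-block (Manhattan) distance: sum of absolute differences. */
    static double cityBlock(double[] a, double[] b) {
        checkSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    // Both measures are only defined for vectors of equal dimension.
    private static void checkSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }
}
```

For the points (0, 0) and (3, 4), euclidean gives 5.0 and cityBlock
gives 7.0.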

I think the architecture question applies to both k-means and the
difference/similarity algorithms. I am not sure of the best
architecture for these algorithms. Should each distance/similarity
measure be its own class, so that it can be passed into an engine
that is the clustering algorithm? For instance, have a k-means class
that has a private variable of type ClusteringMeasurementAlgorithm,
where:

EuclideanDistance, which implements
DistanceMeasure, which implements
ClusteringMeasurementAlgorithm

Does this sound somewhat logical? If we had an engine that took an
instance of ClusteringMeasurementAlgorithm as a constructor parameter,
it could handle all operations on the data using the specific
measurement algorithm. The reason I am trying to make the abstraction
more general than a difference measure is that clustering may be done
on both similarity and difference measures.
Please tell me if this sounds outrageous, because I do not have a lot
of architecture experience.
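A rough sketch of the proposed layering, assuming Java 8 or later. In
Java one interface extends another rather than implementing it, so the
middle layer is written that way here. The names measure, isCloser,
KMeansEngine and nearestCentroid are assumptions made up for this
example; only ClusteringMeasurementAlgorithm, DistanceMeasure and
EuclideanDistance come from the proposal. The engine shows just the
nearest-centroid assignment step, not a full k-means.

```java
// Top-level abstraction: any score usable for clustering, whether a
// distance (smaller is closer) or a similarity (larger is closer).
interface ClusteringMeasurementAlgorithm {
    double measure(double[] a, double[] b);

    /** Returns true if score x indicates a closer match than score y. */
    boolean isCloser(double x, double y);
}

// Distances treat smaller scores as closer. A similarity interface
// (hypothetical, not shown) would reverse this comparison.
interface DistanceMeasure extends ClusteringMeasurementAlgorithm {
    @Override
    default boolean isCloser(double x, double y) {
        return x < y;
    }
}

final class EuclideanDistance implements DistanceMeasure {
    @Override
    public double measure(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}

// The engine receives the measure through its constructor and never
// needs to know which concrete measure it is using.
final class KMeansEngine {
    private final ClusteringMeasurementAlgorithm measurement;

    KMeansEngine(ClusteringMeasurementAlgorithm measurement) {
        this.measurement = measurement;
    }

    /** Assignment step only: index of the centroid closest to the point. */
    int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestScore = measurement.measure(point, centroids[0]);
        for (int i = 1; i < centroids.length; i++) {
            double score = measurement.measure(point, centroids[i]);
            if (measurement.isCloser(score, bestScore)) {
                best = i;
                bestScore = score;
            }
        }
        return best;
    }
}
```

With this shape, swapping in a city-block or correlation-based measure
would mean passing a different object to the KMeansEngine constructor,
with no change to the engine itself.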

Thanks,
John
