Do you mean: given two numbers x and y drawn from a specific sample S
of numbers (or from a specific probability distribution D over the set
of numbers)?

Without this background S or D, the question is meaningless...

Given a distribution D, one can of course draw a sample S from it, so it
is sufficient to deal with the case where one has a sample S.

One sensible measure would be: what fraction of the numbers in the sample
S are greater than max(x,y) or less than min(x,y), as opposed to lying
between x and y?

This gives you 1 if x and y are identical or have no members of S between
them; and 0 if x and y are the opposite endpoints of the sample.
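
A minimal sketch of this measure in Python, under some assumptions of
mine: S is a plain list of numbers, "between" means strictly between
min(x,y) and max(x,y), and members of S equal to x or y are counted on
neither side (that is what makes both endpoint cases above come out as
stated). The name rank_similarity is made up for illustration and is not
an OpenCog function.

```python
def rank_similarity(x, y, sample):
    """Share of sample members outside [min(x, y), max(x, y)],
    relative to those outside plus those strictly between x and y."""
    lo, hi = min(x, y), max(x, y)
    # Members strictly below the smaller number or strictly above the larger one.
    outside = sum(1 for s in sample if s < lo or s > hi)
    # Members strictly between the two numbers.
    between = sum(1 for s in sample if lo < s < hi)
    if outside + between == 0:
        # Assumption: with nothing on either side, treat x and y as maximally similar.
        return 1.0
    return outside / (outside + between)


S = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(rank_similarity(4, 4, S))   # identical numbers
print(rank_similarity(4, 5, S))   # no members of S between them
print(rank_similarity(1, 10, S))  # opposite endpoints of the sample
```

Ignoring ties is only one possible convention; counting them as
"between" or "outside" would shift the endpoint values slightly.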

If you want a scaling between -1 and 1 instead of 0 and 1, just linearly
normalize: take 2*s - 1, where s is the score above.
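
As a sketch, the linear normalization amounts to one line. The helper
name to_signed is hypothetical, and it assumes the input score already
lies in [0, 1]:

```python
def to_signed(score):
    """Map a score in [0, 1] linearly onto [-1, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    # 0 maps to -1, 0.5 maps to 0, 1 maps to 1.
    return 2.0 * score - 1.0


print(to_signed(0.0), to_signed(0.5), to_signed(1.0))
```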

In OpenCog, this (but without the normalization into [-1,1]) is how we
would measure similarity between two NumberNodes relative to a given
QuantitativeSchemaNode, consistent with our approach of quantile
normalization for predicatizing quantitative characters:

http://wiki.opencog.org/w/QuantitativePredicate

An advantage of rank-based approaches like this one is that they are
robust with respect to the wide variety of different probability
distributions one encounters in the real world...
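
One way to see this robustness, as a sketch rather than a proof: a
rank-based score depends only on the ordering of the numbers, so any
strictly increasing rescaling of the data (a logarithm here) leaves it
unchanged. The function rank_similarity and the sample values below are
illustrative assumptions of mine, not OpenCog code or real data.

```python
import math


def rank_similarity(x, y, sample):
    """Share of sample members outside [min(x, y), max(x, y)],
    relative to those outside plus those strictly between x and y."""
    lo, hi = min(x, y), max(x, y)
    outside = sum(1 for s in sample if s < lo or s > hi)
    between = sum(1 for s in sample if lo < s < hi)
    if outside + between == 0:
        return 1.0  # assumption: nothing on either side counts as fully similar
    return outside / (outside + between)


# A heavily skewed, made-up sample.
S = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

raw = rank_similarity(3, 21, S)
# Apply the same strictly increasing transform to x, y and every member of S.
logged = rank_similarity(math.log(3), math.log(21), [math.log(s) for s in S])

print(raw, logged)  # the two scores agree
```

A z-score difference, by contrast, would change under the same
transform, since the mean and standard deviation are not preserved by it.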

-- Ben G

On Thu, Feb 20, 2014 at 1:01 PM, Piaget Modeler
<[email protected]> wrote:

> Hi all,
>
> For all you statisticians out there...
>
> I'm working on an algorithm for numeric similarity and would like to
> crowdsource the solution.
>
> Given two numbers, i.e., two observations, how can I get a score between
> -1 and 1 indicating their proximity.
>
> I think I need to compute a few things,
>
> 1. Compute the *mean* of the observations.
> 2. Compute the standard deviation *sigma* of the observations.
> 3. Compute the *z-score* of each number.
>
> Once I know the z-score for each number I know where each number lies
> along the normal distribution.
>
> After that I'm a little lost.
>
> Is there a notion of difference or sameness after that?
>
> This might help..
>
>
> http://www.dkv.columbia.edu/demo/medical_errors_reporting/site010708/module3/0510-similar-numeric.html
>
> Your thoughts are appreciated ?
>
> Michael Miller.



-- 
Ben Goertzel, PhD
http://goertzel.org

"In an insane world, the sane man must appear to be insane". -- Capt. James
T. Kirk


