Ben G. What ratio of the numbers in the sample S are greater than max(x,y) or 
less than min(x,y) , as opposed to lying between x and y?
Thanks Ben, these are good suggestions.  I will consider.
~PM / Michael.

Date: Fri, 21 Feb 2014 07:46:17 +0800
Subject: Re: [agi] Numeric Similarity
From: [email protected]
To: [email protected]


Do you mean: Given two numbers x and y drawn from a specific sample S of 
numbers (or a specific probability distribution D over the set of numbers)?   

Without this background S or D, the question is meaningless...
Given a distribution D, one can draw a sample S, of course; so the case where 
one has a sample S is sufficient to deal with.

One sensible measure would be: What ratio of the numbers in the sample S are 
greater than max(x,y) or less than min(x,y) , as opposed to lying between x and 
y?

This gives you 1 if x and y are identical or have no members of S between them; 
and 0 if x and y are the opposite endpoints of the sample.
If you want a scaling between -1 and 1 instead of 0 and 1, just linearly 
normalize...
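A minimal Python sketch of one reading of this measure follows. The function names rank_similarity and signed_rank_similarity are hypothetical, and two details are assumptions not stated above: sample members equal to x or y are left out of the count (so that the two boundary cases come out as exactly 1 and 0), and the score defaults to 1 when no other sample members remain.

```python
def rank_similarity(x, y, sample):
    """Fraction of the other sample members lying outside the interval [x, y].

    Assumption: values equal to either endpoint are ignored, so only
    members strictly outside or strictly between x and y are counted.
    """
    lo, hi = min(x, y), max(x, y)
    outside = sum(1 for s in sample if s < lo or s > hi)
    between = sum(1 for s in sample if lo < s < hi)
    total = outside + between
    if total == 0:
        # Assumption: with nothing to compare against, treat x and y as similar.
        return 1.0
    return outside / total


def signed_rank_similarity(x, y, sample):
    """Linear rescaling of the [0, 1] score into [-1, 1]."""
    return 2.0 * rank_similarity(x, y, sample) - 1.0


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up illustrative data
    print(rank_similarity(1, 10, sample))        # opposite endpoints
    print(rank_similarity(5, 5, sample))         # identical values
    print(rank_similarity(3, 6, sample))         # two members in between
    print(signed_rank_similarity(3, 6, sample))  # same pair, rescaled
```

Because only comparisons are used, the score depends on the ranks of x and y within S rather than on their raw magnitudes.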

In OpenCog, this (but without the normalization into [-1,1]) is how we would 
measure similarity between two NumberNodes relative to a given 
QuantitativeSchemaNode, consistent with our approach of quantile normalization 
for predicatizing quantitative characters:

http://wiki.opencog.org/w/QuantitativePredicate

An advantage of rank-based approaches like these is that they are robust 
with respect to the wide variety of different probability distributions one 
encounters in the real world...

-- Ben G




  



On Thu, Feb 20, 2014 at 1:01 PM, Piaget Modeler <[email protected]> 
wrote:




Hi all, 
For all you statisticians out there...
I'm working on an algorithm for numeric similarity and would like to 
crowdsource the solution.

Given two numbers, i.e., two observations, how can I get a score between -1 and 
1 indicating their proximity?
I think I need to compute a few things:

1. Compute the mean of the observations.
2. Compute the standard deviation sigma of the observations.
3. Compute the z-score of each number.
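The three steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name z_scores is hypothetical, the observations are treated as the whole population (so the population standard deviation is used), and the example data is made up.

```python
from statistics import mean, pstdev


def z_scores(observations):
    """Return the z-score of each observation relative to the whole list."""
    mu = mean(observations)       # step 1: mean of the observations
    sigma = pstdev(observations)  # step 2: population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("all observations are identical; z-scores are undefined")
    # step 3: z-score of each number
    return [(value - mu) / sigma for value in observations]


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative data
    for value, z in zip(data, z_scores(data)):
        print(value, round(z, 3))
```

If the observations are instead a sample from a larger population, statistics.stdev (the sample standard deviation) may be the more appropriate choice.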

Once I know the z-score for each number, I know where each number lies along the 
normal distribution.
After that I'm a little lost.  
Is there a notion of difference or sameness after that?

This might help:
http://www.dkv.columbia.edu/demo/medical_errors_reporting/site010708/module3/0510-similar-numeric.html

Your thoughts are appreciated.
Michael Miller.                                           


  
    
      


      
    
  





-- 
Ben Goertzel, PhD
http://goertzel.org

"In an insane world, the sane man must appear to be insane". -- Capt. James T. 
Kirk




  
    
      


      
    
  

                                          


-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
