As I tried to explain, PM, there's no final measure of similarity, even between two integers. Integers can be interrelated | compared via a potentially infinite number of inverse arithmetic operations, each of which gives you a match (similarity) & a miss (difference). I can only give you a starting point, & a way to proceed from there: the simplest comparison is subtraction, which gives you the simplest absolute match: the smaller comparand...

Your request to quantify similarity between 1 & -1 is arbitrary, - the resolution of a match is a subset of the resolution of the comparands, which you didn't define. But what really matters is relative match: absolute match compared to a higher-level average match. That average is feedback down the hierarchy of search (I know you're having difficulty with that concept). And for lists it's even more complex, - they *consist* of numbers. So the problem of quantifying similarity is ultimately *the* problem of GI, & you shouldn't expect a simple answer to it.
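That starting point can be sketched as follows. This is only an illustration, assuming non-negative integer comparands; the names `compare`, `relative_match`, and the scalar `average_match` feedback are mine, not part of the algorithm:

```python
def compare(a, b):
    """First-power comparison by subtraction:
    match = the smaller comparand (magnitude shared by both),
    miss  = the difference (magnitude unique to the larger one).
    Assumes non-negative integers."""
    match = min(a, b)
    miss = abs(a - b)
    return match, miss

def relative_match(a, b, average_match):
    """Relative match: absolute match compared to a higher-level
    average match, here fed back as a plain number for illustration."""
    match, _ = compare(a, b)
    return match - average_match
```

E.g. `compare(3, 5)` gives match 3 and miss 2; whether that match is worth anything depends on the average match fed back from the higher level.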
On Thu, Feb 20, 2014 at 11:57 AM, Piaget Modeler <[email protected]> wrote:

> Thanks for your response Boris.
>
> My aim at the moment is to define a function for any two numbers a, b:
>
> Similarity(a, b) ::= c | c in [-1 .. +1]
>
> Examples:
>
> Similarity(0, 0) = 1.0
> Similarity(239420, 239420) = 1.0
> Similarity(3.1415926, 3.14) = 0.9995 /* or something close to but less than one */
> Similarity(-7123456789098765, -7123456789098765) = 1.0
>
> And so forth.
>
> From it I gather your suggestion, not algorithm, is:
>
> *"initial comparison between integers is by subtraction, which compresses miss from !AND to difference by cancelling opposite-sign bits, & increases match because it's complementary to that reduced difference.*
>
> *Division will further reduce the magnitude of miss by converting it from difference to ratio, which can then be reduced again by converting it to logarithm, & so on. By reducing miss, a higher power of comparison will also increase complementary match. But the costs may grow even faster, for both operations & incremental syntax to record incidental sign, fraction, & irrational fraction. The power of comparison is increased if current-power match plus miss predict an improvement, as indicated by higher-order comparison between results from different powers of comparison. Such "meta-comparison" can discover algorithms, or meta-patterns."*
>
> Similarity(number a, number b) ::= log( (a-b) / ????)
>
> This seems a bit confusing to me.
>
> Your thoughts?
>
> ~PM.
>
> ------------------------------
> Date: Thu, 20 Feb 2014 09:23:47 -0500
> Subject: Re: [agi] Numeric Similarity
> From: [email protected]
> To: [email protected]
>
> You finally got to the right starting point. This is covered in part 2 of my intro: http://www.cognitivealgorithm.info/
>
> *2. Comparison: quantifying match & miss per input.*
>
> The purpose of cognition is to predict, & prediction must be quantified.
> Algorithmic information theory defines predictability as compressibility of representations, which is perfectly fine. However, current implementations of AIT quantify compression only for whole sequences of inputs. To enable far more incremental selection (& correspondingly scalable search), I start by quantifying match between individual inputs. Partial match is a new dimension of analysis, additive to the binary same | different distinction of probabilistic inference. This is analogous to the way probabilistic inference improved on classical logic by quantifying partial probability of statements, vs binary true | false values.
>
> Individual partial match is compression of magnitude, by replacing the larger comparand with its difference relative to the smaller comparand. In other words, match is complementary to miss, initially equal to the smaller comparand. The ultimate criterion is recorded magnitude, rather than record space (bits of memory it occupies after compression), because the former represents the physical impact that we want to predict.
>
> This definition is tautological: smaller comparand = sum of Boolean AND between uncompressed (unary-code) representations of both comparands = partial identity of these comparands. Some may object that identity also includes the case when both comparands, or bits thereof, equal zero, but that identity also equals zero. Again, the purpose here is prediction, which is a representational equivalent of conservation in physics. We're predicting some potential impact on the observer, represented by an input. Zero input ultimately means zero impact, which has no conservable physical value (inertia), thus no intrinsic predictive value.
>
> Given incremental complexity of representation, initial inputs should have binary resolution. However, average binary match won't justify the cost of comparison: the syntactic overhead of representing new match & miss between positionally distinct inputs.
> Rather, these binary inputs are compressed by digitization within a position (coordinate): substitution of every two lower-order bits with one higher-order bit within an integer. Resolution of that coordinate (input aggregation span) is adjusted to form integers sufficiently large to produce (when compared) an average match that exceeds the above-mentioned costs of comparison. These are "opportunity costs": a longer-range average match discoverable by equivalent computational resources.
>
> So, the next order of compression is comparison across coordinates, initially defined with binary resolution as before | after input. Any comparison is an inverse arithmetic operation of incremental power: Boolean AND, subtraction, division, logarithm, & so on. Actually, since digitization already compressed inputs by AND, comparison of that power won't further compress the resulting integers. In general, match is *additive* compression, achieved only by comparison of a higher power than that which produced the comparands. Thus, initial comparison between integers is by subtraction, which compresses miss from !AND to difference by cancelling opposite-sign bits, & increases match because it's complementary to that reduced difference.
>
> Division will further reduce the magnitude of miss by converting it from difference to ratio, which can then be reduced again by converting it to logarithm, & so on. By reducing miss, a higher power of comparison will also increase complementary match. But the costs may grow even faster, for both operations & incremental syntax to record incidental sign, fraction, & irrational fraction. The power of comparison is increased if current-power match plus miss predict an improvement, as indicated by higher-order comparison between results from different powers of comparison. Such "meta-comparison" can discover algorithms, or meta-patterns.
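The escalating powers of comparison quoted above can be sketched as follows. This only illustrates how the magnitude of miss shrinks at each power while extra syntax accrues; it assumes positive integers and is not the full algorithm:

```python
import math

def compare_powers(a, b):
    """Compare two positive integers at increasing powers of comparison.
    Each power reduces the magnitude of miss, but adds incremental
    syntax to record it:
      subtraction -> difference (plus incidental sign),
      division    -> ratio (plus fraction),
      logarithm   -> log-ratio (plus irrational fraction)."""
    assert a > 0 and b > 0
    small, large = min(a, b), max(a, b)
    return {
        "difference": large - small,           # power 1: subtraction
        "ratio": large / small,                # power 2: division
        "log_ratio": math.log(large / small),  # power 3: logarithm
    }
```

For 8 vs 1024 the miss shrinks from difference 1016, to ratio 128.0, to log-ratio ~4.85; whether the shrinkage pays for the added syntax is what the "meta-comparison" between powers would evaluate.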
>
> On Thu, Feb 20, 2014 at 12:01 AM, Piaget Modeler <[email protected]> wrote:
>
>> Hi all,
>>
>> For all you statisticians out there...
>>
>> I'm working on an algorithm for numeric similarity and would like to crowdsource the solution.
>>
>> Given two numbers, i.e., two observations, how can I get a score between -1 and 1 indicating their proximity?
>>
>> I think I need to compute a few things:
>>
>> 1. Compute the *mean* of the observations.
>> 2. Compute the standard deviation *sigma* of the observations.
>> 3. Compute the *z-score* of each number.
>>
>> Once I know the z-score for each number, I know where each number lies along the normal distribution.
>>
>> After that I'm a little lost. Is there a notion of difference or sameness after that?
>>
>> This might help:
>> http://www.dkv.columbia.edu/demo/medical_errors_reporting/site010708/module3/0510-similar-numeric.html
>>
>> Your thoughts are appreciated.
>>
>> Michael Miller.

-------------------------------------------
AGI Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
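For reference, the three numbered steps in the original question could be sketched as below. Note that z-scores locate each observation within a sample rather than scoring pairwise similarity, which is part of why the thread moves away from this approach:

```python
import statistics

def z_scores(observations):
    """Steps 1-3 of the original question: mean, standard deviation,
    and the z-score of each observation within the sample."""
    mu = statistics.mean(observations)
    sigma = statistics.stdev(observations)
    return [(x - mu) / sigma for x in observations]
```

E.g. for observations [1.0, 2.0, 3.0] the z-scores are [-1.0, 0.0, 1.0]; mapping those to a pairwise similarity in [-1, +1] is the part the question leaves open.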
