Jim: Probabilistic reasoning has to be based on a frame of reference which defines, implicitly or explicitly, the necessary characteristics of what is being evaluated.
Boris: Right, but "defines" above may mean two things: modalities of original inputs that it measures, & the types of relationships among these inputs that it can discover: derivatives in my terms. Defining right modalities is important, but we already know that vision accounts for most of our data, & can indirectly discover all other modalities. The purpose of a GI algorithm is to discover relationships, & I suggest that they all can be reduced to / derived from atomic match & miss. Jim: So I happily agree with you. Just the fact that someone can act like he understands this seems like a novel experience to me. Boris: Same here. > Boris: In reality, expectations are rarely matched or missed precisely, so > the degree of confirmation must be quantified for individual events. > Quantifying partial match would add a micro-grayscale to the binary value of > events in Bayesian prediction, just like the latter added macro-grayscale > (partial probability) to binary (true| false) predictions of classical logic. Jim: Right, but you also have to define how these confirmations may be confirmed. Boris: "Confirmation" is quantified as match. And of course, on higher levels of search you will have match of a match, & so on. It's the same algorithm, applied to incrementally higher-syntax data. > Boris: Besides, the events are assumed to be high-level concepts, the kind > that occupy our conscious minds. But a scalable search algorithm must start > from sensory data processing that is subconscious for us, rather than depend > on human preprocessing. So, the choice of such initial inputs for BI & AIT > already shows a total lack of discipline in incrementing complexity: a fatal > fault for any attempt at scalability." Jim: I think you are missing something subtle. You cannot build knowledge on sensory data alone. You must also rely on the meaning that can be derived from that data. Boris: Of course, - my approach is hierarchical. Higher levels do "rely" on "meaning" derived on the lower levels. I think I already mentioned "incremental syntax" about a hundred times. Jim: This entails a kind of jumping around which can be likened to correlation (in its broadest sense). You are trying to solve for meaning by ruling it out as a useful input into reasoning. You may try defining raw sensory data processing as perception or pre-perception but these sub-categories can become illusions when you are wishing to understand how meaning is derived. Boris: Enlighten me, how? Jim: There is too much sensory data for it to be used as the direct basis for all insight, the data is noisy compared to what is seen as important, and there are few one-to-one associations between elementary sensation and insightful meaning. Boris: Cognition that produced our "meanings" is a long process of both individual, & especially civilization-wide, learning. I know it's hard to "visualize" how we got from something so simple & noisy, to all the science, technology, & social interactions of modern civilization. But our ability to learn is obviously innate, we only have 23K genes, & almost all of them have nothing to do with GI per se. Cavemen started with simple inputs & simple algorithms, all the rest must've been derived from them. Jim: By the way, I don't really agree that probabilistic / information theory methods can be used as the basis of AGI. Boris: Right, you want to use "semantics" & "meaning", but don't really know what that is. 
Jim: However, I am interested in what you are saying, and I am very curious about some of the other details that you have mentioned, so I do want to talk about this some more.

Boris: I am all ears. http://www.cognitivealgorithm.info/2012/01/cognitive-algorithm.html

On Sat, Aug 11, 2012 at 9:41 PM, Boris Kazachenko <bori...@verizon.net> wrote:

Boris: "AIT quantifies compression for sequences of inputs, while I define match for comparisons among individual inputs. On this level, a match is a lossless compression achieved by replacing a larger comparand with its derivative (miss) relative to the smaller comparand. In other words, a match is the complement of a miss. That's a deeper level of analysis, which I think can enable a far more incremental (thus potentially scalable) approach."
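(A minimal sketch, in Python, of that match/miss definition for single-variable inputs; the variable and function names are illustrative assumptions:)

    def compare(template, new_input):
        # Atomic comparison between a new input and a template (older input):
        # miss is the signed difference, match is the smaller comparand,
        # i.e. the complement of the miss within the larger comparand.
        miss = new_input - template
        match = min(new_input, template)
        return match, miss

    template, new_input = 7, 10
    match, miss = compare(template, new_input)   # match = 7, miss = 3
    # Lossless replacement: the larger comparand can be discarded and
    # recovered from the smaller one plus the signed miss.
    assert template + miss == new_input
    # Match is complementary to the miss within the larger comparand.
    assert match + abs(miss) == max(template, new_input)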
Jim: You are talking about an evaluation method that is derived from (or built on the scaffolding of) Bayesian Reasoning, right?

Boris: No, it's the other way around: Bayesian (probabilistic) inference should be built on evaluation of similarity (partial match) between individual inputs. The fact that it isn't is (to me) a fatal flaw of the former. Any probability is estimated from (& for) a sequence of instances; quantifying partial match (vs. assuming binary presence | absence) for each instance increases the depth of analysis by a whole new dimension.

My intro, part 7: "Two other approaches close to mine are Algorithmic Information Theory & Bayesian Inference, which use the same criteria as mine: compression & prediction. A good introduction is A Philosophical Treatise of Universal Induction by S. Rathmanner & M. Hutter. While an advance over static "frequentist" probability, BI & AIT still assume a "prior", which doesn't belong in a consistently inductive approach. To generalize it, Solomonoff introduced a universal prior: "a class of all models". The a priori infinity of this class means that he hits combinatorial explosion even *before* receiving actual inputs, - a "solution" that only a mathematician may find interesting. In my approach, the models are simply past inputs & correlations among them. Environmentally specific priors could speed up learning, but a general pattern discovery algorithm must be the core to which such short-cuts are added or from which they are removed.

Also perverse is the binary resolution of initial inputs in BI & AIT: confirmation / disconfirmation events. In reality, expectations are rarely matched or missed precisely, so the degree of confirmation must be quantified for individual events. Quantifying partial match would add a micro-grayscale to the binary value of events in Bayesian prediction, just like the latter added macro-grayscale (partial probability) to binary (true | false) predictions of classical logic.

Besides, the events are assumed to be high-level concepts, the kind that occupy our conscious minds. But a scalable search algorithm must start from sensory data processing that is subconscious for us, rather than depend on human preprocessing. So, the choice of such initial inputs for BI & AIT already shows a total lack of discipline in incrementing complexity: a fatal fault for any attempt at scalability."

On Wed, Aug 8, 2012 at 10:21 AM, Boris Kazachenko <bori...@verizon.net> wrote:

Jim, I agree with your focus on binary computational compression, but, as you said, that efficiency depends on specific operands. Even though low-power operations (addition) are more efficient for most data, it's the exceptions that matter. Most data is noise; what we care about is patterns. So, to improve both representational & computational compression, we need to quantify it for each operand / operation. And the atomic operation that quantifies compression is what I call comparison, which starts with an inverse, rather than a direct, arithmetic operation.

This reflects our basic disagreement, - you (& most logicians, mathematicians, & programmers) start from deduction / pattern projection, which is based on direct operations. And I think real GI must start from induction / pattern discovery, which is intrinsically an inverse operation. It's pretty dumb to generate / project patterns at random, vs. first discovering them in the real world & projecting accordingly.
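(A toy Python illustration of that direct-vs-inverse distinction: an inverse operation (subtraction) discovers a derivative from inputs already received, while a direct operation (addition) projects it forward. The linear-extrapolation example and the function names are assumptions of this sketch:)

    def compare(prev, curr):
        # Induction / pattern discovery: an inverse operation extracts
        # the derivative (difference) from received inputs.
        return curr - prev

    def project(curr, derivative):
        # Deduction / pattern projection: a direct operation extends a
        # previously discovered derivative into a prediction.
        return curr + derivative

    inputs = [3, 5, 7]
    d = compare(inputs[-2], inputs[-1])   # discovered difference: 2
    prediction = project(inputs[-1], d)   # projected next input: 9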
This is how I proposed to quantify compression (pattern strength) in my intro, part 2:

"AIT quantifies compression for sequences of inputs, while I define match for comparisons among individual inputs. On this level, a match is a lossless compression achieved by replacing a larger comparand with its derivative (miss) relative to the smaller comparand. In other words, a match is the complement of a miss. That's a deeper level of analysis, which I think can enable a far more incremental (thus potentially scalable) approach.

Given incremental complexity of representation, initial inputs should have binary resolution. However, an average binary match won't justify the cost of comparison, which adds a syntactic overhead of newly differentiated match & miss to positionally distinct inputs. Rather, these binary inputs are compressed by digitization: a selective carry, aggregated & then forwarded up the hierarchy of digits. This is analogous to hierarchical search, explained in the next chapter, where selected templates are compared & conditionally forwarded up the hierarchy of expansion levels: a "digital hierarchy" of a corresponding coordinate. Digitization is done on inputs within a shared coordinate, the resolution of which is adjusted by feedback. This resolution must form average integers that are large enough for an average match between them (a subset of their magnitude) to merit the above-mentioned costs of comparison.

Hence, the next order of compression is comparison across coordinates (initially defined with binary resolution, as before | after input). Any comparison is an inverse arithmetic operation of incremental power: Boolean AND, subtraction, division, logarithm, & so on. Binary match is a sum of AND: partial identity of uncompressed bit strings, & miss is !AND. Binary comparison is useful for digitization, but it won't further compress the integers produced thereby. In general, the products of a given-power comparison are further compressed only by a higher-power comparison between them, where match is the *additive* compression. Thus, initial comparison between digitized integers is done by subtraction, which increases match by compressing miss from !AND to difference, in which opposite-sign bits cancel each other via carry | borrow. The match is increased because it is the complement of the difference, equal to the smaller of the comparands.

All-to-all comparison across a 1D queue of pixels forms signed derivatives, complemented by which new inputs can losslessly & compressively replace older templates. At the same time, current input match determines whether individual derivatives are also compared (vs. aggregated), forming successively higher derivatives.

"Atomic" comparison is between a single-variable input & a template (older input):
Comparison: match = min (input, template), miss = dif (i - t), aggregated over the span of constant sign.
Evaluation: match - average_match_per_average_difference_match, formed on the next search level.
This evaluation is for comparing higher derivatives, vs. evaluation for higher-level inputs explained in part 3. It can also be increasingly complex, but I will need meaningful feedback to elaborate.

Division further reduces a difference to a ratio, which can then be reduced to a logarithm, & so on. Thus, complementary match is increased with the power of comparison. But the costs may grow even faster, for both the operations & the incremental syntax needed to record an incidental sign, fraction, or irrational fraction.
The power of comparison is increased if current match plus miss predict an improvement, as indicated by higher-order comparison between the results from different powers of comparison. This meta-comparison can discover algorithms, or meta-patterns..."

http://www.cognitivealgorithm.info/2012/01/cognitive-algorithm.html
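(A rough numeric illustration, in Python, of that ladder of comparison powers - subtraction, division, logarithm. The function name and the choice of natural log are assumptions; the sketch only shows how the residual miss shrinks in magnitude, ignoring the representation costs - sign, fraction, irrational fraction - that the quoted passage warns may grow faster:)

    import math

    def miss_by_power(template, new_input, power):
        # The residual ("miss") needed to reconstruct new_input from template,
        # under comparisons of increasing power. As the power rises the
        # residual shrinks, so the complementary match grows.
        if power == 1:
            return new_input - template            # subtraction: difference
        if power == 2:
            return new_input / template            # division: ratio
        if power == 3:
            return math.log(new_input / template)  # logarithm of the ratio
        raise ValueError("unsupported power")

    template, new_input = 96, 100
    for p in (1, 2, 3):
        print(p, miss_by_power(template, new_input, p))
    # 1 -> 4        (difference)
    # 2 -> ~1.0417  (ratio)
    # 3 -> ~0.0408  (log of the ratio)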