AIXItl; Wolfram's hypothesis (was Re: [agi] How valuable is Solmononoff Induction for real world AGI?)

2007-11-10 Thread Tim Freeman
From: Lukasz Stafiniak [EMAIL PROTECTED]
The programs are generally required to exactly match in AIXI (but not
in AIXItl I think).

I'm pretty sure AIXItl wants an exact match too.  There isn't anything
there that lets the theoretical AI guess probability distributions and
then get scored based on how probable the actual world is according to
that distribution -- each hypothesis is either right or wrong, and
wrong hypotheses are discarded.
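To make the exact-match point concrete, here is a toy Python sketch of the
flavor of the thing (this is my own illustration, not Hutter's construction;
the "programs" and names are made up):

    # Toy illustration: hypotheses either reproduce the observed history
    # exactly or are discarded; the survivors are weighted by 2^-length.
    # Real AIXI/AIXItl enumerate programs for a universal machine; here the
    # "programs" are just stand-in (name -> predicted output) pairs.

    def surviving_weights(programs, history):
        weights = {}
        for prog, output in programs.items():
            if output.startswith(history):           # exact match required
                weights[prog] = 2.0 ** (-len(prog))  # shorter = more weight
            # a wrong hypothesis gets no partial credit; it is simply dropped
        total = sum(weights.values())
        return {p: w / total for p, w in weights.items()} if total else {}

    programs = {
        "repeat01": "0101010101",
        "allzeros": "0000000000",
        "flipones": "0111111111",
    }
    print(surviving_weights(programs, "0101"))   # only "repeat01" survives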

The reference I use for AIXItl is:

http://www.hutter1.net/ai/aixigentle.htm

On Nov 9, 2007 5:26 AM, Edward W. Porter [EMAIL PROTECTED] wrote:
 are these short codes sort of like Wolfram little codelettes,
 that can hopefully represent complex patterns out of very little code, or do
 they pretty much represent subsets of visual patterns as small bit maps.

From: Lukasz Stafiniak [EMAIL PROTECTED]
It depends on reality, whether the reality supports Wolfram's hypothesis.

I'm guessing you mean the Principle of Computational Equivalence,
as defined at:

   http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html

He's saying that 'systems found in the natural world can perform
computations up to a maximal (universal) level of computational
power'.  All the AIXI family needs to be near-optimal is for the
probability distribution of possible outcomes to be computable.  I
couldn't quickly tell whether Wolfram is saying that the actual
outcomes are computable, or just the probabilities of the outcomes.

-- 
Tim Freeman   http://www.fungible.com   [EMAIL PROTECTED]



Re: AIXItl; Wolfram's hypothesis (was Re: [agi] How valuable is Solmononoff Induction for real world AGI?)

2007-11-10 Thread Lukasz Stafiniak
On Nov 10, 2007 4:47 PM, Tim Freeman [EMAIL PROTECTED] wrote:
 From: Lukasz Stafiniak [EMAIL PROTECTED]
 The programs are generally required to exactly match in AIXI (but not
 in AIXItl I think).

 I'm pretty sure AIXItl wants an exact match too.  There isn't anything
 there that lets the theoretical AI guess probability distributions and
 then get scored based on how probable the actual world is according to
 that distribution -- each hypothesis is either right or wrong, and
 wrong hypotheses are discarded.

I agree that I misinterpreted the meaning of exact match.
AIXItl uses strategies whose outputs do not need to agree with history.



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Lukasz Stafiniak
On Nov 9, 2007 5:26 AM, Edward W. Porter [EMAIL PROTECTED] wrote:
 ED ## what is the value or advantage of conditional complexities
 relative to conditional probabilities?

Kolmogorov complexity is universal. For probabilities, you need to
specify the probability space and initial distribution over this
space.
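(Concretely, if I recall Hutter's notation correctly, the universal prior
plays the role of that initial distribution once and for all:

    M(x)  =  sum over programs p such that U(p) outputs a string starting with x  of  2^(-length(p))

where U is a fixed universal TM.  The only arbitrary choice left is U itself,
and changing U only shifts program lengths by a constant.)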

 ED ## What's a TM?
(Turing Machine, or a code for a universal Turing Machine = a program...)

 Also are you saying that the system would develop programs for matching
 patterns, and then patterns for modifying those patterns, etc, So that
 similar patterns would be matched by programs that called a routine for a
 common pattern, but then other patterns to modify them to fit different
 perceptions?

Yes, these programs will be compact descriptions of the data when enough
data gets collected, so their (posterior) probability will grow with
time. But the most probable programs will be very cryptic, without
redundancy to make the structure evident.

 So are the programs just used for computing Kolmogorov complexity or are
 they also used for generating and matching patterns.

It is difficult to say: in AIXI, the direct operation is governed by
the expectimax algorithm, but the algorithm works on the future (it is
derived from the Solomonoff predictor). Hutter mentions an alternative
model, AIXI_alt, which models actions the same way as the
environment...

 Does it require that the programs exactly match a current pattern being
 received, or does it know when a match is good enough that it can be relied
 upon as having some significance?

It is automatic: when you have a program with a good enough match,
then you can parameterize it over the difference and apply it twice,
thus saving the code. Remember that the programs need to represent the
whole history.
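(A toy Python illustration of "parameterize over the difference"; the data
and names are invented:)

    # Two similar observations: instead of coding both literally, code the
    # shared structure once and apply it twice with the small difference
    # as a parameter.  The history is still reproduced exactly.

    face_a = "eyes nose mouth smile"
    face_b = "eyes nose mouth frown"

    def face(expression):                  # the shared pattern, parameterized
        return "eyes nose mouth " + expression

    history = [face("smile"), face("frown")]
    assert history == [face_a, face_b]     # exact match, with less code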

 Can the programs learn that similar but different patterns are different
 views of the same thing?
 Can they learn a generalizational and compositional hierarchy of patterns?

With an exegetic enough interpretation...

I will comment on further questions in a few hours.



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Edward W. Porter
Thank you for your reply.  I want to take some time and compare this with
the reply I got from Shane Legg and get back to you when I have more time
to think about it.

Edward W. Porter
Porter  Associates
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]






Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Lukasz Stafiniak
On Nov 9, 2007 5:26 AM, Edward W. Porter [EMAIL PROTECTED] wrote:

 So are the programs just used for computing Kolmogorov complexity or are
 they also used for generating and matching patterns.

The programs do not compute K complexity, they (their length) _are_ (a
variant of) Kolmogorov complexity. The programs compute (predict) the
environment.

 Does it require that the programs exactly match a current pattern being
 received, or does it know when a match is good enough that it can be relied
 upon as having some significance?

The programs are generally required to exactly match in AIXI (but not
in AIXItl I think). But the significance is provided by the
compression of the representation of similar things, which favors the same
sort of similarity in the future.

 Can they run on massively parallel processing.

I think they can... In AIXI, you would build a summation tree for the
posterior probability.
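What I have in mind is just an ordinary pairwise reduction, which
parallelizes level by level; a toy Python sketch with made-up weights:

    # Summing program weights as a balanced tree: every level could be done
    # in parallel, so N terms take O(log N) parallel steps.

    def tree_sum(weights):
        while len(weights) > 1:
            paired = [a + b for a, b in zip(weights[::2], weights[1::2])]
            if len(weights) % 2:                 # carry an odd leftover up
                paired.append(weights[-1])
            weights = paired
        return weights[0]

    print(tree_sum([2.0 ** -k for k in range(1, 9)]))   # toy posterior normalizer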

 The Hutters expectimax tree appears to alternate levels of selection and
 evaluation.   Can the Expectimax tree run in reverse and in parallel, with
 information coming up from low sensory levels, and then being selected based
 on their relative probability, and then having the selected lower level
 patterns being fed as inputs into higher level patterns and then repeating
 that process.  That would be a hierarchy that alternates matching and then
 selecting the best scoring match at alternate levels of the hierarchy as is
 shown in the Serre article I have cited so many times before on this list.

To be optimal, the expectimax must be performed chronologically from
the end of the horizon (dynamic programming principle: close to the
end of the time horizon, you have smaller planning problems -- fewer
opportunities; from the solutions to those smaller problems you build
solutions to bigger ones, backwards in time). But the probabilities are
conditional on all current history including low sensory levels.

(Generally, your comment above doesn't make much sense in the AIXI context.)
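(If it helps, here is a minimal backward-induction sketch of expectimax in
Python; the environment model is an invented toy, not AIXI's Solomonoff
mixture:)

    # Expectimax by dynamic programming: values at the horizon are 0,
    # and the value at time t is built from the smaller problems at t+1.

    ACTIONS = ["a0", "a1"]
    OBS = ["o0", "o1"]

    def p_obs_reward(history, action, obs):
        """Toy stand-in for P(obs | history, action) and the reward."""
        p = 0.7 if obs == "o0" else 0.3
        r = 1.0 if (action == "a1" and obs == "o0") else 0.0
        return p, r

    def value(history, t, horizon):
        if t == horizon:
            return 0.0
        best = float("-inf")
        for a in ACTIONS:                      # max over actions
            expected = 0.0
            for o in OBS:                      # expectation over observations
                p, r = p_obs_reward(history, a, o)
                expected += p * (r + value(history + [(a, o)], t + 1, horizon))
            best = max(best, expected)
        return best

    print(value([], 0, 3))    # optimal expected reward over a 3-step horizon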

 ED## are these short codes sort of like Wolfram little codelettes,
 that can hopefully represent complex patterns out of very little code, or do
 they pretty much represent subsets of visual patterns as small bit maps.

It depends on reality, whether the reality supports Wolfram's hypothesis.

Best Regards.



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Edward W. Porter
Jeff,

(to make it easier to know who is responding to whom, if any of this is
cut into postings by others I have inserted a “” before “JEF ##” to
indicate his comments occurred first in time.)

JEF ## Edward,  can you explain what you might have meant by based
on the likelihood that the probability...?

ED ## I think my statement --  “Dragon selected speech recognition
word candidates based on the likelihood that the probability distribution
of their model matched the acoustic evidence” -- maps directly into your
statement that -- “likelihood is simply the probability of some data.”

The probability of some data given a model’s probability distribution
can, I think, be properly considered a match between the distribution of
the data and the distribution of the model.  Maybe in Jef-speak that is
not a proper use of the word “match”, but I think in normal parlance, even
in computer science it is.  Correct me if I am wrong.

Remember at Dragon we were scoring multiple different word models’
probability distributions against the acoustic data, and those scores were
considered to indicate a degree of match between the model and the data.

Jef ## ’Given all the relevant parameters’ is key, and implies
objectivity.  Without all the relevant parameters of the likelihood
function, you are left with probability, which is inherently subjective.
When you said based on the likelihood that the probability, it seemed
that you were somehow (?) confusing the subjective with the objective,
which in my opinion, is a theme running through this entire thread.”

ED ## According to the above statement all likelihood functions that
are computable are subjective, and thus according to your definition just
“probabilities.”  This is because it is impossible for a computable
likelihood function to include all possibly relevant parameters.  No
computable system knows enough about the world to know what the relevant
parameters are.  There always could be an, as yet, un-modeled glitch in
the Matrix.  Thus, your implication that I had somehow confused the
correct definition of likelihood, which would have it be “objective”, with
one that was subjective (because it did not use all relevant parameters),
would seem to make me guilty of an offense committed by any person who has
ever talked about actual likelihood calculations (which would include a
majority of the people in the field).

Again my offense seems to be using words as most in the field do, rather
than in strict adherence to Jef-speak.

Jef ## How does this map onto your difficulty grasping the
significance of Solomonoff induction? Solomonoff induction is an idealized
description of learning by a subjective agent interacting with an
objective  (actually consistent might be more accurate here) reality.

ED ##  Finally, I am learning what our whole back and forth has been
about.  I wish our correspondence had included more sentences like this
earlier on.

But if I am guilty of using likelihoods in a way that sullies them by
making them subjective, how does that make them any worse than Solomonoff
induction?  According to the above, isn't it guilty of the same lack of
purity, because it describes learning by a “subjective” agent?

Or are you claiming Solomonoff induction is an objective description
of a subjective thing?  Words are often stretched so far (although I
thought not in Jef-speak).

But if Solomonoff induction is based on generalizations assuming knowledge
about things it can never know, how is it any less “subjective” than
a likelihood function calculated without all relevant parameters?  Does
pretending we know everything about reality make our understanding of it
any less subjective?

Pretending can allow some useful thought experiments, but are they
objective?

Are mathematical proofs objective?  How do we know they are based on all
the relevant parameters?

Isn't math just a creation in our heads, and thus subjective?  Yes,
scientific evidence suggests it describes some real things in the real
world, really well, but that is all based on sensation, and that,
according to you is subjective.


Ed Porter

P.S. Since your hobby is collecting paradoxes, if you have a few that are
either particularly insightful or amusing (and hopefully only a sentence
or two long), please feel free to share.

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Friday, November 09, 2007 2:46 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:

 ED  Most importantly you say my alleged confusion between
 subjective and objective maps into my difficulty to grasp the
 significance of Solomonoff induction. If you could do so, please
 explain what you mean.

Given our significantly disjoint backgrounds, the best I hoped for was to
point out where you're not going to get a good answer because you're not
asking a good question

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
I recently found this paper to contain some thinking worthwhile to the
considerations in this thread.

http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf

- Jef



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Benjamin Goertzel



 Is there any research that can tell us what kind of structures are better
 for machine learning?  Or perhaps w.r.t a certain type of data?  Are there
 learning structures that will somehow learn things faster?


There is plenty of knowledge about which learning algorithms are better for
which problem classes.

For example, there are problems known to be deceptive (not efficiently
solved) for genetic programming, yet that are known to be efficiently
solvable by MOSES, the probabilistic program learning method used in
Novamente (from Moshe Looks' PhD thesis, see metacog.org)





 Note that, if the answer is negative, then the choice of learning
 structures is arbitrary and we should choose the most developed / heavily
 researched ones (such as first-order logic).



The choice is not at all arbitrary; but the knowledge we have to guide the
choice is currently very incomplete.  So one has to make the right intuitive
choice based on integrating the available information.  This is part of why
AGI is hard at the current level of development of computer science.


-- Ben G


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread YKY (Yan King Yin)
My impression is that most machine learning theories assume a search space
of hypotheses as a given, so it is out of their scope to compare *between*
learning structures (eg, between logic and neural networks).

Algorithmic learning theory - I don't know much about it - may be useful
because it does not assume a priori a learning structure (except that of a
Turing machine), but then the algorithmic complexity is incomputable.

Is there any research that can tell us what kind of structures are better
for machine learning?  Or perhaps w.r.t a certain type of data?  Are there
learning structures that will somehow learn things faster?

Note that, if the answer is negative, then the choice of learning structures
is arbitrary and we should choose the most developed / heavily researched
ones (such as first-order logic).

YKY


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread YKY (Yan King Yin)
Thanks for the input.

There's one perplexing theorem, in the paper about the algorithmic
complexity of programming, that the language doesn't matter that much, ie,
the algorithmic complexity of a program in different languages only differs
by a constant.  I've heard something similar about the choice of Turing
machine only affecting the Kolmogorov complexity by a constant.
(I'll check out the proof of this one later.)
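(The statement I have in mind is the invariance theorem: for any two
Turing-complete languages (or universal machines) U and V there is a
constant c_{U,V}, independent of the string x, such that

    K_U(x)  <=  K_V(x) + c_{U,V}

where c_{U,V} is roughly the length of an interpreter for V written in U.
I'm quoting this from memory, so the exact form may be off.)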

But it seems to suggest that the choice of the AGI's KR doesn't matter.  It
can be logic, a neural network, or Java?  That's kind of
a strange conclusion...

YKY


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, YKY (Yan King Yin) [EMAIL PROTECTED] wrote:

 My impression is that most machine learning theories assume a search space
 of hypotheses as a given, so it is out of their scope to compare *between*
 learning structures (eg, between logic and neural networks).

 Algorithmic learning theory - I don't know much about it - may be useful
 because it does not assume a priori a learning structure (except that of a
 Turing machine), but then the algorithmic complexity is incomputable.

 Is there any research that can tell us what kind of structures are better
 for machine learning?

Not if all problems are equi-probable.
http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization

However this is unlikely in the real world.

It does, however, give an important lesson: put as much information as
you have about the problem domain into the algorithm and
representation as possible, if you want to be at all efficient.

This form of learning is only a very small part of what humans do when
we learn things. For example when we learn to play chess, we are told
or read the rules of chess and the winning conditions. This allows us
to create tentative learning strategies/algorithms that are much
better than random at playing the game and also giving us good
information about the game. Which is how we generally deal with
combinatorial explosions.

Consider a probabilistic learning system (a TM) based on statements about the
real world.  Without this ability to alter how it learns and what it
tries, it would be looking at the probability that a bird
tweeting is correlated with its opponent winning, and also trying to
figure out whether emptying an ink well over the board is a valid
move.

I think Marcus Hutter has a bit about how slow AIXI would be at
learning chess somewhere in his writings, due to only getting a small
amount of information (1 bit?) per game about the problem domain.  My
memory might be faulty and I don't have time to dig at the moment.

  Or perhaps w.r.t a certain type of data?  Are there
 learning structures that will somehow learn things faster?

Thinking in terms of fixed learning structures is IMO a mistake.
Interestingly, AIXI doesn't have fixed learning structures per se, even
though it might appear to. Because it stores the entire history of the
agent and feeds it to each program under evaluation, each of these may
be a learning program and be able to create learning strategies from
that data. You would have to wait a long time for these types of
programs to become the most probable if a good prior was not given to
the system though.


 Will Pearson



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID

VLADIMIR “Hutter shows that prior can be selected rather arbitrarily
without giving up too much”

ED Yes.  I was wondering why the Solomonoff Induction paper made such
a big stink about picking the prior (and then came up with a choice that
struck me as being quite sub-optimal in most of the types of situations
humans deal with).  After you have a lot of data, you can derive the
equivalent of the prior from frequency data.  As the Solomonoff Induction
paper showed, using Bayesian formulas the effect of the prior fades off
fairly fast as data comes in.
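(Here is a toy numerical illustration of that fading, using a coin and two
quite different Beta priors; the numbers are mine, not from the paper:

    # Posterior mean of a coin's bias under two very different Beta priors.
    # As the flips accumulate, both posteriors converge on the data.

    def posterior_mean(prior_a, prior_b, heads, tails):
        return (prior_a + heads) / (prior_a + prior_b + heads + tails)

    for n in (0, 10, 100, 1000):
        heads = int(0.7 * n)                    # the data runs 70% heads
        tails = n - heads
        optimistic = posterior_mean(9.0, 1.0, heads, tails)   # prior mean 0.9
        skeptical  = posterior_mean(1.0, 9.0, heads, tails)   # prior mean 0.1
        print(n, round(optimistic, 3), round(skeptical, 3))

After 1000 flips the two estimates differ by less than 0.01, even though the
priors started at 0.9 and 0.1.)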

(However, I have read that for complex probability distributions the
choice of the class of mathematical model you use to model the
distribution is part of the prior choosing issue, and can be important —
but that did not seem to be addressed in the Solomonoff Induction paper.
For example in some speech recognition each of the each speech frame model
has a pre-selected number of dimensions, such as FFT bins (or related
signal processing derivatives), and each dimension is not represented by a
Gausian but rather by a basis function comprised of a set of a selected
number of Gausians.)

It seems to me that when you don’t have much frequency data, we humans
normally make a guess based on the probability of similar things, as
suggested in the Kemp paper I cited.  It seems to me that is by far the
most commonsensical approach.  In fact, due to the virtual omnipresence
of non-literal similarity in everything we see and hear (e.g., the same
face virtually never hits V1 exactly the same way), most of our probabilistic
thinking is dominated by similarity-derived probabilities.

BEN GOERTZEL WROTE IN HIS Thu 11/8/2007 6:32 AM POST

BEN [referring to Vlad’s statement about AIXI’s
uncomputability] “Now now, it doesn't require infinite resources -- the
AIXItl variant of AIXI only requires an insanely massive amount of
resources, more than would be feasible in the physical universe, but not
an infinite amount ;-) “

ED So, from a practical standpoint, which is all I really care about,
is it a dead end?

Also, do you, or anybody, know if Solmononoff (the only way I can
remember the name is “Soul man on off” like Otis Redding with a microphone
problem) Induction has the ability to deal with deep forms of non-literal
similarity matching in its complexity calculations.  And if so, how?  And if
not, isn't it brain dead?  And if it is brain dead, why is such a bright
guy as Shane Legg spending his time on it?


YAN KING YIN IN HIS 11/8/2007 9:16 AM POST SAID

YAN Is there any research that can tell us what kind of structures are
better for machine learning?  Or perhaps w.r.t a certain type of data?
Are there learning structures that will somehow learn things faster?

ED  Yes, brain science.  It may not point out the best possible
architecture, but it points out one that works.  Evolution is not
theoretical, and not totally optimal, but it is practical.  Systems like
Novamente, which is loosely based on many key ideas from brain science,
probably have a much better chance of getting useful stuff up and
running soon than more theoretical approaches, because the search space
has already been narrowed by many trillions of trials and errors over
hundreds of millions of years.

Ed Porter


RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread John G. Rose
 From: Jef Allbright [mailto:[EMAIL PROTECTED]
 
 I recently found this paper to contain some thinking worthwhile to the
 considerations in this thread.
 
 http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf
 

This is an excellent paper, not only on the subject of code reuse but also on
the techniques and tools used to tackle such a complicated issue.  Code reuse
is related to code generation, which some AGIs would make use of, as is any
other type of language generation, formal or otherwise.

John



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter

Jef,

The paper cited below is more relevant to Kolmogorov complexity than
Solomonoff induction.  I had thought about the use of subroutines before I
wrote my questioning critique of Solomonoff Induction.

Nothing in it seems to deal with the fact that the descriptive length of
reality’s computations that create an event (the descriptive length that
is more likely to affect the event’s probability), is not necessarily
correlated with the descriptive length of sensations we receive from such
events.  Nor is it clear that it deals with the fact that much of the
frequency data a world-sensing brain derives its probabilities from is
full of non-literal similarity, meaning that non-literal matching is a key
component of any capable AGI.  It does not indicate how the complexity of
that non-literal matching, at the sensation level rather than the
reality-generating level, is to be dealt with by Solomonoff Induction: is it
part of the complexity involved in its hypotheses (or semi-measures) or
not, and to what extent, if any, should it be?

With regard to the paper you cited I disagree with its statement that the
measure of the complexity of a program written using a library should be
the size of the program and the size of the library it uses.  Presumably
this was a mis-statement, because it would make all but the very largest
programs that used the same vast library relatively close in size,
regardless of the relative complexity of what they do.  I assume it really
should be the length of the program plus only each of the library routines
it actually uses, independent of how many times it uses them. Anything
else would mean that

To make this discussion relevant to practical AGI, let's assume the program
from which Kolmogorov complexity is computed is a Novamente-class machine
up and running with world knowledge in say five to ten years.  Assume the
system has compositional and generalizational hierarchies providing it
with the representational efficiencies Jeff Hawkins describes for
hierarchical memory.

In such a system much of what determines what happens lies in its
knowledge base.  I assume the length of any knowledge base components used
would also have to be counted in the Kolmogorov complexity.

But would one only count the knowledge structures actually found to match,
or also the ones that were match candidates, but lost out, when
calculating such complexity?  Any ideas?

Ed Porter

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 9:56 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


I recently found this paper to contain some thinking worthwhile to the
considerations in this thread.

http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf

- Jef




Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, YKY (Yan King Yin) [EMAIL PROTECTED] wrote:

 Thanks for the input.

 There's one perplexing theorem, in the paper about the algorithmic
 complexity of programming, that the language doesn't matter that much, ie,
 the algorithmic complexity of a program in different languages only differ
 by a constant.  I've heard something similar about the choice of Turing
 machines only affect the Kolmogorov complexity by a constant.  (I'll check
 out the proof of this one later.)


This only works if the languages are Turing complete, so that they
can append a description of a program that converts from the language
in question to the native one, in front of the non-native program.

Also, constant might not mean negligible: 2^^^9 is a constant (where ^
is Knuth's up-arrow notation).

 But it seems to suggest that the choice of the AGI's KR doesn't matter.  It
 can be logic, neural network, or java?  That's kind of a strange
 conclusion...


Only some neural networks are Turing complete.  First-order logic
should be; propositional logic not so much.

 Will Pearson



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, Jef Allbright [EMAIL PROTECTED] wrote:
 I'm sorry I'm not going to be able to provide much illumination for
 you at this time.  Just the few sentences of yours quoted above, while
 of a level of comprehension equal or better than average on this list,
 demonstrate epistemological incoherence to the extent I would hardly
 know where to begin.

 This discussion reminds me of hot rod enthusiasts arguing passionately
 about how to build the best racing car, while denigrating any
 discussion of entropy as outside the practical.


You are greatly overstating the case.  Entropy can be used to make
predictions about chemical reactions and help design systems. UAI so
far has yet to prove its usefulness. It is just a mathematical
formalism that is incomplete in a number of ways.

1) It doesn't treat computation as outputting to the environment, and thus
can have no concept of saving energy or avoiding interference with
other systems by avoiding computation. The lack of energy saving means
it is not a valid model for solving the problem of being a
non-reversible intelligence in an energy-poor environment (which
humans are and most mobile robots will be).

2) It is based on Sequential Interaction Machines, rather than
Multi-Stream Interaction Machines, which means it might lose out on
expressiveness as talked about here.

http://www.cs.brown.edu/people/pw/papers/bcj1.pdf

It is the first step on an interesting path, but it is too divorced
from what computation actually is, for me to consider it equivalent to
the entropy of AI.

 Will Pearson



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Derek,



Thank you.



I think the list should be a place where people can debate and criticize
ideas, but I think poorly reasoned and insulting flames like Jef's are
not helpful, particularly if they are driving potentially valuable
contributors like you off the list.



Luckily such flames are relatively rare.  So I think we should criticize
such flames when they do occur, so there are fewer of them, and try to have
tougher skin, so that when they occur we don't get too upset by them (and
so that we don’t ourselves get drawn into flame mode by tough but fair
criticisms).



I just re-read my email that sent Jef into such a tizzy.  Although it was
not hostile, it was not as tactful as it could have been.  In my attempt
to respond quickly I did not intend to attack him or his paper (other
than one apparent mis-statement in it), but rather to say it didn’t relate
to the issue I was specifically interested in.  I actually thought it was
quite an interesting article.  I wish in hindsight I had said so.  At the
end of the post that upset Jef I actually asked how the Kolmogorov
complexity measure his paper discussed might be applied to a given type of
AGI.  That was my attempt to acknowledge the importance of what the paper
dealt with.



Let's just hope that going forward most people can take fair attempts to debate,
question, or attack their ideas without flipping out, and I guess we all
should spend an extra 5% more time in our posts trying to be tactful.



And let's just hope more people like you start contributing again.



Ed Porter




-Original Message-
From: Derek Zahn [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 3:05 PM
To: agi@v2.listbox.com
Subject: RE: [agi] How valuable is Solmononoff Induction for real world
AGI?


Edward,

For some reason, this list has become one of the most hostile and
poisonous discussion forums around.  I admire your determined effort to
hold substantive conversations here, and hope you continue.  Many of us
have simply given up.





RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Jeff,

In your below flame you spent much more energy conveying contempt than
knowledge.  Since I don’t have time to respond to all of your attacks, let
us, for example, just look at the last two:

MY PRIOR POST ...affect the event's probability...

JEF’S PUT DOWN 1 More coherently, you might restate this as ...reflect
the event's likelihood...

MY COMMENT At Dragon System, then one of the world’s leading speech
recognition companies, I was repeatedly told by our in-house PhD in
statistics that “likelihood” is the measure of a hypothesis matching, or
being supported by, evidence.  Dragon selected speech recognition word
candidates based on the likelihood that the probability distribution of
their model matched the acoustic evidence provided by an event, i.e., a
spoken utterance.

Similarly, if one is drawing balls from a bag that has a distribution of
black and white balls, the color of a ball produced by a given random
drawing from the bag has a probability based on the distribution in the
bag.  But one uses the likelihood function to infer the probability
distribution in the bag, from among multiple possible distribution
hypotheses, by how well each distribution matches the data produced by events.
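(To put numbers on that, here is a toy calculation with two made-up
hypotheses about the bag:

    # Two hypotheses about the bag: 30% black balls versus 70% black balls.
    # The likelihood of each hypothesis given 8 black draws out of 10:

    from math import comb

    def likelihood(p_black, black, total):
        return comb(total, black) * p_black**black * (1 - p_black)**(total - black)

    print(likelihood(0.3, 8, 10))   # about 0.0014
    print(likelihood(0.7, 8, 10))   # about 0.23 -- far better supported

The data "scores" the 70%-black hypothesis much higher, which is all I meant
by the distribution matching the data.)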


The creation of the “event” by external reality I was referring to is much
more like the utterance of a word (and its observation) or the drawing of
a ball (and the observation of its color) from a bag than a determination of
what probability distribution most likely created it.  On the other hand,
trying to understand the complexity that gives rise to such an event would
be more like using likelihoods.

When I wrote the post you have flamed I was not worrying about trying to
be exact in my memo, because I had no idea people on this list wasted
their energy correcting such common inexact usages as switching
“probability” and “likelihood.”  But as it turns out in this case my
selection of the word “probability” rather than “likelihood” seems to be
totally correct.


MY PRIOR POST ...the descriptive length of sensations we receive...

JEF’S PUT DOWN 2 Who is this we that receives sensations?  Holy
homunculus, Batman, seems we have a bit of qualia confusion thrown into
the mix!

MY COMMENT Again I did not know that I would be attacked for using
such a common English usage as “we” on this list.  Am I to assume that
you, Jef, never use the words “we” or “I” because you are surrounded by
“friends” so kind as to rudely say “Holy homunculus, Batman” every time
you do.

Or, just perhaps, are you a little more normal than that.

In addition, the use of the word “we” or even “I” does not necessarily imply
a homunculus.  I think most modern understanding of the brain indicates
that human consciousness is most probably -- although richly
interconnected -- a distributed computation that does not require a
homunculus.  I like and often use Bernard Baars’ Theater of Consciousness
metaphor.

But none of this means it is improper to use the words “we” or “I” when
referring to ourselves or our consciousnesses.

And I think one should be allowed to use the word “sensation” without
being accused of “qualia confusion.”  Jeff, do you ever use the word
“sensation,” or would that be too “confusing” for you?


So, Jeff, if Solomonoff induction is really a concept that can help me get
a more coherent model of reality, I would really appreciate someone who
had the understanding, intelligence, and friendliness to at least try in
relatively simple words to give me pointers as to how and why it is so
important, rather than someone who picks apart every word I say with minute or
often incorrect criticisms.

A good example of the type of friendly effort I appreciate is in Lukasz
Stafiniak's 11/08/07 11:54 AM post to me, which I have not had time yet to
fully understand, but which I greatly appreciate for its focused and
helpful approach.

Ed Porter

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 12:55 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:

 Jef,

 The paper cited below is more relevant to Kolmogorov complexity than
 Solomonoff induction.  I had thought about the use of subroutines
 before I wrote my questioning critique of Solomonoff Induction.

 Nothing in it seems to deal with the fact that the descriptive length
 of reality's computations that create an event (the descriptive length
 that is more likely to affect the event's probability), is not
 necessarily correlated with the descriptive length of sensations we
 receive from such events.

Edward -

I'm sorry I'm not going to be able to provide much illumination for you at
this time.  Just the few sentences of yours quoted above, while of a level
of comprehension equal or better than average on this list, demonstrate
epistemological incoherence to the extent I would hardly know where to
begin.

This discussion reminds

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Derek Zahn
Edward,
 
For some reason, this list has become one of the most hostile and poisonous 
discussion forums around.  I admire your determined effort to hold substantive 
conversations here, and hope you continue.  Many of us have simply given up.


RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Cool!


-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 12:56 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?




Yeah, we use Occam's razor heuristics in Novamente, and they are commonly
used throughout AI.  For instance in evolutionary program learning one
uses a parsimony pressure which automatically rates smaller program
trees as more fit...
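A minimal sketch of what a parsimony pressure can look like (the scoring and
the weighting constant here are made up for illustration; this isn't MOSES's
actual internals):

    # Parsimony pressure: fitness trades accuracy off against program size.

    PARSIMONY = 0.01    # penalty per node; an arbitrary illustrative value

    def tree_size(node):
        if isinstance(node, tuple):
            return 1 + sum(tree_size(child) for child in node[1:])
        return 1

    def fitness(program_tree, accuracy):
        """accuracy in [0, 1]; program_tree is a nested tuple of nodes."""
        return accuracy - PARSIMONY * tree_size(program_tree)

    small = ("and", "x", "y")                        # 3 nodes
    big   = ("or", ("and", "x", "y"), ("not", "y"))  # 6 nodes
    print(fitness(small, 0.95), fitness(big, 0.96))  # the smaller tree wins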

ben


On Nov 8, 2007 12:21 PM, Edward W. Porter [EMAIL PROTECTED] wrote:


BEN However, the current form of AIXI-related math theory gives zero
guidance regarding how to make  a practical AGI.
ED Legg's Solomonoff Induction paper did suggest some down and dirty
hacks, such as Occam's razor.  It would seem a Novamente-class machine
could do a quick backward chaining of preconditions and their
probabilities to guesstimate probabilities.  That would be a rough function
of a complexity measure.  But actually it would be something much better,
because it would be concerned not only with the complexity of elements
and/or sub-events and their relationships but also with their probabilities
and those of their relationships.

Edward W. Porter
Porter  Associates


24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]





-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:52 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?



BEN [referring to Vlad's statement about AIXI's
uncomputability] Now now, it doesn't require infinite resources -- the
AIXItl variant of AIXI only requires an insanely massive amount of
resources, more than would be feasible in the physical universe, but not
an infinite amount ;-) 

ED So, from a practical standpoint, which is all I really care about,
is it a dead end?


Dead end would be too strong IMO, though others might disagree.

However, the current form of AIXI-related math theory gives zero guidance
regarding how to make  a practical AGI.  To get practical guidance out of
that theory would require some additional, extremely profound math
breakthroughs, radically different in character from the theory as it
exists right now.  This could happen.  I'm not counting on it, and I've
decided not to spend time working on it personally, as fascinating as the
subject area is to me.




Also, do you, or anybody, know if Solmononoff (the only way I can
remember the name is Soul man on off like Otis Redding with a microphone
problem) Induction has the ability to deal with deep forms of non-literal
similarity matching in its complexity calculations.  And if so, how?  And if
not, isn't it brain dead?  And if it is brain dead, why is such a bright
guy as Shane Legg spending his time on it.


Solomonoff induction is mentally all-powerful.  But it requires an
infinite amount of computational resources to achieve this ubermentality.

-- Ben G






Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:

 Jeff,

 In your below flame you spent much more energy conveying contempt than
 knowledge.

I'll readily apologize again for the ineffectiveness of my
presentation, but I meant no contempt.


 Since I don't have time to respond to all of your attacks,

Not attacks, but (overly) terse pointers to areas highlighting
difficulty in understanding the problem due to difficulty framing the
question.


 MY PRIOR POST ...affect the event's probability...

 JEF'S PUT DOWN 1 More coherently, you might restate this as ...reflect
 the event's likelihood...

I (ineffectively) tried to highlight a thread of epistemic confusion
involving an abstract observer interacting with and learning from its
environment.  In your paragraph, I find it nearly impossible to find a
valid base from which to suggest improvements.  If I had acted more
wisely, I would have tried first to establish common ground
**outside** your statements and touched lightly and more
constructively on one or two points.


 MY COMMENT At Dragon System, then one of the world's leading speech
 recognition companies, I was repeatedly told by our in-house PhD in
 statistics that likelihood is the measure of a hypothesis matching, or
 being supported by, evidence.  Dragon selected speech recognition word
 candidates based on the likelihood that the probability distribution of
 their model matched the acoustic evidence provided by an event, i.e., a
 spoken utterance.

If you said Dragon selected word candidates based on their probability
distribution relative to the likelihood function supported by the
evidence provided by acoustic events I'd be with you there.  As it is,
when you say based on the likelihood that the probability... it
seems you are confusing the subjective with the objective and, for me,
meaning goes out the door.


 MY PRIOR POST ...the descriptive length of sensations we receive...

 JEF'S PUT DOWN 2 Who is this we that receives sensations?  Holy
 homunculus, Batman, seems we have a bit of qualia confusion thrown into the
 mix!

 MY COMMENT Again I did not know that I would be attacked for using such
 a common English usage as we on this list.  Am I to assume that you, Jef,
 never use the words we or I because you are surrounded by friends so
 kind as to rudely say Holy homunculus, Batman every time you do.

Well, I meant to impart a humorous tone, rather than to be rude, but
again I offer my apology; I really should have known it wouldn't be
effective.

I highlighted this phrasing, not for the colloquial use of we, but
because it again demonstrates epistemic confusion impeding
comprehension of a machine intelligence interacting with (and learning
from) its environment.  To conceptualize any such system as
receiving sensation as opposed to expressing sensation, for
example, is wrong in systems-theoretic terms of stimulus, process,
response.  And this confusion, it seems to me, maps onto your
expressed difficulty grasping the significance of Solomonoff
induction.


 Or, just perhaps, are you a little more normal than that.

 In addition, the use of the word we or even I does not necessary imply a
 homunculus.  I think most modern understanding of the brain indicates that
 human consciousness is most probably -- although richly interconnected -- a
 distributed computation that does not require a homunculus.  I like and
 often use Bernard Baars' Theater of Consciousness metaphor.

Yikes!  Well, that goes to my point.  Any kind of Cartesian theater in
the mind, silent audience and all -- never mind the experimental
evidence for gaps, distortions, fabrications, confabulations in the
story putatively shown --  has no functional purpose.  In
systems-theoretical terms, this would entail an additional processing
step of extracting relevant information from the essentially whole
content of the theater which is not only unnecessary but intractable.
 The system interacts with 'reality' without the need to interpret it.


 But none of this means it is improper to use the words we or I when
 referring to ourselves or our consciousnesses.

I'm sincerely sorry to offend you.  It takes even more time to attempt
to repair, it impairs future relations, and clearly it didn't convey
any useful understanding -- evidenced by your perception that I was
criticizing your use of English.



 And I think one should be allowed to use the word sensation without being
 accused of qualia confusion.  Jeff, do you ever use the word sensation,
 or would that be too confusing for you?

Sensation is a perfectly good word and concept.  My point is that
sensation is never received by any system, that it smacks of qualia
confusion, and that such a misconception gets in the way of
understanding how a machine intelligence might deal with sensation
in practice.


 So, Jeff, if Solomonoff induction is really a concept that can help me get a
 more coherent model of reality, I would really appreciate someone who had
 the understanding, 

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:

 In my attempt to respond quickly I did not intended to attack him or
 his paper

Edward -

I never thought you were attacking me.

I certainly did attack some of your statements, but I never attacked you.

It's not my paper, just one that I recommended to the group as
relevant and worthwhile.

- Jef



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Lukasz Stafiniak
On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:



 HOW VALUABLE IS SOLMONONOFF INDUCTION FOR REAL WORLD AGI?

I will use the opportunity to advertise my equation extraction from
the Marcus Hutter UAI book.
And there is a section at the end about Juergen Schmidhuber's ideas,
from the older AGI'06 book. (Sorry biblio not generated yet.)

http://www.ii.uni.wroc.pl/~lukstafi/pmwiki/uploads/AGI/UAI.pdf



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
memory of such states.  A significant percentage of these viewers can all
respond at once in their own way to what is in the spotlight of
consciousness, and there is a mechanism for rapidly switching the
spotlight in response to audience reactions, reactions which can include
millions of dynamic dimensions.

With regard to your statement that “The system interacts with 'reality'
without the need to interpret it”: that sounds even more mind-denying
than Skinner's behaviorism.  At least Skinner showed enough respect for the
mind to honor it with a black box.  I guess we are to believe that
perception, cognition, planning, and understanding happen without any
interpretation.  They are all just direct lookup.

Even Kolmogorov and Solomonoff at least accord it the honor of multiple
programs, and ones that can be quite complex at that, complex enough to
even do “interpretation.”

Ed Porter


Edward W. Porter
Porter  Associates
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]




RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Lukasz Stafiniak wrote in part on Thu 11/8/2007 11:54 AM


LUKASZ ## I think the main point is: Bayesian reasoning is about
conditional distributions, and Solomonoff / Hutter's work is about
conditional complexities. (Although directly taking conditional Kolmogorov
complexity didn't work, there is a paragraph about this in Hutter's
book.)

ED ## what is the value or advantage of conditional complexities
relative to conditional probabilities?

When you build a posterior over TMs from all that vision data using the
universal prior, you are looking for the simplest cause; you get the
probability of similar things, because similar things can be simply
transformed into the thing in question; moreover, you get it summed with
the probability of things that are similar in the induced model space.

ED ## What’s a TM?

Also are you saying that the system would develop programs for matching
patterns, and then patterns for modifying those patterns, etc., so that
similar patterns would be matched by programs that called a routine for a
common pattern, but then other patterns to modify them to fit different
perceptions?

So are the programs just used for computing Kolmogorov complexity or are
they also used for generating and matching patterns.

Does it require that the programs exactly match a current pattern being
received, or does it know when a match is good enough that it can be
relied upon as having some significance?

Can the programs learn that similar but different patterns are different
views of the same thing?

Can they learn a generalizational and compositional hierarchy of patterns?

Can they run on massively parallel processing.

Hutter's expectimax tree appears to alternate levels of selection and
evaluation.   Can the Expectimax tree run in reverse and in parallel, with
information coming up from low sensory levels, and then being selected
based on their relative probability, and then having the selected lower
level patterns being fed as inputs into higher level patterns and then
repeating that process.  That would be a hierarchy that alternates
matching and then selecting the best scoring match at alternate levels of
the hierarchy as is shown in the Serre article I have cited so many times
before on this list.


LUKASZ ## You scared me... Check again, it's like in Solomon the
king.
ED## Thanks for the correction.  After I sent the email I realized
the mistake, but I was too stupid to parse it as “Solomon-(the King)-off”.
I was stuck in thinking of “Sol-om-on-on-off”, which is hard to remember
and that is why I confused it.  “Solomon-(the King)-off” is much easier to
remember. I have always been really bad at names, foreign languages, and
particularly spelling.

LUKASZ ## Yes, it is all about non-literal similarity matching, like
you said in a later post, finding a library that makes for very short codes
for a class of similar things.

ED## Are these short codes sort of like Wolfram's little codelettes,
that can hopefully represent complex patterns out of very little code, or
do they pretty much represent subsets of visual patterns as small bit
maps?

Ed Porter



-Original Message-
From: Lukasz Stafiniak [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:54 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter [EMAIL PROTECTED] wrote:



 VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID

 VLADIMIR Hutter shows that prior can be selected rather arbitrarily
 without giving up too much

BTW: There is a point in Hutter's book that I don't fully understand: the
belief contamination theorem. Is the contamination reintroduced at each
cycle in this theorem? (The only way it makes sense.)

 (However, I have read that for complex probability distributions the
 choice of the class of mathematical model you use to model the
 distribution is part of the prior choosing issue, and can be important
 — but that did not seem to be addressed in the Solomonoff Induction
 paper.  For example in some speech recognition each of the each speech
 frame model has a pre-selected number of dimensions, such as FFT bins
 (or related signal processing derivatives), and each dimension is not
 represented by a Gausian but rather by a basis function comprised of a
 set of a selected number of Gausians.)

Yes. The choice of Solomonoff and Hutter is to take a distribution over
all computable things.

 It seems to me that when you don't have much frequency data, we humans
 normally make a guess based on the probability of similar things, as
 suggested in the Kemp paper I cited.It seems to me that is by far
the
 most commonsensical approach.  In fact, due to the virtual
 omnipreseance of non-literal similarity in everything we see and hear,
 (e.g., the same face virtually never hits V1 exactly the same) most of
 our probabilistic thinking is dominated by similarity derived
 probabilities.

I think