Re: AIXItl; Wolfram's hypothesis (was Re: [agi] How valuable is Solmononoff Induction for real world AGI?)

2007-11-10 Thread Lukasz Stafiniak
On Nov 10, 2007 4:47 PM, Tim Freeman <[EMAIL PROTECTED]> wrote:
> From: "Lukasz Stafiniak" <[EMAIL PROTECTED]>
> >The programs are generally required to exactly match in AIXI (but not
> >in AIXItl I think).
>
> I'm pretty sure AIXItl wants an exact match too.  There isn't anything
> there that lets the theoretical AI guess probability distributions and
> then get scored based on how probable the actual world is according to
> that distribution -- each hypothesis is either right or wrong, and
> wrong hypotheses are discarded.
>
I agree that I misinterpreted the meaning of "exact match".
AIXItl uses strategies whose outputs do not need to agree with history.



AIXItl; Wolfram's hypothesis (was Re: [agi] How valuable is Solmononoff Induction for real world AGI?)

2007-11-10 Thread Tim Freeman
From: "Lukasz Stafiniak" <[EMAIL PROTECTED]>
>The programs are generally required to exactly match in AIXI (but not
>in AIXItl I think).

I'm pretty sure AIXItl wants an exact match too.  There isn't anything
there that lets the theoretical AI guess probability distributions and
then get scored based on how probable the actual world is according to
that distribution -- each hypothesis is either right or wrong, and
wrong hypotheses are discarded.

The reference I use for AIXItl is:

http://www.hutter1.net/ai/aixigentle.htm

On Nov 9, 2007 5:26 AM, Edward W. Porter <[EMAIL PROTECTED]> wrote:
> are these short codes sort of like Wolfram little codelettes,
> that can hopefully represent complex patterns out of very little code, or do
> they pretty much represent subsets of visual patterns as small bit maps.

From: "Lukasz Stafiniak" <[EMAIL PROTECTED]>
>It depends on reality, whether the reality supports Wolfram's hypothesis.

I'm guessing you mean the "Principle of Computational Equivalence",
as defined at:

   http://mathworld.wolfram.com/PrincipleofComputationalEquivalence.html

He's saying that 'systems found in the natural world can perform
computations up to a maximal ("universal") level of computational
power'.  All the AIXI family needs to be near-optimal is for the
probability distribution of possible outcomes to be computable.  I
couldn't quickly tell whether Wolfram is saying that the actual
outcomes are computable, or just the probabilities of the outcomes.
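
For reference, the computability requirement can be made concrete with
Hutter's universal mixture (the standard formulation, stated here for
convenience rather than quoted from the paper): over the class M of
environments nu with lower semi-computable measures,

    \xi(x_{1:n}) = \sum_{\nu \in M} 2^{-K(\nu)} \nu(x_{1:n}),

and the near-optimality results hold relative to any true environment mu
that lies in M -- that is, any mu whose probabilities are computable; the
individual outcome sequences themselves need not be.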

-- 
Tim Freeman   http://www.fungible.com   [EMAIL PROTECTED]



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Edward W. Porter
Jef,

The following is a direct cut and paste from your Fri 11/9/2007 2:46 PM
post to which I was responding in my Fri 11/9/2007 5:26 PM post, which you
unjustly flame in the email below.

==start of cut and paste from Jef’s email===
> > MY COMMENT>>>> At Dragon System, then one of the world's leading
> > speech recognition companies, I was repeatedly told by our in-house
> > PhD in statistics that "likelihood" is the measure of a hypothesis
> > matching, or being supported by, evidence.  Dragon selected speech
> > recognition word candidates based on the likelihood that the
> > probability distribution of their model matched the acoustic
> > evidence provided by an event, i.e., a spoken utterance.
>
> If you said Dragon selected word candidates based on their probability
> distribution relative to the likelihood function supported by the
> evidence provided by acoustic events I'd be with you there.  As it is,
> when you say "based on the likelihood that the probability..." it
> seems you are confusing the subjective with the objective and, for me,
> meaning goes out the door.


Edward,  can you explain what you might have meant by "based on the
likelihood that the probability..."?
==end of cut and paste from Jef’s email===

The last sentence in this cut and paste is the one you have just denied
making in the below email, word for word, character for character.  This
stuff is on the list so other people can check it.

So if you are going to make such obvious and easily provable falsehoods
and go into a tizzy again, I guess there really is no point continuing
this conversation.

Ed Porter

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Friday, November 09, 2007 5:42 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/9/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> >JEF ##> Edward,  can you explain what you might have meant by
> >"based on
> the likelihood that the probability..."?
>
> ED ##> I think my statement --  "Dragon selected speech
> recognition word candidates based on the likelihood that the
> probability distribution of their model matched the acoustic evidence"
> -- maps directly into your statement that -- "likelihood is simply the
> probability of some data."

How bizarre. Clearly that's not what I said, and I won't waste any more
time on this.





Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Jef Allbright
On 11/9/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> >JEF ##> Edward,  can you explain what you might have meant by "based on
> the likelihood that the probability..."?
>
> ED ##> I think my statement --  "Dragon selected speech recognition word
> candidates based on the likelihood that the probability distribution of
> their model matched the acoustic evidence" -- maps directly into your
> statement that -- "likelihood is simply the probability of some data."

How bizarre. Clearly that's not what I said, and I won't waste any
more time on this.





RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Edward W. Porter
Jeff,

(to make it easier to know who is responding to whom, if any of this is
cut into postings by others I have inserted a “>” before “JEF ##>” to
indicate his comments occurred first in time.)

>JEF ##> Edward,  can you explain what you might have meant by "based
on the likelihood that the probability..."?

ED ##> I think my statement --  “Dragon selected speech recognition
word candidates based on the likelihood that the probability distribution
of their model matched the acoustic evidence” -- maps directly into your
statement that -- “likelihood is simply the probability of some data.”

The "probability of some data" given a model’s probability distribution
can, I think, be properly considered a match between the distribution of
the data and the distribution of the model.  Maybe in Jef-speak that is
not a proper use of the word “match”, but I think in normal parlance, even
in computer science it is.  Correct me if I am wrong.

Remember at Dragon we were scoring multiple different word models’
probability distributions against the acoustic data, and those scores were
considered to indicate a degree of match between the model and the data.
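
A minimal sketch of that kind of scoring (a hypothetical diagonal-Gaussian
word model, not Dragon's actual system):

import math

def log_likelihood(frames, model):
    # Log-probability of the observed acoustic frames under one word model;
    # each model is a list of (mean, variance) pairs, one per frame position.
    total = 0.0
    for x, (mu, var) in zip(frames, model):
        total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total

def best_word(frames, word_models):
    # Pick the candidate whose model assigns the data the highest likelihood.
    return max(word_models, key=lambda w: log_likelihood(frames, word_models[w]))

# Toy usage: two one-dimensional "word" models, three observed frames.
models = {"yes": [(1.0, 0.5), (1.2, 0.5), (0.9, 0.5)],
          "no":  [(-1.0, 0.5), (-0.8, 0.5), (-1.1, 0.5)]}
print(best_word([0.9, 1.1, 1.0], models))   # -> "yes"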

>Jef ##> "’Given all the relevant parameters’ is key, and implies
objectivity.  Without all the relevant parameters of the likelihood
function, you are left with probability, which is inherently subjective.
When you said "based on the likelihood that the probability", it seemed
that you were somehow (?) confusing the subjective with the objective,
which in my opinion, is a theme running through this entire thread.”

ED ##> According to the above statement all likelihood functions that
are computable are subjective, and thus according to your definition just
“probabilities.”  This is because it is impossible for a computable
likelihood function to include all possibly relevant parameters.  No
computable system knows enough about the world to know what the relevant
parameters are.  There always could be an, as yet, un-modeled glitch in
the Matrix.  Thus, the confusion you imply I committed -- mistaking the
correct definition of likelihood, which would have it be “objective”, for
one that was subjective (because it did not use all relevant parameters) --
would seem to be a crime committed by any person who has ever talked about
actual likelihood calculations (which would include a majority of the
people in the field).

Again my offense seems to be using words as most in the field do, rather
than in strict adherence to Jef-speak.

>Jef ##> How does this map onto your difficulty grasping the
significance of Solomonoff induction? Solomonoff induction is an idealized
description of learning by a subjective agent interacting with an
"objective"  (actually "consistent" might be more accurate here) reality.

ED ##>  Finally, I am learning what our whole back and forth has been
about.  I wish our correspondence had included more sentences like this
earlier on.

But if I am guilty of using likelihoods in a way that sullies them by
making them subjective, how does that make them any worse than Solomonoff
induction?  According to the above, isn’t it guilty of the same lack of
purity, because it is describing learning by a “subjective” agent?

Or are you claiming Solomonoff induction is an objective description
of a subjective thing?  Words are often stretched so far (although I
thought not in Jef-speak).

But if Solomonoff induction is based on generalizations assuming knowledge
about things it can never know, how is it any less “subjective” than
a likelihood function calculated without all relevant parameters?  Does
pretending we know everything about reality make our understanding of it
any less subjective?

Pretending can allow some useful thought experiments, but are they
objective?

Are mathematical proofs objective?  How do we know they are based on all
the relevant parameters?

Isn't math just a creation in our heads, and thus subjective?  Yes,
scientific evidence suggests it describes some real things in the real
world, really well, but that is all based on sensation, and that,
according to you, is subjective.


Ed Porter

P.S. Since your hobby is collecting paradoxes, if you have a few that are
either particularly insightful or amusing (and hopefully only a sentence
or two long), please feel free to share.

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Friday, November 09, 2007 2:46 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> ED >> Most importantly you say my alleged confusion between
> subjective and objective maps into my difficulty to grasp the
> significance of Solomonoff induction. If you could do so, please
> explain what you mean.

Given our signific

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Jef Allbright
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> ED >> Most importantly you say my alleged confusion between
> subjective and objective maps into my difficulty to grasp the significance
> of Solomonoff induction. If you could do so, please explain what you mean.

Given our significantly disjoint backgrounds, the best I hoped for was
to point out where you're not going to get a good answer because
you're not asking a good question.  In contrast to my "unfriendly"
negative approach, there's plenty of positive literature available on
the web, and the more you poke at, the more it'll tend to fit into
place.


> I would really like to better understand why so many smart people seem to
> think it's the bee's knees.

I wouldn't call it the "bee's knees" because it doesn't actually tell
us how to build a practical machine intelligence, but it does tell us
about the nature of the problem, and practical efforts must be
consistent with this theory.

This discussion seems very similar to a previous futile discussion
over the significance of the Principle of Indifference to
probabilistic inference.  When will I learn?  ;-)


> You say "'sensation is never received' by any system" and yet the word is
> commonly used to describe information received by the brain from sensory
> organs, not just in common parlance but also in brain science literature.

The sensory organs may act to transduce, filter, encode and
participate in the transfer of stimuli, but stimulus is not sensation.

>  I don't think your strictly limited usage is the most common.

That's my frequent burden, that what I find most interesting tends to
be the least common, and thus generally misunderstood.  But that's why
I find it interesting.  As a hobby, I collect paradoxes.





> > MY COMMENT At Dragon System, then one of the world's leading
> > speech recognition companies, I was repeatedly told by our in-house
> > PhD in statistics that "likelihood" is the measure of a hypothesis
> > matching, or being supported by, evidence.  Dragon selected speech
> > recognition word candidates based on the likelihood that the
> > probability distribution of their model matched the acoustic evidence
> > provided by an event, i.e., a spoken utterance.
>
> If you said Dragon selected word candidates based on their probability
> distribution relative to the likelihood function supported by the evidence
> provided by acoustic events I'd be with you there.  As it is, when you say
> "based on the likelihood that the probability..." it seems you are confusing
> the subjective with the objective and, for me, meaning goes out the door.


Edward,  can you explain what you might have meant by "based on the
likelihood that the probability..."?


To expand on my previous response, likelihood is simply the
probability of some data, given all the relevant parameters.  "Given
all the relevant parameters" is key, and implies objectivity.  Without
all the relevant parameters of the likelihood function, you are left
with probability, which is inherently subjective.  When you said
"based on the likelihood that the probability", it seemed that you
were somehow (?) confusing the subjective with the objective, which in
my opinion, is a theme running through this entire thread.
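
In standard notation the distinction at issue is only a matter of which
argument is held fixed: for data x and parameters theta,

    L(\theta \mid x) = p(x \mid \theta),

read as a function of theta with x fixed when called a likelihood, and as
a function of x with theta fixed when called a probability (the textbook
convention, stated for reference).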

How does this map onto your difficulty grasping the significance of
Solomonoff induction?
Solomonoff induction is an idealized description of learning by a
subjective agent interacting with an "objective"  (actually
"consistent" might be more accurate here) reality.

- Jef



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Lukasz Stafiniak
On Nov 9, 2007 5:26 AM, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
> So are the programs just used for computing Kolmogorov complexity or are
> they also used for generating and matching patterns.

The programs do not compute K complexity, they (their length) _are_ (a
variant of) Kolmogorov complexity. The programs compute (predict) the
environment.
>
> Does it require that the programs exactly match a current pattern being
> received, or does it know when a match is good enough that it can be relied
> upon as having some significance?
>
The programs are generally required to exactly match in AIXI (but not
in AIXItl I think). But the "significance" is provided by the
compression on representation of similar things, which favors the same
sort of similarity in the future.

> Can they run on massively parallel processing.

I think they can... In AIXI, you would build a summation tree for the
posterior probability.
>
> Hutter's expectimax tree appears to alternate levels of selection and
> evaluation.   Can the Expectimax tree run in reverse and in parallel, with
> information coming up from low sensory levels, and then being selected based
> on their relative probability, and then having the selected lower level
> patterns being fed as inputs into higher level patterns and then repeating
> that process.  That would be a hierarchy that alternates matching and then
> selecting the best scoring match at alternate levels of the hierarchy as is
> shown in the Serre article I have cited so many times before on this list.
>
To be optimal, the expectimax must be performed chronologically from
the end of the horizon (dynamic programming principle: close to the
end of the time horizon, you have smaller planning problems -- fewer
opportunities; from smaller solutions to smaller problems you build
bigger solutions backwards in time). But the probabilities are
conditional on all current history including "low sensory levels".

(Generally, your comment above doesn't make much sense in the AIXI context.)
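
As a sketch of that chronological structure (a toy finite-horizon
expectimax with a generic environment model rho standing in for the AIXI
mixture; illustrative only, not Hutter's notation):

def expectimax_value(history, t, horizon, actions, observations, rho, reward):
    # Value of 'history' at time t.  rho(history, a, o) is the model's
    # probability of observation o after action a; reward(o) is the reward
    # carried by o.  The recursion bottoms out at the horizon, so the
    # latest (smallest) planning problems are solved first -- the
    # dynamic-programming order described above.
    if t == horizon:
        return 0.0
    best = float("-inf")
    for a in actions:
        expected = 0.0
        for o in observations:
            p = rho(history, a, o)
            expected += p * (reward(o) + expectimax_value(
                history + [(a, o)], t + 1, horizon,
                actions, observations, rho, reward))
        best = max(best, expected)
    return best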
>
> ED##>> are these short codes sort of like Wolfram little codelettes,
> that can hopefully represent complex patterns out of very little code, or do
> they pretty much represent subsets of visual patterns as small bit maps.
>
It depends on reality, whether the reality supports Wolfram's hypothesis.

Best Regards.



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Edward W. Porter
Thank you for your reply.  I want to take some time and compare this with
the reply I got from Shane Legg and get back to you when I have more time
to think about it.

Edward W. Porter
Porter & Associates
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]



-Original Message-
From: Lukasz Stafiniak [mailto:[EMAIL PROTECTED]
Sent: Friday, November 09, 2007 7:13 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On Nov 9, 2007 5:26 AM, Edward W. Porter <[EMAIL PROTECTED]> wrote:
> ED ##>> what is the value or advantage of conditional complexities
> relative to conditional probabilities?
>
Kolmogorov complexity is "universal". For probabilities, you need to
specify the probability space and initial distribution over this space.
>
> ED ##>> What's a TM?
(Turing Machine, or a code for a universal Turing Machine = a program...)
>
> Also are you saying that the system would develop programs for
> matching patterns, and then patterns for modifying those patterns,
> etc, So that similar patterns would be matched by programs that called
> a routine for a common pattern, but then other patterns to modify them
> to fit different perceptions?
>
Yes, these programs will be compact descriptions of data when enough data
gets collected, so their (posterior) probability will grow with time. But
the most probable programs will be very cryptic, without redundancy to
make the structure evident.

> So are the programs just used for computing Kolmogorov complexity or
> are they also used for generating and matching patterns.
>
It is difficult to say: in AIXI, the direct operation is governed by the
expectimax algorithm, but the algorithm works "in future" (is derived from
the Solomonoff predictor). Hutter mentions alternative model AIXI_alt,
which models actions the same way as the environment...

> Does it require that the programs exactly match a current pattern
> being received, or does it know when a match is good enough that it
> can be relied upon as having some significance?
>
It is automatic: when you have a program with a good enough match, then
you can "parameterize" it over the difference and apply twice, thus saving
the code. Remember that the programs need to represent the whole history.

> Can the programs learn that similar but different patterns are
> different views of the same thing? Can they learn a generalizational
> and compositional hierarchy of patterns?

With an exegetic enough interpretation...

I will comment on further questions in a few hours.




Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-09 Thread Lukasz Stafiniak
On Nov 9, 2007 5:26 AM, Edward W. Porter <[EMAIL PROTECTED]> wrote:
> ED ##>> what is the value or advantage of conditional complexities
> relative to conditional probabilities?
>
Kolmogorov complexity is "universal". For probabilities, you need to
specify the probability space and initial distribution over this
space.
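
In symbols (the standard definitions, for reference): with a fixed
universal prefix Turing machine U,

    K(x) = \min \{ l(p) : U(p) = x \},

and the corresponding universal prior weights every program by its length,

    M(x) = \sum_{p \,:\, U(p) = x*} 2^{-l(p)},

so no separate probability space or prior distribution has to be chosen
beyond the choice of U, which matters only up to a constant factor (the
invariance theorem).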
>
> ED ##>> What's a TM?
(Turing Machine, or a code for a universal Turing Machine = a program...)
>
> Also are you saying that the system would develop programs for matching
> patterns, and then patterns for modifying those patterns, etc, So that
> similar patterns would be matched by programs that called a routine for a
> common pattern, but then other patterns to modify them to fit different
> perceptions?
>
Yes, these programs will be compact descriptions of data when enough
data gets collected, so their (posterior) probability will grow with
time. But the most probable programs will be very cryptic, without
redundancy to make the structure evident.

> So are the programs just used for computing Kolmogorov complexity or are
> they also used for generating and matching patterns.
>
It is difficult to say: in AIXI, the direct operation is governed by
the expectimax algorithm, but the algorithm works "in future" (is
derived from the Solomonoff predictor). Hutter mentions alternative
model AIXI_alt, which models actions the same way as the
environment...

> Does it require that the programs exactly match a current pattern being
> received, or does it know when a match is good enough that it can be relied
> upon as having some significance?
>
It is automatic: when you have a program with a good enough match,
then you can "parameterize" it over the difference and apply twice,
thus saving the code. Remember that the programs need to represent the
whole history.
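
A crude illustration of the saving meant here (hypothetical; Python source
length standing in for program length):

# Two similar strings written out in full...
literal = "00110011001100110" + "00110011001100111"

# ...versus one routine parameterized over their difference and applied twice.
def pattern(last_bit):
    return "00110011" * 2 + last_bit

shared = pattern("0") + pattern("1")
assert shared == literal
# The shared form pays for the repeated structure once; each further use
# costs only the small parameter, so the description stays short as the
# history accumulates near-repetitions.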

> Can the programs learn that similar but different patterns are different
> views of the same thing?
> Can they learn a generalizational and compositional hierarchy of patterns?

With an exegetic enough interpretation...

I will comment on further questions in a few hours.



RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Lukasz Stafiniak wrote in part on Thu 11/8/2007 11:54 AM


LUKASZ ##>> I think the main point is: Bayesian reasoning is about
conditional distributions, and Solomonoff / Hutter's work is about
conditional complexities. (Although directly taking conditional Kolmogorov
complexity didn't work, there is a paragraph about this in Hutter's
book.)

ED ##>> what is the value or advantage of conditional complexities
relative to conditional probabilities?

When you build a posterior over TMs from all that vision data using the
universal prior, you are looking for the simplest cause; you get "the
probability of similar things", because similar things can be simply
transformed into the thing in question; moreover, you get it summed with
the probability of things that are similar in the induced model space.

ED ##>> What’s a TM?

Also are you saying that the system would develop programs for matching
patterns, and then patterns for modifying those patterns, etc, So that
similar patterns would be matched by programs that called a routine for a
common pattern, but then other patterns to modify them to fit different
perceptions?

So are the programs just used for computing Kolmogorov complexity or are
they also used for generating and matching patterns.

Does it require that the programs exactly match a current pattern being
received, or does it know when a match is good enough that it can be
relied upon as having some significance?

Can the programs learn that similar but different patterns are different
views of the same thing?

Can they learn a generalizational and compositional hierarchy of patterns?

Can they run on massively parallel processing.

Hutter's expectimax tree appears to alternate levels of selection and
evaluation.   Can the Expectimax tree run in reverse and in parallel, with
information coming up from low sensory levels, and then being selected
based on their relative probability, and then having the selected lower
level patterns being fed as inputs into higher level patterns and then
repeating that process.  That would be a hierarchy that alternates
matching and then selecting the best scoring match at alternate levels of
the hierarchy as is shown in the Serre article I have cited so many times
before on this list.


LUKASZ ##>> You scared me... Check again, it's like in Solomon the
king.
ED##>> Thanks for the correction.  After I sent the email I realized
the mistake, but I was too stupid to parse it as “Solomon-(the King)-off.”
I was stuck in thinking of “Sol-om-on-on-off”, which is hard to remember
and that is why I confused it.  “Solomon-(the King)-off” is much easier to
remember. I have always been really bad at names, foreign languages, and
particularly spelling.

LUKASZ ##>> Yes, it is all about non-literal similarity matching, like
you said in later post, finding a library that makes for very short codes
for a class of similar things.

ED##>> are these short codes sort of like Wolfram little codelettes,
that can hopefully represent complex patterns out of very little code, or
do they pretty much represent subsets of visual patterns as small bit
maps.

Ed Porter



-Original Message-
From: Lukasz Stafiniak [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:54 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
>
>
> VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID
>
> VLADIMIR>>>> "Hutter shows that prior can be selected rather
> VLADIMIR>>>> arbitrarily
> without giving up too much"

BTW: There is a point in Hutter's book that I don't fully understand: the
belief contamination theorem. Is the contamination reintroduced at each
cycle in this theorem? (The only way it makes sense.)
>
> (However, I have read that for complex probability distributions the
> choice of the class of mathematical model you use to model the
> distribution is part of the prior choosing issue, and can be important
> — but that did not seem to be addressed in the Solomonoff Induction
> paper.  For example, in some speech recognition each speech
> frame model has a pre-selected number of dimensions, such as FFT bins
> (or related signal processing derivatives), and each dimension is not
> represented by a Gaussian but rather by a basis function comprised of a
> set of a selected number of Gaussians.)

Yes. The choice of Solomonoff and Hutter is to take a distribution over
all computable things.
>
> It seems to me that when you don't have much frequency data, we humans
> normally make a guess based on the probability of similar things, as
> suggested in the Kemp paper I cited.It seems to me that is by far
the
> most commonsensical approach.  In fact, 

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
ng to the mind’s
own dynamic activation state, including the short term and long term
memory of such states.  A significant percentage of these viewers can all
respond at once in their own way to what is in the spotlight of
consciousness, and there is a mechanism for rapidly switching the
spotlight in response to audience reactions, reactions which can include
millions of dynamic dimensions.

With regard to your statement that “The system interacts with 'reality'
without the need to interpret it.”  That sounds even more mind-denying
than Skinner's Behaviorism.  At least Skinner showed enough respect for the
mind to honor it with a black box.  I guess we are to believe that
perception, cognition, planning, and understanding happen without any
interpretation.  They are all just direct look up.

Even Kolmogorov and Solomonoff at least accord it the honor of multiple
programs, and ones that can be quite complex at that, complex enough to
even do “interpretation.”

Ed Porter


Edward W. Porter
Porter & Associates
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]



-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 4:22 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> Jeff,
>
> In your below flame you spent much more energy conveying contempt than
> knowledge.

I'll readily apologize again for the ineffectiveness of my presentation,
but I meant no contempt.


> Since I don't have time to respond to all of your attacks,

Not attacks, but (overly) terse pointers to areas highlighting difficulty
in understanding the problem due to difficulty framing the question.


> MY PRIOR POST>>>> "...affect the event's probability..."
>
> JEF'S PUT DOWN 1>>>>More coherently, you might restate this as
> "...reflect the event's likelihood..."

I (ineffectively) tried to highlight a thread of epistemic confusion
involving an abstract observer interacting with and learning from its
environment.  In your paragraph, I find it nearly impossible to find a
valid base from which to suggest improvements.  If I had acted more
wisely, I would have tried first to establish common ground
**outside** your statements and touched lightly and more constructively on
one or two points.


> MY COMMENT>>>> At Dragon System, then one of the world's leading
> speech recognition companies, I was repeatedly told by our in-house
> PhD in statistics that "likelihood" is the measure of a hypothesis
> matching, or being supported by, evidence.  Dragon selected speech
> recognition word candidates based on the likelihood that the
> probability distribution of their model matched the acoustic evidence
> provided by an event, i.e., a spoken utterance.

If you said Dragon selected word candidates based on their probability
distribution relative to the likelihood function supported by the evidence
provided by acoustic events I'd be with you there.  As it is, when you say
"based on the likelihood that the probability..." it seems you are
confusing the subjective with the objective and, for me, meaning goes out
the door.


> MY PRIOR POST>>>> "...the descriptive length of sensations we
> receive..."
>
> JEF'S PUT DOWN 2>>>> Who is this "we" that "receives" sensations?
> Holy homunculus, Batman, seems we have a bit of qualia confusion
> thrown into the mix!
>
> MY COMMENT>>>> Again I did not know that I would be attacked for using
> such a common English usage as "we" on this list.  Am I to assume that
> you, Jef, never use the words "we" or "I" because you are surrounded
> by "friends" so kind as to rudely say "Holy homunculus, Batman" every
> time you do.

Well, I meant to impart a humorous tone, rather than to be rude, but again
I offer my apology; I really should have known it wouldn't be effective.

I highlighted this phrasing, not for the colloquial use of "we", but
because it again demonstrates epistemic confusion impeding comprehension
of a machine intelligence interacting (and learning
from) its environment.  To conceptualize any such system as "receiving
sensation" as opposed to "expressing sensation", for example, is wrong in
systems-theoretic terms of stimulus, process, response.  And this
confusion, it seems to me, maps onto your expressed difficulty grasping
the significance of Solomonoff induction.


> Or, just perhaps, are you a little more normal than that.
>
> In addition, the use of the word "we" or even "I" does not necessary
> imply a hom

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, Jef Allbright <[EMAIL PROTECTED]> wrote:
> I'm sorry I'm not going to be able to provide much illumination for
> you at this time.  Just the few sentences of yours quoted above, while
> of a level of comprehension equal or better than average on this list,
> demonstrate epistemological incoherence to the extent I would hardly
> know where to begin.
>
> This discussion reminds me of hot rod enthusiasts arguing passionately
> about how to build the best racing car, while denigrating any
> discussion of entropy as outside the "practical."
>

You are greatly overstating the case. Entropy can be used to make
predictions about chemical reactions and help design systems. UAI so
far has yet to prove its usefulness. It is just a mathematical
formalism that is incomplete in a number of ways.

1) Doesn't treat computation as outputting to the environment, thus
can have no concept of saving energy or avoiding interference with
other systems by avoiding computation. The lack of energy saving means
it is not a valid model for solving the problem of being a
non-reversible intelligence in an energy poor environment (which
humans are and most mobile robots will be).

2) It is based on Sequential Interaction Machines, rather than
Multi-Stream Interaction Machines, which means it might lose out on
expressiveness as talked about here.

http://www.cs.brown.edu/people/pw/papers/bcj1.pdf

It is the first step on an interesting path, but it is too divorced
from what computation actually is, for me to consider it equivalent to
the entropy of AI.

 Will Pearson



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Lukasz Stafiniak
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
>
>
> HOW VALUABLE IS SOLMONONOFF INDUCTION FOR REAL WORLD AGI?
>
I will use the opportunity to advertise my "equation extraction" of
the Marcus Hutter UAI book.
And there is a section at the end about Juergen Schmidhuber's ideas,
from the older AGI'06 book. (Sorry biblio not generated yet.)

http://www.ii.uni.wroc.pl/~lukstafi/pmwiki/uploads/AGI/UAI.pdf



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> In my attempt to respond quickly I did not intend to attack him or
> his paper

Edward -

I never thought you were attacking me.

I certainly did "attack" some of your statements, but I never attacked you.

It's not my paper, just one that I recommended to the group as
relevant and worthwhile.

- Jef



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:

> Jeff,
>
> In your below flame you spent much more energy conveying contempt than
> knowledge.

I'll readily apologize again for the ineffectiveness of my
presentation, but I meant no contempt.


> Since I don't have time to respond to all of your attacks,

Not attacks, but (overly) terse pointers to areas highlighting
difficulty in understanding the problem due to difficulty framing the
question.


> MY PRIOR POST "...affect the event's probability..."
>
> JEF'S PUT DOWN 1More coherently, you might restate this as "...reflect
> the event's likelihood..."

I (ineffectively) tried to highlight a thread of epistemic confusion
involving an abstract observer interacting with and learning from its
environment.  In your paragraph, I find it nearly impossible to find a
valid base from which to suggest improvements.  If I had acted more
wisely, I would have tried first to establish common ground
**outside** your statements and touched lightly and more
constructively on one or two points.


> MY COMMENT At Dragon System, then one of the world's leading speech
> recognition companies, I was repeatedly told by our in-house PhD in
> statistics that "likelihood" is the measure of a hypothesis matching, or
> being supported by, evidence.  Dragon selected speech recognition word
> candidates based on the likelihood that the probability distribution of
> their model matched the acoustic evidence provided by an event, i.e., a
> spoken utterance.

If you said Dragon selected word candidates based on their probability
distribution relative to the likelihood function supported by the
evidence provided by acoustic events I'd be with you there.  As it is,
when you say "based on the likelihood that the probability..." it
seems you are confusing the subjective with the objective and, for me,
meaning goes out the door.


> MY PRIOR POST "...the descriptive length of sensations we receive..."
>
> JEF'S PUT DOWN 2 Who is this "we" that "receives" sensations?  Holy
> homunculus, Batman, seems we have a bit of qualia confusion thrown into the
> mix!
>
> MY COMMENT Again I did not know that I would be attacked for using such
> a common English usage as "we" on this list.  Am I to assume that you, Jef,
> never use the words "we" or "I" because you are surrounded by "friends" so
> kind as to rudely say "Holy homunculus, Batman" every time you do.

Well, I meant to impart a humorous tone, rather than to be rude, but
again I offer my apology; I really should have known it wouldn't be
effective.

I highlighted this phrasing, not for the colloquial use of "we", but
because it again demonstrates epistemic confusion impeding
comprehension of a machine intelligence interacting (and learning
from) its environment.  To conceptualize any such system as
"receiving sensation" as opposed to "expressing sensation", for
example, is wrong in systems-theoretic terms of stimulus, process,
response.  And this confusion, it seems to me, maps onto your
expressed difficulty grasping the significance of Solomonoff
induction.


> Or, just perhaps, are you a little more normal than that.
>
> In addition, the use of the word "we" or even "I" does not necessarily imply a
> homunculus.  I think most modern understanding of the brain indicates that
> human consciousness is most probably -- although richly interconnected -- a
> distributed computation that does not require a homunculus.  I like and
> often use Bernard Baars' Theater of Consciousness metaphor.

Yikes!  Well, that goes to my point.  Any kind of Cartesian theater in
the mind, silent audience and all -- never mind the experimental
evidence for gaps, distortions, fabrications, confabulations in the
story putatively shown --  has no functional purpose.  In
systems-theoretical terms, this would entail an additional processing
step of extracting relevant information from the essentially whole
content of the theater which is not only unnecessary but intractable.
 The system interacts with 'reality' without the need to interpret it.


> But none of this means it is improper to use the words "we" or "I" when
> referring to ourselves or our consciousnesses.

I'm sincerely sorry to offend you.  It takes even more time to attempt
to repair, it impairs future relations, and clearly it didn't convey
any useful understanding -- evidenced by your perception that I was
criticizing your use of English.



> And I think one should be allowed to use the word "sensation" without being
> accused of "qualia confusion."  Jeff, do you ever use the word "sensation,"
> or would that be too "confusing" for you?

"Sensation" is a perfectly good word and concept.  My point is that
sensation is never "received" by any system, that it smacks of qualia
confusion, and that such a misconception gets in the way of
understanding how a machine intelligence might deal with "sensation"
in practice.


> So, Jeff, if Solomonoff induction is really a concept that c

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Derek,



Thank you.



I think the list should be a place where people can debate and criticize
ideas, but I think such poorly reasoned and insulting flames like Jef's are
not helpful, particularly if they are driving potentially valuable
contributors like you off the list.



Luckily such flames are relatively rare.  So I think we should criticize
such flames when they do occur so there are fewer of them and try to have
tougher skin so that when they occur we don’t get too upset by them (and
so that we don’t ourselves get drawn into flame mode by tough but fair
criticisms).



I just re-read my email that sent Jef into such a tizzy.  Although it was
not hostile, it was not as tactful as it could have been.  In my attempt
to respond quickly I did not intend to attack him or his paper (other
than one apparent mis-statement in it), but rather to say it didn’t relate
to the issue I was specifically interested in.  I actually thought it was
quite an interesting article.  I wish in hindsight I had said so.  At the
end of the post that upset Jef I actually asked how the Kolmogorov
complexity measure his paper discussed might be applied to a given type of
AGI.  That was my attempt to acknowledge the importance of what the paper
dealt with.



Let's just hope going forward most people can take fair attempts to debate,
question, or attack their ideas without flipping out, and I guess we all
should spend an extra 5% more time in our posts trying to be tactful.



And let's just hope more people like you start contributing again.



Ed Porter




-Original Message-
From: Derek Zahn [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 3:05 PM
To: agi@v2.listbox.com
Subject: RE: [agi] How valuable is Solmononoff Induction for real world
AGI?


Edward,

For some reason, this list has become one of the most hostile and
poisonous discussion forums around.  I admire your determined effort to
hold substantive conversations here, and hope you continue.  Many of us
have simply given up.





RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Derek Zahn
Edward,
 
For some reason, this list has become one of the most hostile and poisonous 
discussion forums around.  I admire your determined effort to hold substantive 
conversations here, and hope you continue.  Many of us have simply given up.


RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Jeff,

In your below flame you spent much more energy conveying contempt than
knowledge.  Since I don’t have time to respond to all of your attacks, let
us, for example, just look at the last two:

MY PRIOR POST>>>> "...affect the event's probability..."

JEF’S PUT DOWN 1>>>>More coherently, you might restate this as "...reflect
the event's likelihood..."

MY COMMENT>>>> At Dragon System, then one of the world’s leading speech
recognition companies, I was repeatedly told by our in-house PhD in
statistics that “likelihood” is the measure of a hypothesis matching, or
being supported by, evidence.  Dragon selected speech recognition word
candidates based on the likelihood that the probability distribution of
their model matched the acoustic evidence provided by an event, i.e., a
spoken utterance.

Similarly, if one is drawing balls from a bag that has a distribution of
black and white balls, the color of a ball produced by a given random
drawing from the bag has a probability based on the distribution in the
bag.  But one uses the likelihood function to calculate the probability
distribution in the bag, from among multiple possible distribution
hypotheses, by how well that distribution matches data produced by events.
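
A small worked version of that bag example (illustrative numbers only):
score each hypothesized black/white mix by the probability it assigns to
the observed draws.

from math import comb

def likelihood(black_fraction, n_draws, n_black):
    # Probability of observing n_black black balls in n_draws,
    # if the bag's true fraction of black balls is black_fraction.
    p = black_fraction
    return comb(n_draws, n_black) * p ** n_black * (1 - p) ** (n_draws - n_black)

hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]      # candidate bag compositions
observed = (10, 7)                          # 10 draws, 7 came up black
scores = {h: likelihood(h, *observed) for h in hypotheses}
print(max(scores, key=scores.get))          # -> 0.7, the best-supported hypothesis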


The creation of the “event” by external reality I was referring to is much
more like the utterance of a word (and its observation) or the drawing of
ball (and the observation of its color) from a bag than a determination of
what probability distribution most likely created it.  On the otherhand,
trying to understand the complexity that gives rise to such an event would
be more like using likelihoods.

When I wrote the post you have flamed I was not worrying about trying to
be exact in my memo, because I had no idea people on this list wasted
their energy correcting such common inexact usages as switching
“probability” and “likelihood.”  But as it turns out in this case my
selection of the word “probability” rather than “likelihood” seems to be
totally correct.


MY PRIOR POST>>>> "...the descriptive length of sensations we receive..."

JEF’S PUT DOWN 2>>>> Who is this "we" that "receives" sensations?  Holy
homunculus, Batman, seems we have a bit of qualia confusion thrown into
the mix!

MY COMMENT>>>> Again I did not know that I would be attacked for using
such a common English usage as “we” on this list.  Am I to assume that
you, Jef, never use the words “we” or “I” because you are surrounded by
“friends” so kind as to rudely say “Holy homunculus, Batman” every time
you do.

Or, just perhaps, are you a little more normal than that.

In addition, the use of the word “we” or even “I” does not necessarily imply
a homunculus.  I think most modern understanding of the brain indicates
that human consciousness is most probably -- although richly
interconnected -- a distributed computation that does not require a
homunculus.  I like and often use Bernard Baars’ Theater of Consciousness
metaphor.

But none of this means it is improper to use the words “we” or “I” when
referring to ourselves or our consciousnesses.

And I think one should be allowed to use the word “sensation” without
being accused of “qualia confusion.”  Jeff, do you ever use the word
“sensation,” or would that be too “confusing” for you?


So, Jeff, if Solomonoff induction is really a concept that can help me get
a more coherent model of reality, I would really appreciate someone who
had the understanding, intelligence, and friendliness to at least try in
relatively simple words to give me pointers as to how and why it is so
important, rather than someone who picks apart every word I say with minute or
often incorrect criticisms.

A good example of the type of friendly effort I appreciate is Lukasz
Stafiniak's 11/08/07 11:54 AM post to me, which I have not had time yet to
fully understand, but which I greatly appreciate for its focused and
helpful approach.

Ed Porter

-Original Message-----
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 12:55 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
> Jef,
>
> The paper cited below is more relevant to Kolmogorov complexity than
> Solomonoff induction.  I had thought about the use of subroutines
> before I wrote my questioning critique of Solomonoff Induction.
>
> Nothing in it seems to deal with the fact that the descriptive length
> of reality's computations that create an event (the descriptive length
> that is more likely to affect the event's probability), is not
> necessarily correlated with the descriptive length of sensations we
> receive from such events.

Edward -

I'm sorry I'm not going to be able to provide much illumination for you at
this time

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Cool!


-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 12:56 PM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?




Yeah, we use Occam's razor heuristics in Novamente, and they are commonly
used throughout AI.  For instance in evolutionary program learning one
uses a "parsimony pressure" which automatically rates smaller program
trees as more fit...

ben


On Nov 8, 2007 12:21 PM, Edward W. Porter <[EMAIL PROTECTED]> wrote:


BEN>>>> However, the current form of AIXI-related math theory gives zero
guidance regarding how to make  a practical AGI.
ED>>>> Legg's Solomonoff Induction paper did suggest some down and dirty
hacks, such as Occam's razor.  It would seem a Novamente-class machine
could do a quick backward chaining of preconditions and their
probabilities to guesstimate probabilities.  That would be a rough function
of a complexity measure.  But actually it would be something much better
because it would be concerned not only with the complexity of elements
and/or sub-events and their relationships but also their probabilities
and that of their relationships.

Edward W. Porter
Porter & Associates


24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]





-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:52 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?



BEN>>>> [referring to Vlad's statement about AIXI's
uncomputability]"Now now, it doesn't require infinite resources -- the
AIXItl variant of AIXI only requires an insanely massive amount of
resources, more than would be feasible in the physical universe, but not
an infinite amount ;-) "

ED>>>> So, from a practical standpoint, which is all I really care about,
is it a dead end?


"Dead end" would be too strong IMO, though others might disagree.

However, the current form of AIXI-related math theory gives zero guidance
regarding how to make  a practical AGI.  To get practical guidance out of
that theory would require some additional, extremely profound math
breakthroughs, radically different in character from the theory as it
exists right now.  This could happen.  I'm not counting on it, and I've
decided not to spend time working on it personally, as fascinating as the
subject area is to me.




Also, do you, or anybody, know if Solmononoff (the only way I can
remember the name is "Soul man on off" like Otis Redding with a microphone
problem) Induction has the ability to deal with deep forms of non-literal
similarity matching in its complexity calculations.  And if so, how?  And if
not, isn't it brain dead?  And if it is brain dead, why is such a bright
guy as Shane Legg spending his time on it.


Solomonoff induction is mentally all-powerful.  But it requires infinitely
much computational resources to achieve this ubermentality.

-- Ben G






Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
> Jef,
>
> The paper cited below is more relevant to Kolmogorov complexity than
> Solomonoff induction.  I had thought about the use of subroutines before I
> wrote my questioning critique of Solomonoff Induction.
>
> Nothing in it seems to deal with the fact that the descriptive length of
> reality's computations that create an event (the descriptive length that
> is more likely to affect the event's probability), is not necessarily
> correlated with the descriptive length of sensations we receive from such
> events.

Edward -

I'm sorry I'm not going to be able to provide much illumination for
you at this time.  Just the few sentences of yours quoted above, while
of a level of comprehension equal or better than average on this list,
demonstrate epistemological incoherence to the extent I would hardly
know where to begin.

This discussion reminds me of hot rod enthusiasts arguing passionately
about how to build the best racing car, while denigrating any
discussion of entropy as outside the "practical."

Solomonoff induction says something fundamental about intelligence,
contributing a framework upon which more practical pieces must fit,
while saying almost nothing about the specific nature of those pieces.
 Interesting how self-referential that statement is.

"I had thought about the use of subroutines..."

You might want to consider then the theoretical relationship between
"subroutines" and recursive structures.


"...descriptive length of reality's computations..."

More coherently, you can talk only of the descriptive length of a
particular model of 'reality.'


"...that create an event..."

"Events" are very much a property of the observer, having no separate
ontological status.


"...affect the event's probability..."

More coherently, you might restate this as "...reflect the event's
likelihood..."


"...the descriptive length of sensations we receive..."

Who is this "we" that "receives" sensations?  Holy homunculus, Batman,
seems we have a bit of qualia confusion thrown into the mix!


Sorry for my glaring lack of tact.  I mean no disrespect but am short
of time.  Along the lines of the Solomonoff induction of which we
speak, it doesn't much matter where you start; as long as you keep
poking at it you'll tend to get closer to a coherent model of
'reality'.

- Jef



Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Benjamin Goertzel
Yeah, we use Occam's razor heuristics in Novamente, and they are commonly
used throughout AI.  For instance in evolutionary program learning one uses
a "parsimony pressure" which automatically rates smaller program trees as
more fit...
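
A minimal illustration of such a parsimony pressure (a hypothetical
fitness function, not Novamente's actual code): the raw score of a
candidate program tree is penalized in proportion to its size.

def tree_size(tree):
    # Count nodes in a nested-tuple program tree, e.g. ('+', 'x', ('*', 'x', 'x')).
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(tree_size(child) for child in tree[1:])

def fitness(tree, raw_score, parsimony=0.01):
    # Occam-style fitness: raw accuracy minus a pressure proportional to size.
    return raw_score - parsimony * tree_size(tree)

small = ('*', 'x', 'x')                          # x*x
big = ('+', ('*', 'x', 'x'), ('-', 'x', 'x'))    # x*x + (x - x), same behavior
print(fitness(small, 0.95), fitness(big, 0.95))  # the smaller tree scores higher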

ben

On Nov 8, 2007 12:21 PM, Edward W. Porter <[EMAIL PROTECTED]> wrote:

>  BEN>>>> However, the current form of AIXI-related math theory gives zero
> guidance regarding how to make  a practical AGI.
> ED>>>> Legg's Solomonoff Induction paper did suggest some down and dirty
> hacks, such as Occam's razor.  It would seem a Novamente-class machine could
> do a quick backward chaining of preconditions and their probabilities to
> guesstimate probabilities.  That would be a rough function of a complexity
> measure.  But actually it would be something much better because it would be
> concerned not only with the complexity of elements and/or sub-events and
> their relationships but also their probabilities and that of their
> relationships.
>
> Edward W. Porter
> Porter & Associates
> 24 String Bridge S12
> Exeter, NH 03833
> (617) 494-1722
> Fax (617) 494-1822
> [EMAIL PROTECTED]
>
>  -Original Message-
> *From:* Benjamin Goertzel [mailto:[EMAIL PROTECTED]
> *Sent:* Thursday, November 08, 2007 11:52 AM
> *To:* agi@v2.listbox.com
> *Subject:* Re: [agi] How valuable is Solmononoff Induction for real world
> AGI?
>
>   BEN>>>> [referring the Vlad's statement that about AIXI's
> > uncomputability]"Now now, it doesn't require infinite resources -- the
> > AIXItl variant of AIXI only requires an insanely massive amount of
> > resources, more than would be feasible in the physical universe, but not an
> > infinite amount ;-) "
> >
> > ED>>>> So, from a practical standpoint, which is all I really care
> > about, is it a dead end?
> >
>
> "Dead end" would be too strong IMO, though others might disagree.
>
> However, the current form of AIXI-related math theory gives zero guidance
> regarding how to make  a practical AGI.  To get practical guidance out of
> that theory would require some additional, extremely profound math
> breakthroughs, radically different in character from the theory as it exists
> right now.  This could happen.  I'm not counting on it, and I've decided not
> to spend time working on it personally, as fascinating as the subject area
> is to me.
>
>
> >  Also, do you, or anybody know, if  Solmononoff (the only way I can
> > remember the name is "Soul man on off" like Otis Redding with a microphone
> > problem) Induction has the ability to deal with deep forms of non-literal
> > similarity matching in its complexity calculations.  And if so, how?  And if
> > not, isn't it brain dead?  And if it is brain dead, why is such a bright
> > guy as Shane Legg spending his time on it.
> >
>
> Solomonoff induction is mentally all-powerful.  But it requires infinitely
> much computational resources to achieve this ubermentality.
>
> -- Ben G
> --
> This list is sponsored by AGIRI: http://www.agiri.org/email
> To unsubscribe or change your options, please go to:
> http://v2.listbox.com/member/?&;
>
> --
> This list is sponsored by AGIRI: http://www.agiri.org/email
> To unsubscribe or change your options, please go to:
> http://v2.listbox.com/member/?&;
>

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62945314-e0a234

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
BEN>>>> However, the current form of AIXI-related math theory gives zero
guidance regarding how to make  a practical AGI.
ED>>>> Legg's Solomonoff Induction paper did suggest some down and dirty
hacks, such as Occam's razor.  It would seem a Novamente-class machine
could do a quick backward chaining of preconditions and their
probabilities to guesstimate probabilities.  That would be a rough function
of a complexity measure.  But actually it would be something much better
because it would be concerned not only with the complexity of elements
and/or sub-events and their relationships but also their probabilities
and that of their relationships.
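
To make that suggestion a bit more concrete, here is a toy sketch of the
kind of backward chaining I have in mind (the precondition graph, the
independence assumption, and the numbers are all invented for illustration):

# Guess an event's probability by chaining back through its preconditions
# and multiplying their estimated probabilities (naively assuming
# independence).  More, and less likely, preconditions push the estimate
# down, so it loosely tracks a complexity measure, but weighted by how
# probable each precondition actually is.

preconditions = {'car_starts': ['battery_ok', 'fuel_present'],
                 'battery_ok': [], 'fuel_present': []}
base_prob = {'car_starts': 0.99, 'battery_ok': 0.95, 'fuel_present': 0.9}

def estimate(event):
    p = base_prob.get(event, 0.5)
    for pre in preconditions.get(event, []):
        p *= estimate(pre)
    return p

print(estimate('car_starts'))   # 0.99 * 0.95 * 0.9 ~= 0.85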

Edward W. Porter
Porter & Associates
24 String Bridge S12
Exeter, NH 03833
(617) 494-1722
Fax (617) 494-1822
[EMAIL PROTECTED]



-Original Message-
From: Benjamin Goertzel [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:52 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?



BEN>>>> [referring to Vlad's statement about AIXI's
uncomputability]"Now now, it doesn't require infinite resources -- the
AIXItl variant of AIXI only requires an insanely massive amount of
resources, more than would be feasible in the physical universe, but not
an infinite amount ;-) "

ED>>>> So, from a practical standpoint, which is all I really care about,
is it a dead end?


"Dead end" would be too strong IMO, though others might disagree.

However, the current form of AIXI-related math theory gives zero guidance
regarding how to make  a practical AGI.  To get practical guidance out of
that theory would require some additional, extremely profound math
breakthroughs, radically different in character from the theory as it
exists right now.  This could happen.  I'm not counting on it, and I've
decided not to spend time working on it personally, as fascinating as the
subject area is to me.




Also, do you, or anybody know, if  Solmononoff (the only way I can
remember the name is "Soul man on off" like Otis Redding with a microphone
problem) Induction has the ability to deal with deep forms of non-literal
similarity matching in its complexity calculations.  And if so, how?  And if
not, isn't it brain dead?  And if it is brain dead, why is such a bright
guy as Shane Legg spending his time on it.


Solomonoff induction is mentally all-powerful.  But it requires infinitely
much computational resources to achieve this ubermentality.

-- Ben G

  _

This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62930454-c803ca

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
Lukasz,

Thanks.  You have given me the best response to my questions yet.  I would
like to think about it before I respond at any length, and I have to go
off-line soon to pay the bills.

Ed Porter



-Original Message-
From: Lukasz Stafiniak [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 11:54 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
>
>
> VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID
>
> VLADIMIR>>>> "Hutter shows that prior can be selected rather arbitrarily
> without giving up too much"

BTW: There is a point in Hutter's book that I don't fully understand: the
belief contamination theorem. Is the contamination reintroduced at each
cycle in this theorem? (The only way it makes sense.)
>
> (However, I have read that for complex probability distributions the
> choice of the class of mathematical model you use to model the
> distribution is part of the prior choosing issue, and can be important
> — but that did not seem to be addressed in the Solomonoff Induction
> paper.  For example, in some speech recognition each speech
> frame model has a pre-selected number of dimensions, such as FFT bins
> (or related signal processing derivatives), and each dimension is not
> represented by a Gaussian but rather by a basis function composed of a
> selected number of Gaussians.)

Yes. The choice of Solomonoff and Hutter is to take a distribution over
all computable things.
>
> It seems to me that when you don't have much frequency data, we humans
> normally make a guess based on the probability of similar things, as
> suggested in the Kemp paper I cited.  It seems to me that is by far the
> most commonsensical approach.  In fact, due to the virtual
> omnipresence of non-literal similarity in everything we see and hear,
> (e.g., the same face virtually never hits V1 exactly the same) most of
> our probabilistic thinking is dominated by similarity derived
> probabilities.
>
I think the main point is: Bayesian reasoning is about conditional
distributions, and Solomonoff / Hutter's work is about conditional
complexities. (Although directly taking conditional Kolmogorov complexity
didn't work, there is a paragraph about this in Hutter's
book.) When you build a posterior over TMs from all that vision data using
the universal prior, you are looking for the simplest cause, you get "the
probability of similar things", because similar things can be simply
transformed into the thing under question, moreover you get it summed with
the probability of things that are similar in the induced model space.

>
> ED>>>> So, from a practical standpoint, which is all I really care
> about, is it a dead end?
>
> Also, do you, or anybody know, if  Solmononoff (the only way I can
> remember the name is "Soul man on off" like Otis Redding with a
> microphone problem)

You scared me... Check again, it's like in Solomon the king.

> Induction has the ability to deal with deep forms of non-literal
> similarity matching in its complexity calculations.  And if so, how?
> And if not, isn't it brain dead?  And if it is brain dead, why is
> such a bright guy as Shane Legg spending his time on it.
>
Yes, it is all about non-literal similarity matching: like you said in a
later post, finding a library that makes for very short codes for a class
of similar things.

OK I must post now or I'll get lost in other posts ;-)

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62925711-604ba4


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Lukasz Stafiniak
On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
>
>
> VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID
>
> VLADIMIR "Hutter shows that prior can be selected rather arbitrarily
> without giving up too much"

BTW: There is a point in Hutter's book that I don't fully understand:
the belief contamination theorem. Is the contamination reintroduced at
each cycle in this theorem? (The only way it makes sense.)
>
> (However, I have read that for complex probability distributions the choice
> of the class of mathematical model you use to model the distribution is part
> of the prior choosing issue, and can be important — but that did not seem to
> be addressed in the Solomonoff Induction paper.  For example in some speech
> recognition each speech frame model has a pre-selected number of
> dimensions, such as FFT bins (or related signal processing derivatives), and
> each dimension is not represented by a Gaussian but rather by a basis
> function composed of a selected number of Gaussians.)

Yes. The choice of Solomonoff and Hutter is to take a distribution
over all computable things.
>
> It seems to me that when you don't have much frequency data, we humans
> normally make a guess based on the probability of similar things, as
> suggested in the Kemp paper I cited.  It seems to me that is by far the
> most commonsensical approach.  In fact, due to the virtual omnipresence of
> non-literal similarity in everything we see and hear, (e.g., the same face
> virtually never hits V1 exactly the same) most of our probabilistic thinking
> is dominated by similarity derived probabilities.
>
I think the main point is: Bayesian reasoning is about conditional
distributions, and Solomonoff / Hutter's work is about conditional
complexities. (Although directly taking conditional Kolmogorov
complexity didn't work, there is a paragraph about this in Hutter's
book.) When you build a posterior over TMs from all that vision data
using the universal prior, you are looking for the simplest cause; you
get "the probability of similar things", because similar things can be
simply transformed into the thing in question; moreover, you get it
summed with the probability of things that are similar in the induced
model space.
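
For reference, the universal prior I mean can be written (roughly, glossing
over prefix-machine details) as

  M(x) = \sum_{p : U(p) = x*} 2^{-l(p)}

where U is a universal machine, the sum runs over programs p whose output
begins with x, and l(p) is the length of p in bits; predictions then come
from the ratio M(xy) / M(x), which is why short programs ("simple causes")
dominate.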

>
> ED So, from a practical standpoint, which is all I really care about, is
> it a dead end?
>
> Also, do you, or anybody know, if  Solmononoff (the only way I can remember
> the name is "Soul man on off" like Otis Redding with a microphone problem)

You scared me... Check again, it's like in Solomon the king.

> Induction has the ability to deal with deep forms of non-literal similarity
> matching in its complexity calculations.  And if so, how?  And if not, isn't
> it brain dead?  And if it is brain dead, why is such a bright guy as Shane
> Legg spending his time on it.
>
Yes, it is all about non-literal similarity matching: like you said in a
later post, finding a library that makes for very short codes for a
class of similar things.

OK I must post now or I'll get lost in other posts ;-)

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62921326-deb5ed

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Benjamin Goertzel
>
> BEN [referring to Vlad's statement about AIXI's
> uncomputability]"Now now, it doesn't require infinite resources -- the
> AIXItl variant of AIXI only requires an insanely massive amount of
> resources, more than would be feasible in the physical universe, but not an
> infinite amount ;-) "
>
> ED So, from a practical standpoint, which is all I really care about,
> is it a dead end?
>

"Dead end" would be too strong IMO, though others might disagree.

However, the current form of AIXI-related math theory gives zero guidance
regarding how to make  a practical AGI.  To get practical guidance out of
that theory would require some additional, extremely profound math
breakthroughs, radically different in character from the theory as it exists
right now.  This could happen.  I'm not counting on it, and I've decided not
to spend time working on it personally, as fascinating as the subject area
is to me.


>  Also, do you, or anybody know, if  Solmononoff (the only way I can
> remember the name is "Soul man on off" like Otis Redding with a microphone
> > problem) Induction has the ability to deal with deep forms of non-literal
> > similarity matching in its complexity calculations.  And if so, how?  And if
> > not, isn't it brain dead?  And if it is brain dead, why is such a bright
> guy as Shane Legg spending his time on it.
>

Solomonoff induction is mentally all-powerful.  But it requires infinitely
much computational resources to achieve this ubermentality.

-- Ben G

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62920482-657287

RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter

Jef,

The paper cited below is more relevant to Kolmogorov complexity than
Solomonoff induction.  I had thought about the use of subroutines before I
wrote my questioning critique of Solomonoff Induction.

Nothing in it seems to deal with the fact that the descriptive length of
reality’s computations that create an event (the descriptive length that
is more likely to affect the event’s probability), is not necessarily
correlated with the descriptive length of sensations we receive from such
events.  Nor is it clear that it deals with the fact that much of the
frequency data a world-sensing brain derives its probabilities from is
full of non-literal similarity, meaning that non-literal matching is a key
component of any capable AGI.  It does not indicate how the complexity of
that non-literal matching, at the sensation level rather than the
reality-generating level, is to be dealt with by Solomonoff Induction: is it
part of the complexity involved in its hypotheses (or semi-measures) or
not, and to what extent, if any, should it be?

With regard to the paper you cited, I disagree with its statement that the
measure of the complexity of a program written using a library should be
the size of the program plus the size of the library it uses.  Presumably
this was a mis-statement, because it would make all but the very largest
programs that used the same vast library relatively close in size,
regardless of the relative complexity of what they do.  I assume it really
should be the length of the program plus only each of the library routines
it actually uses, independent of how many times it uses them.  Anything
else would mean the measure was tracking the bulk of the shared library, or
the number of calls made, rather than the complexity of what the program
actually does.
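
A toy sketch of the counting rule I have in mind (the names and the
crude character-count measure are invented purely for illustration):

import re

# Description length of a program written against a shared library: the
# program itself plus each library routine it actually references, counted
# once, no matter how many times it is called -- not the whole library.

def description_length(program_source, library):
    used = {name for name in library
            if re.search(r'\b' + re.escape(name) + r'\b', program_source)}
    return len(program_source) + sum(len(library[name]) for name in used)

library = {'fft': 'def fft(x): ...' * 10, 'sort': 'def sort(x): ...' * 5}
prog = 'y = fft(x); z = fft(y)'   # uses fft twice, sort never
print(description_length(prog, library))   # len(prog) + len of fft only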

To make this discussion relevant to practical AGI, let's assume the program
from which Kolmogorov complexity is computed is a Novamente-class machine
up and running with world knowledge in say five to ten years.  Assume the
system has compositional and generalizational hierarchies providing it
with the representational efficiencies Jeff Hawkins describes for
hierarchical memory.

In such a system much of what determines what happens lies in its
knowledge base.  I assume the length of any knowledge base components used
would also have to be counted in the Kolmogorov complexity.

But would one only count the knowledge structures actually found to match,
or also the ones that were match candidates, but lost out, when
calculating such complexity?  Any ideas?

Ed Porter

-Original Message-
From: Jef Allbright [mailto:[EMAIL PROTECTED]
Sent: Thursday, November 08, 2007 9:56 AM
To: agi@v2.listbox.com
Subject: Re: [agi] How valuable is Solmononoff Induction for real world
AGI?


I recently found this paper to contain some thinking worthwhile to the
considerations in this thread.

<http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf>

- Jef

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?&;

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62919265-6d3337


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, YKY (Yan King Yin) <[EMAIL PROTECTED]> wrote:
>
> Thanks for the input.
>
> There's one perplexing theorem, in the paper about the algorithmic
> complexity of programming, that "the language doesn't matter that much", ie,
> the algorithmic complexity of a program in different languages only differ
> by a constant.  I've heard something similar about the choice of Turing
> machines only affect the Kolmogorov complexity by a constant.  (I'll check
> out the proof of this one later.)


This only works if the languages are Turing complete, so that they
can append a description of a program that converts from the language
in question to its native one, in front of the non-native program.

Also, constant might not mean negligible. 2^^^9 is a constant (where ^
is Knuth's up-arrow notation).
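
For what it's worth, the theorem in question (the invariance theorem) is
usually stated along the lines of

  K_U(x) <= K_V(x) + c_{UV}

where c_{UV} is the length of a program for machine U that interprets
machine V's programs, and does not depend on x -- but, as above, nothing
stops c_{UV} from being astronomically large.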

> But it seems to suggest that the choice of the AGI's KR doesn't matter.  It
> can be logic, neural network, or java?  That's kind of a strange
> conclusion...
>

Only some neural networks are Turing complete. First order logic
should be; propositional logic, not so much.

 Will Pearson

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62912284-88dadd


RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread John G. Rose
> From: Jef Allbright [mailto:[EMAIL PROTECTED]
> 
> I recently found this paper to contain some thinking worthwhile to the
> considerations in this thread.
> 
> <http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf>
> 

This is an excellent paper, not only on the subject of code reuse but also on
the techniques and tools used to tackle such a complicated issue. Code reuse
is related to code generation, which some AGIs would make use of, as would any
other type of language generation, formal or otherwise.

John

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62909084-8cc6c9


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread YKY (Yan King Yin)
Thanks for the input.

There's one perplexing theorem, in the paper about the algorithmic
complexity of programming, that "the language doesn't matter that much", i.e.,
the algorithmic complexity of a program in different languages only differs
by a constant.  I've heard something similar about the choice of Turing
machine only affecting the Kolmogorov complexity by a constant.
(I'll check out the proof of this one later.)

But it seems to suggest that the choice of the AGI's KR doesn't matter.  It
can be logic, a neural network, or Java?  That's kind of
a strange conclusion...

YKY

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62888479-02c7e4

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread William Pearson
On 08/11/2007, YKY (Yan King Yin) <[EMAIL PROTECTED]> wrote:
>
> My impression is that most machine learning theories assume a search space
> of hypotheses as a given, so it is out of their scope to compare *between*
> learning structures (eg, between logic and neural networks).
>
> Algorithmic learning theory - I don't know much about it - may be useful
> because it does not assume a priori a learning structure (except that of a
> Turing machine), but then the algorithmic complexity is incomputable.
>
> Is there any research that can tell us what kind of structures are better
> for machine learning?

Not if all problems are equi-probable.
http://en.wikipedia.org/wiki/No_free_lunch_in_search_and_optimization

However this is unlikely in the real world.

It does, however, give an important lesson: put as much information as
you have about the problem domain into the algorithm and
representation as possible if you want to be at all efficient.

This form of learning is only a very small part of what humans do when
we learn things. For example when we learn to play chess, we are told
or read the rules of chess and the winning conditions. This allows us
to create tentative learning strategies/algorithms that are much
better than random at playing the game and that also give us good
information about the game. That is how we generally deal with
combinatorial explosions.

Consider a probabilistic learning system based on statements about the
real-world TM: without this ability to alter how it learns and what it
tries, it would be looking at the probability that a bird
tweeting is correlated with its opponent winning, and also trying to
figure out whether emptying an ink well over the board is a valid
move.

I think Marcus Hutter has a bit about how slow AIXI would be at
learning chess somewhere in his writings, due to only getting a small
amount of information (1 bit?) per game about the problem domain. My
memory might be faulty and I don't have time to dig at the moment.
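
A back-of-envelope version of that point (the numbers here are my own
rough guesses, not Hutter's):

import math

# If the agent only learns the game outcome -- about log2(3) bits for
# win/lose/draw -- per game, and a compact encoding of the rules of chess
# takes on the order of a few kilobits, then it takes on the order of
# thousands of games before the observations even carry enough information
# to pin down the rules, versus being told them up front.

bits_per_game = math.log2(3)
rules_bits = 4000                  # rough guess
print(rules_bits / bits_per_game)  # ~2500 games, as a crude lower bound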

>  Or perhaps w.r.t a certain type of data?  Are there
> learning structures that will somehow "learn things faster"?

Thinking in terms of fixed learning structures is IMO a mistake.
Interestingly, AIXI doesn't have fixed learning structures per se, even
though it might appear to. Because it stores the entire history of the
agent and feeds it to each program under evaluation, each of these may
be a learning program and be able to create learning strategies from
that data. You would have to wait a long time for these types of
programs to become the most probable if a good prior was not given to
the system though.


 Will Pearson

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62882969-3d3172


RE: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Edward W. Porter
VLADIMIR NESOV IN HIS  11/07/07 10:54 PM POST SAID

VLADIMIR “Hutter shows that prior can be selected rather arbitrarily
without giving up too much”

ED Yes.  I was wondering why the Solomonoff Induction paper made such
a big stink about picking the prior (and then came up with a choice that
struck me as being quite sub-optimal in most of the types of situations
humans deal with).  After you have a lot of data, you can derive the
equivalent of the prior from frequency data.  As the Solmononoff Induction
paper showed, using Bayesian formulas the effect of the prior fades off
fairly fast as data comes in.
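
A tiny numerical illustration of that washing-out effect, using a
Beta-Binomial model with arbitrary numbers:

# Two very different Beta priors over an event's probability converge to
# nearly the same posterior mean once a few hundred observations arrive.

def posterior_mean(alpha, beta, successes, failures):
    return (alpha + successes) / (alpha + beta + successes + failures)

for n in (0, 10, 100, 1000):
    s = int(0.7 * n)                                  # observed 70% success rate
    print(n,
          round(posterior_mean(9, 1, s, n - s), 3),   # prior mean 0.9
          round(posterior_mean(1, 9, s, n - s), 3))   # prior mean 0.1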

(However, I have read that for complex probability distributions the
choice of the class of mathematical model you use to model the
distribution is part of the prior choosing issue, and can be important —
but that did not seem to be addressed in the Solomonoff Induction paper.
For example, in some speech recognition each speech frame model
has a pre-selected number of dimensions, such as FFT bins (or related
signal processing derivatives), and each dimension is not represented by a
Gaussian but rather by a basis function composed of a selected
number of Gaussians.)

It seems to me that when you don’t have much frequency data, we humans
normally make a guess based on the probability of similar things, as
suggested in the Kemp paper I cited.  It seems to me that is by far the
most commonsensical approach.  In fact, due to the virtual omnipresence
of non-literal similarity in everything we see and hear, (e.g., the same
face virtually never hits V1 exactly the same) most of our probabilistic
thinking is dominated by similarity derived probabilities.

BEN GOERTZEL WROTE IN HIS Thu 11/8/2007 6:32 AM POST

BEN [referring to Vlad's statement about AIXI's
uncomputability]“Now now, it doesn't require infinite resources -- the
AIXItl variant of AIXI only requires an insanely massive amount of
resources, more than would be feasible in the physical universe, but not
an infinite amount ;-) “

ED So, from a practical standpoint, which is all I really care about,
is it a dead end?

Also, do you, or anybody know, if  Solmononoff (the only way I can
remember the name is “Soul man on off” like Otis Redding with a microphone
problem) Induction has the ability to deal with deep forms of non-literal
similarity matching in its complexity calculations.  And if so, how?  And if
not, isn't it brain dead?  And if it is brain dead, why is such a bright
guy as Shane Legg spending his time on it.


YAN KINK YIN IN HIS 11/8/2007 9:16 AM POST SAID

YAN Is there any research that can tell us what kind of structures are
better for machine learning?  Or perhaps w.r.t a certain type of data?
Are there learning structures that will somehow "learn things faster"?

ED  Yes, brain science.  It may not point out the best possible
architecture, but it points out one that works.  Evolution is not
theoretical, and not totally optimal, but it is practical.  Systems like
Novamente, which are loosely based on many key ideas from brain science,
probably have a much better chance of getting useful stuff up and
running soon than any more theoretical approaches, because the search space
has already been narrowed by many trillions of trials and errors over
hundreds of millions of years.

Ed Porter

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62880190-75103d

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Jef Allbright
I recently found this paper to contain some thinking worthwhile to the
considerations in this thread.

<http://lcsd05.cs.tamu.edu/papers/veldhuizen.pdf>

- Jef

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62868749-8ed517


Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Benjamin Goertzel
>
>
>
> Is there any research that can tell us what kind of structures are better
> for machine learning?  Or perhaps w.r.t a certain type of data?  Are there
> learning structures that will somehow "learn things faster"?
>

There is plenty of knowledge about which learning algorithms are better for
which problem classes.

For example, there are problems known to be deceptive (not efficiently
solved) for genetic programming, yet that are known to be efficiently
solvable by MOSES, the probabilistic program learning method used in
Novamente (from Moshe Looks' PhD thesis, see metacog.org)




>
> Note that, if the answer is negative, then the choice of learning
> structures is arbitrary and we should choose the most developed / heavily
> researched ones (such as first-order logic).
>
>

The choice is not at all arbitrary; but the knowledge we have to guide the
choice is currently very incomplete.  So one has to make the right intuitive
choice based on integrating the available information.  This is part of why
AGI is hard at the current level of development of computer science.


-- Ben G

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62863431-4f3cc5

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread YKY (Yan King Yin)
My impression is that most machine learning theories assume a search space
of hypotheses as a given, so it is out of their scope to compare *between*
learning structures (eg, between logic and neural networks).

Algorithmic learning theory - I don't know much about it - may be useful
because it does not assume a priori a learning structure (except that of a
Turing machine), but then the algorithmic complexity is incomputable.

Is there any research that can tell us what kind of structures are better
for machine learning?  Or perhaps w.r.t a certain type of data?  Are there
learning structures that will somehow "learn things faster"?

Note that, if the answer is negative, then the choice of learning structures
is arbitrary and we should choose the most developed / heavily researched
ones (such as first-order logic).

YKY

-
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=62846736-33363c

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-08 Thread Benjamin Goertzel
Now now, it doesn't require infinite resources -- the AIXItl variant of AIXI
only requires an insanely massive amount of resources, more than would be
feasible in the physical universe, but not an infinite amount ;-)
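
For reference (and if I am recalling Hutter's construction correctly),
AIXItl considers all policies of length at most l, running each for at most
t steps, so each interaction cycle costs on the order of

  t * 2^l

computation steps -- finite, but already astronomical for any l large
enough to encode an interesting policy.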

I think this sort of mathematical theory of intelligence is useful in
helping to clarify some of the conceptual foundations of the theory of
general intelligence and its relationship to computation theory.

However, there is an awful lot that it can't answer, and among this mass is
the question of how to create reasonable levels of general intelligence
using feasible amounts of computational resources (like the ones available
in the actual physical universe of which we know).

-- Ben G

On Nov 7, 2007 10:53 PM, Vladimir Nesov <[EMAIL PROTECTED]> wrote:

> Edward,
>
> I think the point of Legg's and Hutter's work is in describing
> completely defined solution. So far they only have a completely
> defined solution which requires infinite resources. It can probably be
> generalized somewhere, for example Hutter shows that prior can be
> selected rather arbitrarily without giving up too much, so probably
> there's a class of feasible models that can be described in a similar
> framework.
>
> On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > HOW VALUABLE IS SOLMONONOFF INDUCTION FOR REAL WORLD AGI?
> >
> > Through some AGIRI links I stumbled on the Shane Legg's "Friendly AI is
> > bunk" page.  I was impressed.   I read more of his related web pages and
> > became very impressed.
> >
> > Then I read his paper "Solmononoff Induction."
> > (http://www.vetta.org/documents/disSol.pdf )
> >
> > It left me confused.  I don't have a sufficient math background to
> > understand all its notation, so I know I may well be wrong (I mean
> totally
> > wrong), but to me Solmononoff Induction, as described in the paper
> seemed to
> > be missing something big, at least if it were to be used for a general
> > purpose AGI that was to learn from perception in the real world.
> >
> > I understand the basic idea that if you are seeking a prior likelihood
> for
> > the occurrence of an event and you have no data about the frequency of
> its
> > occurrence -- absent any other knowledge -- some notion of the
> complexity,
> > in information-theory terms, of the event might help you make a better
> > guess.  This makes sense because reality is a big computer, and
> complexity
> > -- in terms of the number of combined events required to make reality
> cough
> > up a given event , and the complexity of the space in which those events
> are
> > to be combined -- should to some extent be related to the event's
> > probability.  I can understand how such complexity could be approximated
> by
> > the length of code required in some a theoretical Universal computer to
> > model such real world event-occurrence complexity.
> >
> > But it seems to me there are factors other than the complexity of
> > representing or computing a match against a hypothesis that --  in the
> an
> > AGI sensing and acting in a real world --  might be much more important
> and
> > functional for estimating probabilities.
> >
> > For example, although humans are not good at accurately estimating
> certain
> > types of probabilities, we have an innate ability to guess probabilities
> in
> > a context sensitive way and by using knowledge about similar yet
> different
> > things.  (As shown in the example quoted below from the Kemp paper
> below.)
> > It would seem to me that the complexity of the computation required to
> > understand what is the appropriate context is not, itself, necessarily
> > related to the complexity of an event occurring in that context once the
> > context has been determined.
> >
> > Similarly, it does not seem to me that the complexity of determining
> what is
> > similar to what, in which ways, for purpose of determining the extent to
> > which probabilities from something similar might provide an appropriate
> > prior for something never-before-seen, are not necessarily related to
> the
> > probability of the thing never-before-seen occurring, itself.  For
> example,
> > the complexity of perception is not directly related to the complexity
> of
> > occurrence.  For example, complexity of perception can be greatly
> affected
> > by changes in light, shape, and view which might not have anything to do
> > with the probability or complexity of occurrence.
> >
> > So what I am saying is, for example, that if you are receiving a
> sequence of
> > bytes from a video camera, much of the complexity in the input stream
> might
> > not be related to complexity-of-event-creation- or Occam's-razor-type
> > issues, but rather to complexity of perception, or similarity
> understanding,
> > or of appropriate context selection, factors which are not themselves
> > necessarily related to complexity of occurrence.
> >
> > Furthermore, I am saying that for an AGI it seems to me it would make
> much
> > more sense to attemp

Re: [agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-07 Thread Vladimir Nesov
Edward,

I think the point of Legg's and Hutter's work is in describing
completely defined solution. So far they only have a completely
defined solution which requires infinite resources. It can probably be
generalized somewhere, for example Hutter shows that prior can be
selected rather arbitrarily without giving up too much, so probably
there's a class of feasible models that can be described in a similar
framework.

On 11/8/07, Edward W. Porter <[EMAIL PROTECTED]> wrote:
>
>
>
> HOW VALUABLE IS SOLMONONOFF INDUCTION FOR REAL WORLD AGI?
>
> Through some AGIRI links I stumbled on the Shane Legg's "Friendly AI is
> bunk" page.  I was impressed.   I read more of his related web pages and
> became very impressed.
>
> Then I read his paper "Solmononoff Induction."
> (http://www.vetta.org/documents/disSol.pdf )
>
> It left me confused.  I don't have a sufficient math background to
> understand all its notation, so I know I may well be wrong (I mean totally
> wrong), but to me Solmononoff Induction, as described in the paper seemed to
> be missing something big, at least if it were to be used for a general
> purpose AGI that was to learn from perception in the real world.
>
> I understand the basic idea that if you are seeking a prior likelihood for
> the occurrence of an event and you have no data about the frequency of its
> occurrence -- absent any other knowledge -- some notion of the complexity,
> in information-theory terms, of the event might help you make a better
> guess.  This makes sense because reality is a big computer, and complexity
> -- in terms of the number of combined events required to make reality cough
> up a given event , and the complexity of the space in which those events are
> to be combined -- should to some extent be related to the event's
> probability.  I can understand how such complexity could be approximated by
> the length of code required in some a theoretical Universal computer to
> model such real world event-occurrence complexity.
>
> But it seems to me there are factors other than the complexity of
> representing or computing a match against a hypothesis that --  in the an
> AGI sensing and acting in a real world --  might be much more important and
> functional for estimating probabilities.
>
> For example, although humans are not good at accurately estimating certain
> types of probabilities, we have an innate ability to guess probabilities in
> a context sensitive way and by using knowledge about similar yet different
> things.  (As shown in the example quoted below from the Kemp paper below.)
> It would seem to me that the complexity of the computation required to
> understand what is the appropriate context is not, itself, necessarily
> related to the complexity of an event occurring in that context once the
> context has been determined.
>
> Similarly, it does not seem to me that the complexity of determining what is
> similar to what, in which ways, for purpose of determining the extent to
> which probabilities from something similar might provide an appropriate
> prior for something never-before-seen, are not necessarily related to the
> probability of the thing never-before-seen occurring, itself.  For example,
> the complexity of perception is not directly related to the complexity of
> occurrence.  For example, complexity of perception can be greatly affected
> by changes in light, shape, and view which might not have anything to do
> with the probability or complexity of occurrence.
>
> So what I am saying is, for example, that if you are receiving a sequence of
> bytes from a video camera, much of the complexity in the input stream might
> not be related to complexity-of-event-creation- or Occam's-razor-type
> issues, but rather to complexity of perception, or similarity understanding,
> or of appropriate context selection, factors which are not themselves
> necessarily related to complexity of occurrence.
>
> Furthermore, I am saying that for an AGI it seems to me it would make much
> more sense to attempt to derive priors from notions of similarity, of
> probabilities of similar things, events, and contexts, and from things like
> causal models for similar or generalized classes.  There is usually much
> from reality that we do know that we can, and do, use when learning about
> things we don't know.
>
> A very good paper on this subject is one by Charles Kemp et al., "Learning
> overhypotheses with hierarchical Bayesian models" at
> http://www.mit.edu/~perfors/KempPTDevSci.pdf   It give a
> very good example of the power of this type of reasoning -- power that it
> appears to me Solomonoff Induction totally lacks
>
>
> "participants were asked to imagine that they were exploring an island in
> the Southeastern Pacific, that they had encountered a single member of the
> Barratos tribe, and that this tribesman was brown and obese. Based on this
> single example, participants concluded that most Barratos were brown, but
> gave a much lower estimate of the pro

[agi] How valuable is Solmononoff Induction for real world AGI?

2007-11-07 Thread Edward W. Porter
HOW VALUABLE IS SOLMONONOFF INDUCTION FOR REAL WORLD AGI?

Through some AGIRI links I stumbled on Shane Legg's "Friendly AI is
bunk" page.  I was impressed.   I read more of his related web pages and
became very impressed.

Then I read his paper "Solmononoff Induction."
(http://www.vetta.org/documents/disSol.pdf )

It left me confused.  I don't have a sufficient math background to
understand all its notation, so I know I may well be wrong (I mean totally
wrong), but to me Solmononoff Induction, as described in the paper seemed
to be missing something big, at least if it were to be used for a general
purpose AGI that was to learn from perception in the real world.

I understand the basic idea that if you are seeking a prior likelihood for
the occurrence of an event and you have no data about the frequency of its
occurrence -- absent any other knowledge -- some notion of the complexity,
in information-theory terms, of the event might help you make a better
guess.  This makes sense because reality is a big computer, and complexity
-- in terms of the number of combined events required to make reality
cough up a given event, and the complexity of the space in which those
events are to be combined -- should to some extent be related to the
event's probability.  I can understand how such complexity could be
approximated by the length of code required in a theoretical
universal computer to model such real-world event-occurrence complexity.
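
As I understand it, the standard way of writing that measure is the
Kolmogorov complexity

  K(x) = \min { l(p) : U(p) = x }

the length of the shortest program p that makes a fixed universal computer
U output x; Solomonoff's prior then gives an event a probability dominated
by 2^{-K(x)}.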

But it seems to me there are factors other than the complexity of
representing or computing a match against a hypothesis that -- for an
AGI sensing and acting in the real world -- might be much more important
and functional for estimating probabilities.

For example, although humans are not good at accurately estimating certain
types of probabilities, we have an innate ability to guess probabilities
in a context-sensitive way and by using knowledge about similar yet
different things.  (As shown in the example quoted below from the Kemp
paper.)  It would seem to me that the complexity of the computation
required to understand what is the appropriate context is not, itself,
necessarily related to the complexity of an event occurring in that
context once the context has been determined.

Similarly, the complexity of determining what is similar to what, and in
which ways, for the purpose of deciding the extent to which probabilities
from something similar might provide an appropriate prior for something
never-before-seen, is not necessarily related to the probability of the
never-before-seen thing itself occurring.  The complexity of perception is
not directly related to the complexity of occurrence.  For example, the
complexity of perception can be greatly affected by changes in light,
shape, and view which might not have anything to do with the probability
or complexity of occurrence.

So what I am saying is, for example, that if you are receiving a sequence
of bytes from a video camera, much of the complexity in the input stream
might not be related to complexity-of-event-creation- or
Occam's-razor-type issues, but rather to complexity of perception, or
similarity understanding, or of appropriate context selection, factors
which are not themselves necessarily related to complexity of occurrence.

Furthermore, I am saying that for an AGI it seems to me it would make much
more sense to attempt to derive priors from notions of similarity, of
probabilities of similar things, events, and contexts, and from things
like causal models for similar or generalized classes.  There is usually
much from reality that we do know that we can, and do, use when learning
about things we don't know.

A very good paper on this subject is one by Charles Kemp et al., "Learning
overhypotheses with hierarchical Bayesian models" at
http://www.mit.edu/~perfors/KempPTDevSci.pdf .  It gives a very good
example of the power of this type of reasoning -- power that it appears to
me Solomonoff Induction totally lacks.

"participants were asked to imagine that they were
exploring an island in the Southeastern Pacific, that they had encountered
a single member of the Barratos tribe, and that this tribesman was brown
and obese. Based on this single example, participants concluded that most
Barratos were brown, but gave a much lower estimate of the proportion of
obese Barratos (Figure 4). When asked to justify their responses,
participants often said that tribespeople were homogeneous with respect to
color but heterogeneous with respect to body weight (Nisbett et al.,
1983)."

Perhaps I am totally missing something, which is very possible, and if so,
I would like to have it pointed out to me, but I think the type of
overhypothesis reasoning described in this Kemp paper is a much more
powerful and useful source for deriving priors to be used in Bayesian
reasoning in AGIs that are expected to learn in the "real world" than
Solomonoff Induction.
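
For concreteness, here is a toy numerical sketch of that overhypothesis
effect (a crude Beta-Binomial caricature with made-up numbers, nothing like
Kemp's actual model):

# Experience with earlier tribes sets how strongly one observation
# generalizes.  Color: earlier tribes were each nearly all one color, so the
# learned within-tribe Beta prior has low concentration (proportions sit
# near 0 or 1) and a single brown tribesman swings the estimate far.
# Weight: earlier tribes were mixed, so the learned prior is concentrated
# near 0.5 and a single obese tribesman barely moves it.

def predict(observed_positive, prior_strength, prior_mean=0.5):
    alpha = prior_mean * prior_strength
    return (alpha + observed_positive) / (prior_strength + 1)

print('P(brown) ~', round(predict(1, prior_strength=0.2), 2))   # ~0.92
print('P(obese) ~', round(predict(1, prior_strength=10.0), 2))  # ~0.55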

Since