Re: [agi] Introducing Steve's Theory of Everything in cognition.
Loosemore, et al,

Just to get this discussion out of esoteric math, here is a REALLY SIMPLE way of doing unsupervised learning with dp/dt that looks like it ought to work. Suppose we record each occurrence of the inputs to a neuron, keeping counters to identify how many times each combination has happened. For this discussion, each input will be considered to have either a substantial positive, substantial negative, or nearly zero dp/dt. When we reach a threshold of, say, 20 identical occurrences of the same combination of dp/dt that is NOT accompanied by lateral inhibition, we will proclaim THAT to be our principal component function for that neuron to do for the rest of its life. Thereafter, the neuron will require the previously observed positive and negative inputs to be as programmed, but will ignore all inputs that were nearly zero.

Of course, many frames will be corrupted because of overlapping phenomena, sampling on dp/dt edges, noise, fast phenomena, etc., etc. However, there will be few if any precise repetitions of corrupted frames, whereas clean frames should be quite common. First the most common frame (all zeros - nothing there) will be recognized, followed by each of the most common simultaneously occurring temporal patterns, recognized by successive neurons, all identified in order of decreasing frequency exactly as needed for Huffman or PCA coding. This process won't start until all inputs are accompanied by an indication that they have already been programmed by this process, so that programming will proceed layer by layer without corruption from inputs being only partially developed (a common problem in multi-layer NNs).
While clever math might make this work a little faster, and certainly wet neurons can't store many previous patterns, this should be guaranteed to work and produce substantially perfect unsupervised learning: probably slower than better-math methods, but probably faster than wet neurons that can't save thousands of combinations during early programming. Of course, this would be completely unworkable outside of dp/dt space; in object space, this would probably exhaust a computer's memory before completing.

Does this get the Loosemore Certificate of No Objection as being an apparently workable method for substantially optimal unsupervised learning? Thanks for considering this.

Steve Richfield

--- agi Archives: https://www.listbox.com/member/archive/303/=now RSS Feed: https://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244id_secret=123753653-47f84b Powered by Listbox: http://www.listbox.com
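Steve's counting scheme is concrete enough to sketch directly. Below is a minimal toy version, under the assumptions that dp/dt values arrive pre-computed per input, that the commit threshold is the 20 he suggests, and that the names (`CountingNeuron`, `ternarize`) are invented here purely for illustration:

```python
from collections import Counter

THRESHOLD = 20  # identical clean frames required before committing

def ternarize(dpdt, eps=0.1):
    """Map each input's dp/dt to +1 (substantial positive), -1 (substantial
    negative), or 0 (nearly zero)."""
    return tuple(1 if x > eps else -1 if x < -eps else 0 for x in dpdt)

class CountingNeuron:
    def __init__(self):
        self.counts = Counter()
        self.pattern = None  # once committed, fixed for the neuron's life

    def observe(self, dpdt, inhibited=False):
        """Count a frame; commit when one clean frame repeats THRESHOLD times.
        Frames seen under lateral inhibition are not counted."""
        if self.pattern is not None or inhibited:
            return
        frame = ternarize(dpdt)
        if any(frame):  # skip the all-zero "nothing there" frame
            self.counts[frame] += 1
            if self.counts[frame] >= THRESHOLD:
                self.pattern = frame  # proclaim THAT the neuron's function

    def respond(self, dpdt):
        """Fire iff every committed nonzero input matches; committed zeros
        are ignored, as in Steve's description."""
        if self.pattern is None:
            return False
        frame = ternarize(dpdt)
        return all(p == f for p, f in zip(self.pattern, frame) if p != 0)
```

Whether this scales is exactly Steve's own caveat: the counter table grows with the number of distinct frames seen before commitment, which is tolerable in sparse dp/dt space but would explode in raw object space.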
[agi] Alternative Circuitry
Reading this - http://www.nytimes.com/2008/12/23/health/23blin.html?ref=science makes me wonder what other circuitry we have that's discouraged from being accepted. John
RE: [agi] Universal intelligence test benchmark
From: Matt Mahoney [mailto:matmaho...@yahoo.com]

--- On Sat, 12/27/08, John G. Rose johnr...@polyplexic.com wrote: Well I think consciousness must be some sort of out-of-band intelligence that bolsters an entity in terms of survival. Intelligence probably stratifies or optimizes in zonal regions of similar environmental complexity, consciousness being one, or an overriding out-of-band one...

No, consciousness only seems mysterious because human brains are programmed that way. For example, I should logically be able to convince you that pain is just a signal that reduces the probability of you repeating whatever actions immediately preceded it. I can't do that because emotionally you are convinced that pain is real. Emotions can't be learned the way logical facts can, so emotions always win. If you could accept the logical consequences of your brain being just a computer, then you would not pass on your DNA. That's why you can't. BTW the best I can do is believe both that consciousness exists and that consciousness does not exist. I realize these positions are inconsistent, and I leave it at that.

Consciousness must be a component of intelligence. For example, to pass on DNA, humans need to be conscious, or have been up to this point. Humans only live approx. 80 years. Intelligence is really a multi-agent thing; IOW our individual intelligence has come about through the genetic algorithm of humanity. We are really a distributed intelligence, and theoretically AGI will be born out of that. So maybe for improved genetic algorithms used for obtaining max compression there needs to be a consciousness component in the agents?
Just an idea I think there is potential for distributed consciousness inside of command line compressors :) John
Re: [agi] Alternative Circuitry
John G. Rose wrote: Reading this - http://www.nytimes.com/2008/12/23/health/23blin.html?ref=science makes me wonder what other circuitry we have that's discouraged from being accepted.

This blindsight news is not really news. It has been known for decades that there are two separate visual pathways in the brain, which seem to process 'what' information and 'vision for action' information. So this recent hubbub is just a new, more dramatic demonstration of something that has been known about for a long time.

This is my take on what is going on here: the interesting fact is that the vision-for-action pathway can operate without conscious awareness. It is an autopilot. What this seems to imply is that at some early point in evolution there was only that pathway, and there was no general ability to think about higher level aspects of the world. Then the higher cognitive mechanisms developed, while the older system remained in place. The higher cognitive mechanisms grew their own system for analyzing visual input (the 'what' pathway), but it turned out that the brain could still use the older pathway in parallel with the new, so it was left in place.

I am going to add this as a prediction derived from the model of consciousness in my AGI-09 paper: the prediction is that when we uncover the exact implementation details of the analysis mechanism that I discussed in the paper, we will find that the AM is entirely within the higher cognitive system, and that the vision-for-action pathway just happens to be beyond the scope of what the AM can access. It is because it is outside that scope that no consciousness is associated with what that pathway does. (Unfortunately, of course, this prediction cannot be fully tested until we can pin down the exact details of how the analysis mechanism gets implemented in the brain. The same is true of the other predictions.)
Richard Loosemore
RE: [agi] Universal intelligence test benchmark
--- On Sun, 12/28/08, John G. Rose johnr...@polyplexic.com wrote: So maybe for improved genetic algorithms used for obtaining max compression there needs to be a consciousness component in the agents? Just an idea I think there is potential for distributed consciousness inside of command line compressors :)

No, consciousness (as the term is commonly used) is the large set of properties of human mental processes that distinguish life from death, such as the ability to think, learn, experience, make decisions, take actions, communicate, etc. It is only relevant as an independent concept to agents that have a concept of death and the goal of avoiding it. The only goal of a compressor is to predict the next input symbol.

-- Matt Mahoney, matmaho...@yahoo.com
Re: [agi] Introducing Steve's Theory of Everything in cognition.
Steve,

This sort of simple solution is what makes me say that relational learning is where real progress is to be made. That's not to say that we shouldn't rely on past work in flat learning: a great deal of progress has been made in that area, boosting such methods far beyond what simplistic solutions can do. Anyway, some comments on your proposal...

The method sounds more like clustering than like principal components. I suppose it depends on exactly how the lateral inhibition behaves. If features are allowed to combine linearly, it is PCA, but if lateral inhibition forces only one neuron to respond to a given input, it is clustering.

It seems unlikely that an entire visual frame will ever be repeated, even in dp/dt space. So, I infer that when you say frame you are thinking only of the field of inputs of an individual neuron, which perhaps correspond to a small region on the retina. Taking the standard route, the neurons could then be arranged in a hierarchy, so that more abstract neurons take as input the output of less abstract ones. But I'm not sure this would go well the way you've described things. The top level could only recognize whole-scene classes that were defined by the intersection of the nonzero elements of all their members (because each individual neuron will have this property), which seems very limiting. This could be fixed easily enough, though, by standard methods.

Anyway, such a hierarchy will not learn any relational concepts :P. There are ways of getting it to learn *some* relational concepts (for example, simply the fact that our eyes are constantly moving will help tremendously, since moving our eyes to different parts of the picture is equivalent to one of the suggestions I make in the blog post I referred you to). It may be true that all standard PCA methods are batch mode only, but there are standard clustering methods that do what you want (one such method is called sparse distributed memory).
--Abram

On Sun, Dec 28, 2008 at 5:45 AM, Steve Richfield steve.richfi...@gmail.com wrote: [quoted message trimmed]

-- Abram Demski Public address: abram-dem...@googlegroups.com Public archive: http://groups.google.com/group/abram-demski Private address: abramdem...@gmail.com
Re: [agi] Introducing Steve's Theory of Everything in cognition.
Steve,

There has been plenty of speculation regarding just WHAT is buried in those principal components. Do they generally comprise simple combinations of identifiable features, or some sort of smushing that virtually encrypts the features? I have heard arguments on both sides of this issue. Can anyone here shine some light on this?

It seems like this gets back to the ill-defined problem again. There is no way of answering without more information about what the output of PCA is to be used for! The only immediate criterion we have is how good a probabilistic model an algorithm finds.

There are, of course, many models to explain any finite set of data. Some like PCA may do so more concisely, while others may do so in ways that better lend themselves to subsequent computations, presumably to direct future actions.

Conciseness will tend to be better for computation, simply because it is computationally easier to manipulate less data... of course, this is no guarantee, since the data may need to be fully uncompressed to extract a different feature, and if the compression is lossy then the feature may no longer be available. But if we know what we want to compute, we should be using supervised learning methods. Good predictive models will be likely to help us regardless of our goal.

Unrestrained, PCA could change a neuron's functionality based on new data, and very likely wreck a functioning NN's future operation by doing so.

Learning can still eventually converge. Also, I want to note that this is in some respects a quirk of NN methods that goes away if you think of things symbolically.

--Abram

On Sun, Dec 28, 2008 at 2:36 AM, Steve Richfield steve.richfi...@gmail.com wrote: Abram, On 12/27/08, Abram Demski abramdem...@gmail.com wrote: Steve, My thinking in the significant figures issue is that the purpose of unsupervised learning is to find a probabilistic model of the data. There are, of course, many models to explain any finite set of data.
Some like PCA may do so more concisely, while others may do so in ways that better lend themselves to subsequent computations, presumably to direct future actions. (Whereas the purpose of supervised learning is to find a probabilistic model of *one* variable *conditioned on* all the others.) When you talk about the insufficiency of standard PCA, do you think the problems you refer to relate to (1) PCA finding a suboptimal model, or

There has been plenty of speculation regarding just WHAT is buried in those principal components. Do they generally comprise simple combinations of identifiable features, or some sort of smushing that virtually encrypts the features? I have heard arguments on both sides of this issue. Can anyone here shine some light on this? If the features can be extracted from combinations of components, then PCA is arguably optimal. If not, then PCA is probably not what is needed. Genuine PCA has some other unrelated problems, in that it is VERY computationally intensive, and there isn't (yet) any good incremental PCA algorithm that learns somewhat like you would expect a neuron to learn. I suspect that I may also have to crack this nut before dp/dt becomes truly useful.

(2) the optimal model being not quite what you are after?

I, like everyone else, want to use an optimal model. However, my idea of optimality may be different from other people's idea of optimality, as we seek to optimize different things. Unrestrained, PCA could change a neuron's functionality based on new data, and very likely wreck a functioning NN's future operation by doing so. I suspect that some additional cleverness is needed, e.g. neurons initially being in a discovery mode that produces no output until a principal component (or something like a principal component) is discovered.
Then, when downstream neurons use that principal component, subsequent alteration would be constrained to refining that component, with no possibility of completely abandoning it for a completely different component that might better represent the input. Any thoughts? Steve Richfield === On Sat, Dec 27, 2008 at 3:05 AM, Steve Richfield steve.richfi...@gmail.com wrote: Abram, On 12/26/08, Abram Demski abramdem...@gmail.com wrote: Steve, When I made the statement about Fourier I was thinking of JPEG encoding. A little digging found this book, which presents a unified approach to (low-level) computer vision based on the Fourier transform: http://books.google.com/books?id=1wJuTMbNT0MCdq=fourier+visionprintsec=frontcoversource=blots=3ogSJ2i5uWsig=ZdvvWvu82q8UX1c5Abq6hWvgZCYhl=ensa=Xoi=book_resultresnum=2ct=result#PPA4,M Interesting, but seems far removed from wet neuronal functionality, unsupervised learning, etc. But that is beside the present point. :) Probably so. I noticed that you recently graduated, so I thought that I would drop that
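On the incremental-PCA point above: one classic candidate worth mentioning is Oja's rule, a Hebbian update that estimates the first principal component one sample at a time, roughly the way one might expect a neuron to learn. A minimal sketch follows; the function name and the learning-rate/epoch settings are illustrative choices of mine, not a claim about what wet neurons actually do:

```python
import numpy as np

def oja_first_component(data, lr=0.01, epochs=50, seed=0):
    """Estimate the first principal component incrementally with Oja's rule:
    w += lr * y * (x - y * w), where y = w . x.  The second term is a decay
    that keeps ||w|| near 1, so no explicit normalization step is needed."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=data.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in data:  # one zero-mean sample at a time
            y = w @ x
            w += lr * y * (x - y * w)
    return w / np.linalg.norm(w)
```

On zero-mean data whose variance is concentrated along one axis, the returned unit vector aligns (up to sign) with that axis, which is the behavior batch PCA would give for the first component.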
Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
Robert, What kind of problems have you designed this to solve? Can you give some examples?

Robert: A brief paper on an AGI system for human-level ... had only 2 pages to fit in. If you are working on a system, you probably hope it will one day help design a better world, better tools, better inventions. The better is a subjective human value. A place for a human-like representation of at least rough, general human values (bias, likes) in the AGI is essential. The paper gives a quick view of the Human-centric representation and behavioral systems approach for problem-solving, reasoning as giving meaning (human values) to stories and games... Indexing relations via spatially related registers is its simulated substrate.

Happy Holidays, Robert

... all the human values were biased, unlike the very objective AGI systems designed on the Mudfish's home planet; AGI systems that objectively knew that sticky mud is beautiful, large oceans of gooey mud.. how enchanting! Pure clean water, now that's fishy!
Re: [agi] Universal intelligence test benchmark
2008/12/27 Matt Mahoney matmaho...@yahoo.com: --- On Fri, 12/26/08, Philip Hunt cabala...@googlemail.com wrote: Humans are very good at predicting sequences of symbols, e.g. the next word in a text stream. Why not have that as your problem domain, instead of text compression? That's the same thing, isn't it?

Yes and no. What I mean is they may be the same in principle, but I don't think they are in practice. I'll illustrate this by way of an analogy. The Turing Test is considered by many to be a reasonable definition of intelligence. And I'd agree with them -- if a computer can fool sophisticated, alert people into thinking it's a human, it's probably at least as clever as a human. Now consider the Loebner Prize. IMO this is a waste of time in terms of advancement of AI, because we're not anywhere near advanced enough to build a machine that can think as well as a human. So programs that do well at the Loebner Prize do so not because they have good AI architectures, but because they employ clever tricks to fool people. But that's all there is -- clever tricks with no real substance.

Consider compression programs. I have several on my computer: zip, compress, bzip2, gzip, etc. These are all quite good at compression (they all seem to work well on Python source code, for example), but there is no real intelligence or understanding behind them -- they are clever tricks with no substance (where by substance I mean intelligence).

Now, consider if I build a program that can predict how some sequences will continue. For example, given ABACADAEA it'll predict the next letter is F, or given 1 2 4 8 16 32 it'll predict the next number is 64. (Whether the program works on bits, bytes, or longer chunks is a detail, though it might be an important detail.) Even though the program is good at certain types of sequences, it doesn't do compression. For it to do so, I'd have to give it some notation to build a compressed file and then uncompress it again.
This is a lot of tedious detail work and doesn't add to its intelligence. IMO it would just get in the way.

-- Philip Hunt, cabala...@googlemail.com Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html
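For what it's worth, Philip's two examples can be handled by an almost embarrassingly small predictor. The hypothesis list below (constant ratio, constant difference, and a fixed letter interleaved with an ascending run) is invented here just to cover his cases, not a serious proposal:

```python
def predict_next(seq):
    """Guess the next element of a sequence from a tiny hypothesis list."""
    if not isinstance(seq, str):
        # Geometric: constant ratio between successive terms (1 2 4 8 16 32).
        ratios = [b / a for a, b in zip(seq, seq[1:]) if a != 0]
        if len(seq) >= 2 and len(ratios) == len(seq) - 1 \
                and all(r == ratios[0] for r in ratios):
            return seq[-1] * ratios[0]
        # Arithmetic: constant difference between successive terms.
        diffs = [b - a for a, b in zip(seq, seq[1:])]
        if diffs and all(d == diffs[0] for d in diffs):
            return seq[-1] + diffs[0]
        return None
    # String like "ABACADAEA": a fixed letter interleaved with an ascending run.
    if seq and seq[::2] == seq[0] * len(seq[::2]):
        run = seq[1::2]  # "BCDE" for "ABACADAEA"
        if len(run) >= 2 and all(ord(b) - ord(a) == 1
                                 for a, b in zip(run, run[1:])):
            return chr(ord(run[-1]) + 1)
    return None
```

The point stands either way: wrapping such a predictor in a file format and a decompressor to make it a compressor adds machinery without adding any predictive power.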
Re: [agi] Universal intelligence test benchmark
2008/12/28 Philip Hunt cabala...@googlemail.com: Now, consider if I build a program that can predict how some sequences will continue. For example, given ABACADAEA it'll predict the next letter is F, or given 1 2 4 8 16 32 it'll predict the next number is 64. (Whether the program works on bits, bytes, or longer chunks is a detail, though it might be an important detail.) Even though the program is good at certain types of sequences, it doesn't do compression. For it to do so, I'd have to give it some notation to build a compressed file and then uncompress it again. This is a lot of tedious detail work and doesn't add to its intelligence. IMO it would just get in the way.

Furthermore, I don't see that a sequence-predictor should necessarily attempt to guess the next item in the sequence by attempting to generate the shortest possible Turing machine capable of producing the sequence (certainly humans don't work that way). If a sequence-predictor uses this method and is good at predicting sequences, good; but if it uses another method and is good at predicting sequences, it's just as good. What matters is a program's performance, not how it does it.

-- Philip Hunt, cabala...@googlemail.com Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html
Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
Mike,

Mike wrote: What kind of problems have you designed this to solve? Can you give some examples?

Natural language understanding, path finding, game playing. Any problems that can be represented as a situation in the four-component domain (value - role - relation - feature models) can be 3-C'd (compared, contrasted, combined) to give a resulting situation (frame pattern). What is combined, compared, or contrasted? Only the regions under attention, including their focus detail level, are examined. What is placed and represented in the regions determines what component can be 3-C analyzed... as a general computing paradigm using 3-C (AND - OR - NOT).

Example: Here's a pattern example you may not have seen before, but by 3-C you discover the pattern and how to make an example. As spoken aloud: five and nine [is] fine; two and six [is] twix; five and seven [is] fiven.

Take the five and seven = fiven. When the system compares the resultant fiven to five, the result is that five is at the start of the situation. When it compares fiven and seven, the result is that ven is at the end position. Resulting situation PATTERN = [situation 1][focus inward][start position] combined with [situation 2][focus inward][end position]. (Spatial and sequence positions are a key part of the representation system.)

How was the correct (reasoning) method chosen? This result was by comparison; it could have been by contrasting. All three (Compare, Contrast and Combine) happen simultaneously. The winner is whichever resulting situation makes sense to the system, i.e. has the most activation in the value area (some direct or indirect value from past experience, or value given by the authority system in the value region: e.g. the fearful or attractive spectrum). How was the correct region and focus detail level chosen?
The attention region in the example was on the sound region; the focus detail was on the phoneme level (syllable). It could have looked for patterns in the number values, or the emotions related to each word, or the letter patterns, or hand motions, eye position when spoken, etc. The regions are biased by the value system's current index (amygdala/septum analog): e.g. when you see five, the quantity region will be given a lower threshold, and the focus level associated will give the content on the 1 - 10 scale. The index region weights are re-organized only by stronger reward/failure (the authority system); 3-C results can act on the index, changing the content connection weights.

Now you compare apples to oranges for an encore; what do you get? A color, a taste, a mass, a new fruit... your attention determines the result. All regions are being matched for patterns in the 2 primary index modules (action selection, emotional value; others can be integrated seamlessly). Five and seven is not fiven, it is twelve, but in this situation it makes sense given the circumstances. Sense and meaning are contextual for the model, as for humans.

Hope this sheds light. A detailed paper has been in the works.

Robert
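Robert's spoken-blend pattern ([start of situation 1] combined with [end of situation 2]) can be caricatured in a few lines. This is a rough orthographic guess of mine at the mechanism: it reproduces fine and twix, though fiven evidently keeps a slightly larger piece of the first word, so treat the onset rule as a hypothetical stand-in for the real phoneme-level machinery:

```python
VOWELS = "aeiou"

def onset(word):
    """Initial consonant cluster, approximated on spelling."""
    i = 0
    while i < len(word) and word[i] not in VOWELS:
        i += 1
    return word[:i]

def blend(w1, w2):
    """[situation 1][start position] + [situation 2][end position]:
    keep w1's onset, then append w2 with its own onset stripped."""
    return onset(w1) + w2[len(onset(w2)):]
```

Under this rule, blend("five", "nine") gives fine and blend("two", "six") gives twix; the interesting part of Robert's model is not the string surgery but how the value system decides which regions and focus levels to blend in the first place.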
Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
Robert: Example: Here's a pattern example you may not have seen before, but by 3C you discover the pattern and how to make an example: As spoken aloud: five and nine [is] fine two and six [is] twix five and seven [is] fiven

Robert,

So, if I understand, you're designing a system to deal with problems concerning objects which have multiple domain associations. For example, words as above are associated with their sounds, letter patterns, and perhaps meanings. But the system always *knows* these domains beforehand - and that it must consider them in any problem? It couldn't, say, find the pattern to a problem like:

Six 2003 Seven 1996 Eight 2001 Eight and a half ?

where it wouldn't know any domain relevant to solving the problem, and would first have to *find* the appropriate domain? (In creative, human-level intelligence problems you often have to do this.)
Re: [agi] Universal intelligence test benchmark
2008/12/29 Matt Mahoney matmaho...@yahoo.com: Please remember that I am not proposing compression as a solution to the AGI problem. I am proposing it as a measure of progress in an important component (prediction).

Then why not cut out the middleman and measure prediction directly? I.e. put the prediction program in a test harness, feed it chunks one at a time, ask it what the next value in the sequence will be, tell it what the actual answer was, etc. The program's score is then simply the number it got right divided by the number of predictions it had to make.

Turning a prediction program into a compression program requires superfluous extra work: you have to invent an efficient file format to hold compressed data, and you have to write a decompression program as well as a compressor. Furthermore, there are bound to be programs that're good at prediction but not good at compression, whereas all programs that're good at compression are guaranteed to be good at prediction.

-- Philip Hunt, cabala...@googlemail.com Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html
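Philip's harness is simple enough to pin down exactly. A sketch, where the name `score_predictor` and the predictor's call signature (a function from the history so far to a guess) are assumptions of mine:

```python
def score_predictor(predictor, stream):
    """Feed `stream` one symbol at a time: before each symbol, ask the
    predictor for its guess, then reveal the true symbol. The score is
    the fraction of correct guesses, exactly as Philip describes."""
    correct = 0
    history = []
    for actual in stream:
        guess = predictor(history)
        if guess == actual:
            correct += 1
        history.append(actual)  # tell it what the actual answer was
    return correct / len(stream)
```

A baseline predictor that just repeats the last symbol makes the scoring concrete: on the stream "aaabbb" it is right 4 times out of 6 (it misses the first symbol and the a-to-b transition).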
Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
Mike,

Very good choice. But the system always *knows* these domains beforehand - and that it must consider them in any problem?

YES. The domains' content structure (which is what you mean) is the human-centric one provided by living a child's life, loading the value system with biases such as humans are warm and candy is really sweet. By further being pushed through western culture's grade-level curriculum, we value the visual feature symbols 2003 and 1996 as numbers, then as dates. The content models (concept patterns) are built up from any basic feature to form instances from the basic content of the four domains, such as dates of leap years, century marks, millennium or anniversary.

Problems more like: -- ice cream favorite red happee -- What this group of words means has everything to do with what the reader knows and values beforehand. And what he values will determine what his attention is on (the food, the emotions, the color, the positions) or how deep the focus is: on the entire situation (sentence), a group of them, a single word or a letter. Humans value from the top, so we'll likely think of cherry ice cream before we see the occurrence pattern of the letter e in every word in that 'sentence' above.

Good choice for your problem: Six 2003 Seven 1996 Eight 2001 Eight and a half ? (I see a number of patterns, such as 00 99, multiply, add word to end - but haven't gotten the complete formula.) For the system, it is biased; it makes sense for itself, its internal value. The answer the system chooses is the one that makes sense to what it knows and values. Sure, it can and will be used for general pattern mining by comparing and contrasting within lines, line-to-line, number-to-text, text-to-number, date-to-word, month-to-number, middle-part to end, end-to-end, etc., until a resulting comparison yields a pattern that it values (from experience or being told).
However, the value system controlling attention prevents any combinatorial explosion - animals only search through the models that have value (directly or indirectly) to the problem situation, thus limiting the total guesses we could even make (it looks for patterns it already knows).

To solve problems it has not been taught or can't see a pattern for:

1) If self-motivated because a reward/avoidance is strong: it keeps looking for patterns via 3-C, persisting in its behavior (doing the same ol' thing) and failing. If a value happens to occur in one of the results as it keeps going, it will see that something was different. It has access to its own actions (role and relation domain), and this different action stands out (auto-contrast) and becomes of greater value due to the associated difference (non-failure). It keeps trying until the motivation runs out (the energy level decays) or other values or past experience exceed its model of how long the problem should take.

2) If instructed how to solve it by trying x, y or z: "Widen your attention, expand your focus" - then it has a larger set of regions in which to try to find a pattern it values. If set, it can examine regions of the instruction (x, y, and z) and see what was different from what it was trying (if the comparison yields a high enough value, it will try those as well). "Try going left and up." O.K., auto-contrast: I was trying only up; the difference is to add one more direction; I can try left and up and back, etc.

Creativity and reason come from the 3-C mechanism. Creativity in the model is to combine any sets of domain content and give the result a respective value from its experience and domain models. Example: combine the form of a computer mouse, the look of diamonds, the function of a steering wheel, with the feel of leather - what do you get? Focus on each region and combine, then e-valuate (compare it to objects, functions). What's your result?
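The value-gated search that "prevents any combinatorial explosion" could be sketched like this; the model names, situations, and threshold are invented purely for illustration:

```python
# Sketch: only models whose learned value for the current situation
# clears a threshold are tried, best first, bounding the guesses made.
# All names and numbers below are hypothetical.

def candidate_models(models, situation, threshold=0.5):
    """models: {name: {situation: value}}. Return names worth trying."""
    scored = [(values.get(situation, 0.0), name) for name, values in models.items()]
    return [name for value, name in sorted(scored, reverse=True) if value > threshold]

models = {
    "count-letters": {"word-puzzle": 0.2},
    "compare-dates": {"word-puzzle": 0.9},
    "taste-memory": {"word-puzzle": 0.0},
}
print(candidate_models(models, "word-puzzle"))
```

Only the high-value model survives the gate, so the search space stays small regardless of how many models exist.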
Models in my experience say that it's a luxury-car controller, while you might say it would be something in an art gallery, etc. (art: value without function/role).

Anyway, Ben's pre-school for AGI is one means to bias such a system with experience and human values; another way is to try to properly represent human experience (static and dynamic) and then essentially implant memories and experience instead of just declarative facts.

Robert

--- On Sun, 12/28/08, Mike Tintner tint...@blueyonder.co.uk wrote:

From: Mike Tintner tint...@blueyonder.co.uk
Subject: Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
To: agi@v2.listbox.com
Date: Sunday, December 28, 2008, 8:38 PM

Robert: Example: Here's a pattern example you may not have seen before, but by 3C you discover the pattern and how to make an example. As spoken aloud:

five and nine [is] fine
two and six [is] twix
five and seven [is] fiven

Robert, So, if
Re: Human-centric AGI approach-paper (was Re: Indexing and Re: [agi] AGI Preschool: sketch of an evaluation framework for early stage AGI systems aimed at human-level, roughly humanlike AGI
Robert,

Thanks for your detailed, helpful replies. I like your approach of operating in multiple domains for problem-solving. But if the domains are known beforehand, then it's not truly creative problem-solving - where you do have to be prepared to go in search of the appropriate domains - and thus truly cross domains rather than simply combining preselected ones.

I gave you a perhaps exaggerated example just to make the point. You had to realise that the correct domain to solve my problem was that of movies - the numbers were the titles of movies and the dates they came out. If you're dealing with real-world rather than just artificial creative problems like our two, you may definitely have to make that kind of domain switch - solving any scientific detective problem, say, like that of binding in the brain, may require you to think in a surprising new domain, for which you will have to search long and hard (and possibly without end).
Re: [agi] Introducing Steve's Theory of Everything in cognition.
Steve,

I should have specified further. When I say "good" I mean good at predicting. PCA attempts to isolate components that give maximum information... so my question to you becomes: do you think that the problem you're pointing towards is suboptimal models that don't predict the data well enough, or models that predict the data fine but aren't directly useful for what you expect them to be useful for?

To that end... you weren't talking about using the *predictions* of the PCA model, but rather the principal components themselves. The components are essentially hidden variables that make the model run. I'm thinking there are two reasons to examine them: either in hopes of a coincidental direct link between a hidden variable and the goal (an expedient that would make the calculation of predictions unnecessary), or in an attempt to complexify the model to make it more accurate in its predictions, by looking for links between the hidden variables, or for patterns over time, et cetera.

--Abram

On Sun, Dec 28, 2008 at 11:13 PM, Steve Richfield steve.richfi...@gmail.com wrote:

Abram,

On 12/28/08, Abram Demski abramdem...@gmail.com wrote:

Steve,

There has been plenty of speculation regarding just WHAT is buried in those principal components. Do they generally comprise simple combinations of identifiable features, or some sort of smushing that virtually encrypts the features? I have heard arguments on both sides of this issue. Can anyone here shine some light on this?

It seems like this gets back to the ill-defined problem again. There is no way of answering without more information about what the output of PCA is to be used for!

Sure we know - it is to be combined with other outputs in a big Bayesian fuzzy-logic network to recognize things, devise plans, and execute them (but hopefully not us included).

The only immediate criterion we have is how good a probabilistic model an algorithm finds. Presumably, "good" means suitable for the above purpose.
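To make the "hidden variables" concrete, here is a small sketch (my own illustration, not Abram's or Steve's code) that extracts principal components via SVD on synthetic data; the data layout is an assumption chosen so one component dominates:

```python
import numpy as np

# Sketch: principal components as the hidden variables of the model.
# The third input is roughly the sum of the first two (an assumption),
# so the first component should capture most of the variance.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
data = np.column_stack([x, x.sum(axis=1) + 0.05 * rng.normal(size=200)])

centered = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / (s**2).sum()

print("variance explained per component:", np.round(explained, 3))
print("first component (a hidden variable):", np.round(vt[0], 3))
```

Here the top component is a simple combination of identifiable inputs rather than a "smushing" - though whether that holds for richer data is exactly the open question in the thread.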
There are, of course, many models to explain any finite set of data. Some, like PCA, may do so more concisely, while others may do so in ways that better lend themselves to subsequent computations, presumably to direct future actions.

Conciseness will tend to be better for computation, simply because it is computationally easier to manipulate less data... of course, this is no guarantee, since the data may need to be fully uncompressed to extract a different feature, and if the compression is lossy then the feature may no longer be available. But if we know what we want to compute, we should be using supervised learning methods.

At the risk of wasting a few neurons, unsupervised methods may work just as well - or even better, since a layer of neurons can be completely finished before there is enough of the subsequent layers put together for supervision to even work.

Good predictive models will be likely to help us regardless of our goal.

Of course, the entire discussion centers around "good" above. Unrestrained, PCA could change a neuron's functionality based on new data, and very likely wreck a functioning NN's future operation by doing so.

Learning can still eventually converge. Also, I want to note that this is in some respects a quirk of NN methods that goes away if you think of things symbolically.

No, I think this is a quirk of supervised learning, which people who work with symbols usually avoid. Note the proposal on this thread that I just directed toward Loosemore, which also avoids this problem in a NN structure. What I think most people are missing is that, done right, UNsupervised learning is WAY faster than supervised learning.
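The caveat about lossy compression can be made concrete: if only the top principal component is kept, a feature orthogonal to it is lost. A sketch under assumed synthetic data (my own illustration):

```python
import numpy as np

# Synthetic data (an assumption): two independent inputs and their sum,
# so the top principal component captures the sum direction.
rng = np.random.default_rng(1)
x = rng.normal(size=(500, 2))
data = np.column_stack([x[:, 0], x[:, 1], x[:, 0] + x[:, 1]])
centered = data - data.mean(axis=0)

_, s, vt = np.linalg.svd(centered, full_matrices=False)

# Lossy compression: keep only the top component, then reconstruct.
recon1 = (centered @ vt[:1].T) @ vt[:1]

# The "difference" feature x0 - x1 is orthogonal to the top component,
# so it is essentially destroyed by the compression.
true_diff = centered[:, 0] - centered[:, 1]
recon_diff = recon1[:, 0] - recon1[:, 1]
print("std of difference feature before:", round(float(np.std(true_diff)), 2))
print("std of difference feature after: ", round(float(np.std(recon_diff)), 2))
```

The difference feature shrinks to nearly nothing after compression, which is exactly the risk if we don't know in advance what we will want to compute.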
Steve Richfield

===

On Sun, Dec 28, 2008 at 2:36 AM, Steve Richfield steve.richfi...@gmail.com wrote:

Abram,

On 12/27/08, Abram Demski abramdem...@gmail.com wrote:

Steve,

My thinking on the significant-figures issue is that the purpose of unsupervised learning is to find a probabilistic model of the data

There are, of course, many models to explain any finite set of data. Some, like PCA, may do so more concisely, while others may do so in ways that better lend themselves to subsequent computations, presumably to direct future actions.

(whereas the purpose of supervised learning is to find a probabilistic model of *one* variable *conditioned on* all the others). When you talk about the insufficiency of standard PCA, do you think the problems you refer to relate to (1) PCA finding a suboptimal model, or

There has been plenty of speculation regarding just WHAT is buried in those principal components. Do they generally comprise simple combinations of identifiable features, or some sort of smushing that virtually encrypts the features? I have heard arguments on both sides of this issue. Can anyone here shine some light on this? If the features can be extracted from combinations of components, then PCA is
Re: [agi] Universal intelligence test benchmark
2008/12/29 Philip Hunt cabala...@googlemail.com:
2008/12/29 Matt Mahoney matmaho...@yahoo.com:

Please remember that I am not proposing compression as a solution to the AGI problem. I am proposing it as a measure of progress in an important component (prediction). [...]

Turning a prediction program into a compression program requires superfluous extra work: you have to invent an efficient file format to hold the compressed data, and you have to write a decompression program as well as a compressor.

Incidentally, reading Matt's posts got me interested in writing a compression program using Markov-chain prediction. The prediction bit was a piece of piss to write; the compression code is proving considerably more difficult.

--
Philip Hunt, cabala...@googlemail.com
Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html
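The prediction-compression link Matt draws can be sketched without the file format or the decoder that Philip found hard: an adaptive order-1 Markov model assigns each character a probability, and summing -log2(p) gives the size an ideal arithmetic coder would achieve. This is my own illustrative sketch, not Philip's or Matt's code:

```python
import math
from collections import defaultdict

def ideal_compressed_bits(text):
    """Sum of -log2 p(ch | previous ch) under an adaptive order-1 model."""
    counts = defaultdict(lambda: defaultdict(int))
    prev, bits = "", 0.0
    for ch in text:
        ctx = counts[prev]
        total = sum(ctx.values())
        p = (ctx[ch] + 1) / (total + 256)  # Laplace smoothing, 256-symbol alphabet
        bits += -math.log2(p)
        ctx[ch] += 1  # update the model only after coding the symbol
        prev = ch
    return bits

text = "the quick brown fox jumps over the lazy dog. " * 20
print(round(ideal_compressed_bits(text) / len(text), 2), "bits per character")
```

Better prediction directly means fewer bits, which is why compressed size works as a progress measure even before any actual coder is written.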