RE: [agi] Discovering the Capacity of Human Memory

2003-09-16 Thread James Rogers
 
 Their conclusion is based on the assumptions that there are 
 10^11 neurons and that the average number of synapses per 
 neuron is 10^3.  Therefore the total number of potential 
 relational combinations is (10^11)! / ((10^3)! (10^11 - 10^3)!), 
 which is approximately 10^8432.
 
 The model is obviously an oversimplification, and the number 
 is way too big.


I was wondering about that.  It seems that the number represents the size of the
phase space, when a more useful metric would be the size (Kolmogorov complexity)
of the average point *in* the phase space.  There is a world of difference
between the number of patterns that can be encoded and the size of the biggest
pattern that can be encoded; the former isn't terribly important, but the latter
is very important.
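
As a sanity check (a rough sketch of mine, not from the original paper): treating
the count as the binomial coefficient C(10^11, 10^3) and using Stirling's
approximation via lgamma() reproduces the ~10^8432 figure, and also shows how
few bits it takes to specify any single point in that phase space.

#include <math.h>
#include <stdio.h>

int main(void) {
    double n = 1e11;   /* neurons */
    double k = 1e3;    /* average synapses per neuron */

    /* log10 of the binomial coefficient C(n, k) = n! / (k! (n - k)!) */
    double log10_C = (lgamma(n + 1.0) - lgamma(k + 1.0) - lgamma(n - k + 1.0))
                     / log(10.0);
    printf("log10 C(n, k) ~= %.0f\n", log10_C);                 /* ~8432 */

    /* The phase space has ~10^8432 points, but specifying any one point
       takes only log2 C(n, k) bits -- a vastly smaller number. */
    printf("bits per point ~= %.3e\n", log10_C * log2(10.0));   /* ~2.8e4 */
    return 0;
}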

Cheers,

-James Rogers
 [EMAIL PROTECTED]





RE: [agi] Discovering the Capacity of Human Memory

2003-09-16 Thread James Rogers
Eliezer wrote:
 Are you talking about the average point in the phase space in the sense 
 of an average empirical human brain, or in the sense of a randomly
 selected point in the phase space?  I assume you mean the former, since, 
 for the latter question, if you have a simple program P that 
 produces a phase space of size 2^X, the average size of a random point 
 in the phase space must be roughly X (plus the size of P?) according to 
 both Shannon and Kolmogorov.


Arrgh...  What you said.  My post was sloppy, and I stated it really badly.  

I'm literally doing about 5-way multitasking today, all important things that
demand my attention.  It seems that my email time-slice is under-performing
under the circumstances.

Cheers,

-James Rogers
 [EMAIL PROTECTED]



Re: [agi] Web Consciousness and self consciousness

2003-09-06 Thread James Rogers
On 9/6/03 2:34 AM, arnoud [EMAIL PROTECTED] wrote:
 
 If you look at consciousness scientifically you would like an operational
 (measurement of behaviour) definition. In that case two can be given
 (sketched): (1) consciousness is attention and control; (2) consciousness is
 the reporting of internal mental events.
 An agent can have consciousness-1 without consciousness-2, but not vice
 versa. Most (all?) animals are conscious-1, but only(?) humans are
 conscious-2. Of course, to be conscious-2 the ability for language is needed,
 along with a basic model of how the mind works (e.g. the intentional stance).


I would define consciousness more simply as being able to measure the impact
of your existence on the things you observe.  Think of it as a purely
inferred self-existence: you never observe yourself directly, but much of
your sensory input indirectly suggests that the "you" entity exists wherever
you are.

The constant inference of an entity's existence would create a very thorough
model of an entity that is never observed but which always apparently is.
The model is exquisitely detailed because you are constantly bombarded with
indirect evidence of its existence. I would say that consciousness is at its
essence a purely inferred self-model, which naturally requires a fairly
large machine to support it.

Cheers,

-James Rogers
 [EMAIL PROTECTED]



RE: [agi] Educating Novababies

2003-07-14 Thread James Rogers
Mike Deering wrote:
 Assuming low level feature extraction is hardcoded like edge detection,
 motion, and depth, then the first thing an intelligence would need to learn
 is correlation between objects in different sensory streams.  The object
 that I see moving across my field of vision is the same object that is the
 source of that noise I hear changing pitch and loudness differentially
 between my two ears and is the same object I feel when I reach out and grab
 it and the same object that tastes so sour when I put it in my mouth.  For
 this to take place you need multiple high information sensory streams.


The reality of the human mind is more complicated.  We hear what we see and
vice versa even when there is no correlated sensory input in reality.  For
example, humans have very little ability to resolve whether a stationary sound
is coming from in front of them or behind them.  Most people THINK they can, but
it is actually their brain filling in the blanks from visual cues, something
that is easy to demonstrate in a controlled test environment.  By the same
token, a fair portion of your apparent visual depth and spatial perception is
augmented by your hearing (integrated surface reflections).  Similar things
occur with all our senses.

All of your senses appear richer than they actually are because your brain
synthesizes data from your other senses into the reality of any particular
sense.  Not only does it mix in relevant data, it also tends to internally
generate data that it infers *should* exist.  This is one of the many reasons
eyewitness testimony tends to have a mediocre track record for mapping to reality.

Cheers,

-James Rogers
 [EMAIL PROTECTED]



RE: [agi] doubling time watcher.

2003-02-18 Thread James Rogers
On Tue, 2003-02-18 at 10:48, Ben Goertzel wrote:
  
 A completely unknown genius at the University of Outer Kirgizia could
 band together with his grad students and create an AGI in 5 years,
 then release it on the shocked world.


Ack!  I thought this was a secret!

Curses, foiled again...


-James Rogers
 [EMAIL PROTECTED]




Re: [agi] Go and the translation problem.

2003-01-25 Thread James Rogers
On 1/25/03 1:05 AM, Alan Grimes [EMAIL PROTECTED] wrote:
 
 If I say too much more the rabbi will take what I've written, implement
 it himself, and screw up the universe with Yahweh 2.0. I don't want
 that. 
 
 Instead I will, here, present a slightly obfuscated version that,
 hopefully, won't give him the Big Insight(tm)


Well, I wouldn't worry about this too much.  As far as I can tell,
everything you've written so far on this is old hat conceptually, so I'm
pretty sure Eliezer won't suddenly be blessed with a lightbulb over his head
that he didn't have before.

And I didn't even really see a problem implied or asked in your post that I
don't have a good solution for, so it can certainly be done without too much
effort.  That said, I generally agree with your post and I have been
attacking the problem of Go pretty much in this way.  Of course, being able
to do easy abstraction manipulation at many levels simultaneously with
imprecise and partial patterns is one of the better tools in my tool chest
so I like to use it.  Old saws about every problem looking like a nail
notwithstanding.

Cheers,

-James Rogers
 [EMAIL PROTECTED]




Re: [agi] OS and AGI

2003-01-10 Thread James Rogers
On Fri, 2003-01-10 at 16:44, Damien Sullivan wrote:
 
 While I'm equally horrified by the idea of someone using DOS as a benchmark,
 there is a difference between "stump" as in "I can't figure this out" and
 "stump" as in "I haven't learned much about this."


Aye, I think the reaction is more to an apparent unwillingness to learn
about such things.  If I discover I might need to learn about something
important, I head to Google straight away.  I don't want to spend time
learning a lot of things I probably should learn, but the universe
doesn't care what I want and so I get right to it.  Seems rational to
me.

I do find Alan's abject incredulity that anyone would want to use Unix
humorous though. :-)

Cheers,

-James Rogers
 [EMAIL PROTECTED]





Re: [agi] Moore's law data - defining HEC

2003-01-06 Thread James Rogers
I recently put together a human brain equivalent model that takes into
consideration several aspects of system performance to figure out what
kind of system configuration we would need to generate a human
equivalent structure (which I expect would actually be much smarter than
a human in practice).  For the purposes of real-world projections,
taking MIPS and GB in the abstract is nigh useless because there are a
slew of caveats as to how all these components actually perform in real
systems.

First, we balanced and normalized system memory requirements in terms of
size with instructions per second and memory bandwidth/latency.  For our
architecture/code, we got the following normal core:


1 BIPS 32-bit integer core attached to 10^9 bytes of RAM, assuming common
memory architectures.  This is an optimum balance of transistor
allocation for us.


This normal core turns out to be 10^-6 human equivalent in our model. If
you compare our normal core to real systems, you find that CPU
performance is substantially outstripping the memory performance we
require.  That said, such a system could be built in a few years simply
by tweaking existing generic cores commonly used for custom systems
(like ARM or MIPS) and connecting scads of them with a low-latency
multi-dimensional interconnect.  Since you could put a dozen of these
cores on a real chip, the trick would be the memory system and
interconnects for each of these cores.
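
To make the scaling concrete (my arithmetic here, assuming simple linear
scaling of the model above, which is an assumption the post itself doesn't
spell out), the aggregate numbers implied by a 10^-6 normal core look
roughly like this:

#include <stdio.h>

int main(void) {
    const double bips_per_core  = 1e9;   /* 1 BIPS integer core */
    const double bytes_per_core = 1e9;   /* 10^9 bytes of RAM per core */
    const double human_fraction = 1e-6;  /* one normal core, per the model */

    double cores     = 1.0 / human_fraction;           /* 10^6 cores */
    double total_ips = cores * bips_per_core;          /* 10^15 instructions/s */
    double total_ram = cores * bytes_per_core / 1e15;  /* ~1 petabyte */

    printf("cores needed:  %.0e\n", cores);
    printf("aggregate IPS: %.0e\n", total_ips);
    printf("aggregate RAM: %.1f PB\n", total_ram);
    return 0;
}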

In short, the CPU is almost where we need it to be now, but the memory
is still way behind the curve.  By the time memory catches up so that we
can have human equivalent intelligence, we'll have enough extra CPU that
we'll have human level intelligence that runs much faster than a normal
human.  Which is to say that, if balanced for existing architectures, the
result will land more on the fast-but-stupid side of the curve than on the
slow-but-smart side.

Cheers,


-James Rogers
 [EMAIL PROTECTED]





Re: [agi] Diminished impact of Moore's Law on AGI due to other bottlenecks

2003-01-04 Thread James Rogers
On 1/4/03 3:02 PM, Shane Legg [EMAIL PROTECTED] wrote:
 
 I had similar thoughts, but when I did some tests on the webmind code
 a few years back I was a little surprised to find that floating point
 was about as fast as integer math for our application.  This seemed to
 happen because where you could do some calculation quite directly with
 a few floating point operations, you would need more to achieve the same
 result with integer math due to extra normalisation operations etc.


I can see this in some cases, but for us the number of instructions is
literally the same; the data fields in question could swap out floats for
ints (with a simple typedef change) with no consequences.  We do have a
normalization function, but since that effectively prunes things we'd use it
whether it was floating point or integer, and it is only very rarely
triggered anyway.
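
Roughly the kind of thing I mean by a typedef swap (a contrived sketch; the
weight_t name and the accumulate function are made up for illustration, not
our actual kernel code):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Flip this single line to move the whole code path between integer and
   floating point arithmetic; nothing else has to change. */
typedef int32_t weight_t;      /* or: typedef float weight_t; */

static weight_t accumulate(const weight_t *w, size_t n) {
    weight_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += w[i];
    return sum;
}

int main(void) {
    weight_t w[] = {3, 1, 4, 1, 5};
    /* Cast to double so the same printf works for either typedef. */
    printf("%g\n", (double)accumulate(w, sizeof w / sizeof w[0]));
    return 0;
}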

I guess the key point is that we aren't really faking floating point with
integers.  It is a case of floating point bringing nothing to the table
while offering somewhat inferior performance under certain conditions.  The
nice thing about integers is that performance is portable.  I certainly
wouldn't shy away from using floating point if it made sense.  It is just a
mild curiosity that when all is said and done, nothing in the core engine
requires floating point computation.

 
 I was also surprised to discover that the CPU did double precision
 floating point math at about the same speed as single precision floating
 point math.  I guess it's because a lot of floating point operations are
 internally highly parallel and so extra precision doesn't make much speed
 difference?


I believe this is because current FP pipelines are generally double precision
all the way through.  If you run single precision code, it occupies just as
much of the execution pipeline as double precision code does.

The exception is the SIMD floating point engines (aka multimedia
extensions) that a lot of processors support today.  But I normally just
write all floating point for standard double precision execution these days.

 
 Anyway, the thing that really did affect performance was the data size
 of the numbers being used (whether short, int, long, float, double etc.)
 Because we had quite a few RAM cache misses, using a smaller data type
 effectively meant that we could have twice as many values in cache at
 the same time and each cache miss would bring twice as many new values
 into the cache.  So it was really the memory bandwidth required by the
 size of the data types we were using that slowed things down, not the
 time the CPU took to do a calculation on a double precision floating
 point number compared to, say, a simple int.


A good point, and one that applies to using LP64 types as well.  The
entirety of our code fits in cache, but data fetches are unavoidably
expensive.
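
The arithmetic behind that is simple enough to show directly (a sketch of
mine; the 64-byte cache line is an assumption about typical hardware, and
actual line sizes vary by processor):

#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE_BYTES 64   /* assumed; varies by CPU */

int main(void) {
    /* Halving the element size doubles how many values each cache miss
       brings in, which is exactly the effect Shane describes. */
    printf("int32_t per line: %zu\n", CACHE_LINE_BYTES / sizeof(int32_t)); /* 16 */
    printf("double  per line: %zu\n", CACHE_LINE_BYTES / sizeof(double));  /*  8 */
    return 0;
}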


 I'd always had a bias against using floating point numbers ever
 since I used to write code 15 years ago when the CPU's I used
 weren't designed for it and it really slowed things down badly.
 It's a bit different now, however, with really fast floating point
 cores in CPUs.


One consideration that HAS gone into maintaining a pure integer code base is
that it can run with extreme efficiency as currently designed on simple
integer MasPar.  When used like this, the opportunity exists for scalability
that is far beyond what we could get if we required a floating point
pipeline.  The idea of having scads of simple integer cores connected to a
small amount of fast memory and low latency messaging interconnects is
appealing and our code is very well suited for this type of architecture.
Fortunately there seem to be companies starting to produce these types of
chips.

Ultimately, we'd like to move the code to something like this, and since
there is no design or performance cost to only using integers on standard
PCs (they work better anyway in our case), we won't introduce floating
point into the kernel without a good reason.  So far we haven't actually
come across a need for floating point computation in the kernel, so we've
never had to deal with this issue.

Cheers,

-James Rogers
 [EMAIL PROTECTED]





Re: [agi] A point of philosophy, rather than engineering

2002-11-11 Thread James Rogers
On Mon, 2002-11-11 at 14:11, Charles Hixson wrote:
 
 Personally, I believe that the most effective AI will have a core 
 general intelligence, that may be rather primitive, and a huge number of 
 specialized intelligence modules.  The tricky part of this architecture 
 is designing the various modules so that they can communicate.  It isn't 
 clear that this is always reasonable (consider the interfaces between 
 chess and cooking), but if the problem can be handled in a general 
 manner (there's that word again!), then one of the intelligences could 
 be specialized for message passing.  In this model the core general 
 intelligence will be for use when none of the heuristics fit the 
 problem.  And its attempts will be watched by another module whose 
 specialty is generating new heuristics.


This is essentially what we do, but it works a little differently than
you are suggesting.  The machinery and representation underneath the
modules are identical; each module is its own machine that has become
optimized for its task.  In other words, if you were making a
module on chess and a module on cooking you would start with the same
blank module machinery and they would be trained for their respective
tasks.

If you looked at the internals of the module machinery after the
training period, you would notice marked macro-level structural
differences between the two that relate to how the machinery
self-optimizes for its task. The computational machines, which are
really just generic Turing virtual machines that you could program any
type of software on, use a pretty foreign notion of
computation/processing -- the processor model looks nothing like a von
Neumann-variant architecture.  Despite notable differences in structure,
it is really just two modules of the same machine that have
automatically conformed structurally to their data environment.

The interesting part is the integration of the modules.  There are
actually a number of ways to do it, all of which have advantages and
disadvantages.  One advantage of having simple underlying machinery
controlling the representation of data is that all modules already
deeply understand the data of any other module.  You COULD do a hard
merge of the cooking module with the chess module into one module, and
automatically discover the relations and abstract similarities between
the two (whatever those might be) without any special code, but there
are lots of reasons why this is bad in practice.  In implementation, we
typically do what we would call a soft merge, where the machines are
fully integrated for most purposes and can use each other's space, but
where external data feeds are localized to specific modules within the
cluster (even though these modules have access to every other module for
the purposes of processing the data feed).  From the perspective of
external data streams it looks like a bunch of independent machines
working together, but from the perspective of the machine the entire
cluster is a single machine image.  There are good theoretical reasons
for doing things this way which I won't go into here.
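
To give a flavor of the distinction (a deliberately loose structural sketch
of mine; the real machinery is nothing like this simple, and every name here
is invented for illustration): in a soft merge the modules share one
underlying space but keep their own external feeds.

#include <stdio.h>

#define STORE_SIZE 1024

typedef struct {
    int data[STORE_SIZE];   /* representation shared by every module */
} SharedStore;

typedef struct {
    const char  *name;
    SharedStore *store;            /* every module sees the whole cluster */
    int        (*read_feed)(void); /* but external input stays local */
} Module;

static int chess_feed(void)   { return 1; }  /* stand-ins for real data feeds */
static int cooking_feed(void) { return 2; }

int main(void) {
    SharedStore store = {0};
    Module chess   = { "chess",   &store, chess_feed };
    Module cooking = { "cooking", &store, cooking_feed };

    /* Each module writes what its own feed delivers into the shared space,
       so either module can use the other's data without any translation. */
    chess.store->data[0]   = chess.read_feed();
    cooking.store->data[1] = cooking.read_feed();
    printf("%s sees cooking's datum: %d\n", chess.name, chess.store->data[1]);
    return 0;
}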

In short, we mostly do what you are talking about, but you've actually
over-estimated the difficulty of integration of domain-specific modules
(using our architecture, at least).  Actually building modules is more
difficult, mostly because the computing architecture uses assumptions
that are very strange; I think my programmer's mind works against me
some days, and teaching/training modules by example is easier than
programming them directly most times.  Once they are done, you pretty
much can do plug-n-play on-the-fly integration, even on a hot/active
cluster of modules (resources permitting, of course).  An analogy would
be how they learned new skills on-the-fly in The Matrix.  The
integration is a freebie that comes with the underlying architecture,
not something that I spent much effort designing.

Cheers,

-James Rogers
 [EMAIL PROTECTED]  
