From one perspective, the objective program is an approximation to the
argmax function, that is, it takes as input an evaluation function and
tries to output the data structure (which in some cases will itself be
a program) with the highest evaluation. (From this perspective AGI is
hard because argmax is NP-hard, which is why we can only approximate
it.) Such a program would not by itself have human level intelligence,
e.g. it would not pass the Turing test, but it could be useful in a
wide range of technical domains: input a propositional formula, get
(in some cases) a set of variable assignments that satisfies the
formula; input a program and a set of security criteria, get (in some
cases) a list of security flaws; input a set of labelled photographs,
get a program that can classify photographs (hopefully including ones
that weren't in the training set).
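For concreteness, here is a minimal sketch of that framing. All the names are mine, and the budgeted enumeration is just a stand-in for whatever search procedure the real system would use; the point is only the shape of the interface (evaluation function in, best-found data structure out):

```python
import itertools

def approx_argmax(evaluate, candidates, budget=1000):
    # Examine at most `budget` candidates and keep the best seen so far;
    # exact argmax over the whole space is exactly what we can't afford.
    best, best_score = None, float("-inf")
    for cand in itertools.islice(candidates, budget):
        score = evaluate(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

# Toy version of the SAT example: clauses over variables 0 and 1,
# (x0 or x1) and (not x0 or x1); the evaluator counts satisfied clauses.
clauses = [[(0, True), (1, True)], [(0, False), (1, True)]]
def count_satisfied(assignment):
    return sum(any(assignment[v] == val for v, val in c) for c in clauses)

assignments = ([a, b] for a in (False, True) for b in (False, True))
best, score = approx_argmax(count_satisfied, assignments)  # score == 2
```

The other examples (security flaws, photo classifiers) fit the same signature; only the candidate space and the evaluation function change.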

This is a useful way to look at it because argmax applies at many
levels of decomposition. For example, the search for security flaws in
existing software will involve reasoning about the code, which will
require proving theorems, itself a problem of search, i.e. argmax. So
when the system has a theorem to prove, it recursively calls another
copy of the AGI program (at least conceptually; the obvious
implementation of this would have overhead, much of which could
hopefully be removed in an optimized implementation); within that
proof search, the same process of recursion could be used for
intermediate lemmas (cf. the DPLL algorithm for propositional
satisfiability). In the opposite direction, when the system needs to
improve its performance at a task such as theorem proving, the search
for better heuristics for this purpose is itself an argmax problem to
which the AGI can be applied, as is the search for better
meta-heuristics for that search.
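To make the recursive structure concrete, here is a toy DPLL in the spirit of the cf. above: the top-level problem (satisfy all the clauses) spawns simplified subproblems through recursive calls to the same procedure, which is the shape of recursion the text imagines for the AGI program as a whole. The encoding is mine:

```python
def dpll(clauses, assignment=None):
    # Each call solves "satisfy these clauses given this partial assignment";
    # branching spawns the same problem as two recursive subgoals.
    if assignment is None:
        assignment = {}
    simplified = []
    for clause in clauses:
        if any(assignment.get(v) == val for v, val in clause):
            continue                      # clause already satisfied
        rest = [(v, val) for v, val in clause if v not in assignment]
        if not rest:
            return None                   # clause falsified: branch fails
        simplified.append(rest)
    if not simplified:
        return assignment                 # everything satisfied
    var = simplified[0][0][0]             # branch on first unassigned variable
    for choice in (True, False):
        result = dpll(clauses, {**assignment, var: choice})
        if result is not None:
            return result
    return None

# (x or y) and (not x) forces x=False, y=True.
model = dpll([[("x", True), ("y", True)], [("x", False)]])
```

In the proposed architecture the recursive call would (conceptually) be to another copy of the whole AGI program rather than to this fixed procedure.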

Obviously we are going to want to apply some kind of memoization such
that when the same problem is encountered repeatedly, the answer need
not be recomputed every time. Should this be a hardwired framework
feature or a consequence of the learning algorithm? The answer to this
is complicated by the question of what constitutes the same problem.
Certainly the system needs to see through differences in variable
names. It might also need to see through semantic equivalences such as
A+B vs B+A and A+0 vs A.
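One way to picture the "same problem" question is a memo cache keyed on a canonical form. This sketch handles only the three equivalences just mentioned (and imperfectly: sorting commutative arguments before renaming means the key can still depend a little on the original names); deciding which further equivalences belong here is exactly the open question:

```python
def normalize(e):
    # Structural equivalences: sort the operands of + and drop "+ 0".
    if isinstance(e, tuple) and e[0] == "+":
        a, b = normalize(e[1]), normalize(e[2])
        if a == 0:
            return b
        if b == 0:
            return a
        return ("+",) + tuple(sorted((a, b), key=repr))
    return e

def rename(e, names):
    # Canonical variable names in order of first appearance,
    # so the key sees through differences in variable names.
    if isinstance(e, str):
        return names.setdefault(e, f"v{len(names)}")
    if isinstance(e, tuple):
        return (e[0],) + tuple(rename(x, names) for x in e[1:])
    return e

def memo_key(expr):
    return rename(normalize(expr), {})

memo = {}
def solve(expr, compute):
    # Recompute only when the canonicalised problem is genuinely new.
    key = memo_key(expr)
    if key not in memo:
        memo[key] = compute(expr)
    return memo[key]
```

Under this key, A+B and Y+X hash to the same problem, as do A+0 and a bare variable. A hardwired framework could only ever bake in a fixed list of such equivalences, which is one argument for letting the learning algorithm discover them instead.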

The body of the program should be written in a purely functional
language. One reason is that this encourages the search procedure to
develop reusable building blocks. Another reason, more specific to the
proposed architecture, relates to the recursive search structure. At
each argmax node, as we find better approximations, i.e. better
answers, these can be propagated upwards without having to rerun the
entire global computation from scratch.
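A toy illustration of that propagation, assuming each node combines its children's current best answers with a pure function (the class and method names are invented for this sketch): because the combine step has no side effects, an improvement arriving from one child can be folded in using the cached answers of the others.

```python
class ArgmaxNode:
    # Each node caches its best answer so far; `combine` is a pure
    # function of the children's answers, so an improvement below can
    # be absorbed without re-running the rest of the computation.
    def __init__(self, combine, children=()):
        self.combine = combine
        self.children = list(children)
        self.parents = []
        self.best = None
        for c in self.children:
            c.parents.append(self)

    def improve(self, answer):
        if self.best is None or answer > self.best:
            self.best = answer
            for p in self.parents:
                p.recombine()

    def recombine(self):
        # Uses only cached child answers; no child search is rerun.
        if all(c.best is not None for c in self.children):
            self.improve(self.combine([c.best for c in self.children]))

leaf_a, leaf_b = ArgmaxNode(None), ArgmaxNode(None)
root = ArgmaxNode(sum, [leaf_a, leaf_b])
leaf_a.improve(1)
leaf_b.improve(2)
leaf_a.improve(5)   # only the root's combine reruns, not leaf_b's search
```

With side effects in the node bodies this recombination would not be sound, which is the architectural argument for purity.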

That leads on to the next question: how do we allocate CPU time? The
recursive search structure offers a starting point: each argmax node
can be a process, continually searching for better answers to send to
its callers. Of course there will be too many of these to make each
one a process at the operating system level let alone give each its
own CPU core, so now we need to decide how to allocate a small number
of CPU cores to many candidate processes. In the worst case this could
be regarded as a problem requiring general computation in its own
right, but our list of open problems is tediously long as it stands;
is there anything more definite we can say about this one?

Perhaps there is. There are general arguments that the expected
utility of producing some result should be (something isomorphic to) a
real number. Further, we can argue that if node A calls B as a
subgoal, the utility of B can't exceed that of A. (If a node has
multiple callers, its utility is bounded by the sum of caller
utilities.)

By itself the above reasoning doesn't buy very much; we could still
have a scenario where A with a utility of 1 calls B and C giving them
both a utility of 1 (maybe both are essential) and so on all the way
down. But once we start thinking about costs we can argue that if the
amount of resources it's worth spending on A is 1, that must be the
amount it's worth spending on A and all its subgoals; so we might end
up with A reserving 0.2 units of resources for its own internal
computation and giving 0.4 units each to B and C; then we can spend
CPU time on each node in proportion to its resource allocation. Now we
have something like a usable specification (cf. Holland's bucket
brigade algorithm).
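The arithmetic of that example can be written down directly. This sketch assumes each node declares the fraction of its budget it keeps for internal computation and a weight for each subgoal; both are hypothetical knobs that the real system would have to set itself:

```python
def allocate(node, budget):
    # A node keeps `own` of its budget for internal computation and splits
    # the rest among its subgoals in proportion to their weights; CPU time
    # is then spent on each node in proportion to its share.
    shares = {node["name"]: budget * node["own"]}
    children = node.get("children", [])
    remaining = budget * (1 - node["own"])
    total = sum(c["weight"] for c in children)
    for c in children:
        shares.update(allocate(c, remaining * c["weight"] / total))
    return shares

# The example from the text: A keeps 0.2 of its 1 unit and passes
# 0.4 each to the equally weighted subgoals B and C.
tree = {"name": "A", "own": 0.2, "children": [
    {"name": "B", "own": 1.0, "weight": 1},
    {"name": "C", "own": 1.0, "weight": 1},
]}
shares = allocate(tree, 1.0)
```

Note the invariant this enforces: the shares always sum to the root's budget, which is the "A and all its subgoals" constraint from the paragraph above.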

Memory is trickier than CPU time; more on this later.

Comments so far?


-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now