From one perspective, the objective program is an approximation to the argmax function: it takes as input an evaluation function and tries to output the data structure (which in some cases will itself be a program) with the highest evaluation. (From this perspective AGI is hard because argmax is NP-hard, which is why we can only approximate it.) Such a program would not by itself have human-level intelligence, e.g. it would not pass the Turing test, but it could be useful in a wide range of technical domains: input a propositional formula, get (in some cases) a set of variable assignments that satisfies the formula; input a program and a set of security criteria, get (in some cases) a list of security flaws; input a set of labelled photographs, get a program that can classify photographs (hopefully including ones that weren't in the training set).
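To make this view concrete, here is a minimal sketch in Haskell (chosen because a purely functional language is proposed below; the function name and the use of a plain list of candidates are illustrative assumptions, not a design commitment). It treats the objective program as an anytime approximation to argmax: given an evaluation function and a stream of candidate structures, it emits every candidate that beats the best score seen so far, so a caller can cut it off at any time budget and keep the best answer found.

    -- Anytime argmax approximation (sketch, names hypothetical): given an
    -- evaluation function and a stream of candidates, yield each candidate
    -- that improves on the best score seen so far.
    improvingAnswers :: Ord score => (a -> score) -> [a] -> [a]
    improvingAnswers eval = go Nothing
      where
        go _    []     = []
        go best (x:xs)
          | maybe True (< s) best = x : go (Just s) xs
          | otherwise             = go best xs
          where
            s = eval x

For example, improvingAnswers negate [5,3,7,2] yields [5,3,2]: a stream of successively better (here, smaller) answers, usable however far the caller's time budget allows it to run.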
This is a useful way to look at it because argmax applies at many levels of decomposition. For example, the search for security flaws in existing software will involve reasoning about the code, which will require proving theorems, itself a problem of search, i.e. argmax. So when the system has a theorem to prove, it recursively calls another copy of the AGI program (at least conceptually; the obvious implementation of this would have overhead, much of which could hopefully be removed in an optimized implementation). Within that proof search, the same recursion can be used for intermediate lemmas (cf. the DPLL algorithm for propositional satisfiability, sketched below). In the opposite direction, when the system needs to improve its performance at a task such as theorem proving, the search for better heuristics for that task is itself an argmax problem to which the AGI can be applied, as is the search for better meta-heuristics for that search.

Obviously we are going to want some kind of memoization, so that when the same problem is encountered repeatedly, the answer need not be recomputed every time. Should this be a hardwired framework feature or a consequence of the learning algorithm? The answer is complicated by the question of what constitutes the same problem. Certainly the system needs to see through differences in variable names. It might also need to see through other non-differences such as A+B vs B+A and A+0 vs A; one possible approach is sketched below.

The body of the program should be written in a purely functional language. One reason is that this encourages the search procedure to develop reusable building blocks. Another reason, more specific to the proposed architecture, relates to the recursive search structure: at each argmax node, as we calculate better approximations - find better answers - these can be propagated upwards without having to rerun the entire global computation from scratch.

That leads on to the next question: how do we allocate CPU time? The recursive search structure offers a starting point: each argmax node can be a process, continually searching for better answers to send to its callers. Of course there will be far too many of these to make each one a process at the operating system level, let alone give each its own CPU core, so we need to decide how to allocate a small number of CPU cores among many candidate processes. In the worst case this could be regarded as a problem requiring general computation in its own right, but our list of open problems is tediously long as it stands; is there anything more definite we can say about this one?

Perhaps there is. There are general arguments that the expected utility of producing some result should be (something isomorphic to) a real number. Further, we can argue that if node A calls B as a subgoal, the utility of B can't exceed that of A. (If a node has multiple callers, its utility is bounded by the sum of the callers' utilities.) By itself this reasoning doesn't buy very much; we could still have a scenario where A, with a utility of 1, calls B and C and gives them both a utility of 1 (maybe both are essential), and so on all the way down. But once we start thinking about costs, we can argue that if the amount of resources it's worth spending on A is 1, that must be the amount it's worth spending on A and all its subgoals combined; so we might end up with A reserving 0.2 units of resources for its own internal computation and giving 0.4 units each to B and C. We can then spend CPU time on each node in proportion to its resource allocation; a toy version of this rule is sketched below.
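Since DPLL is the cited model for the recursive structure, a bare-bones version makes the shape explicit (again a sketch: no unit propagation, no pure-literal rule, naive branching). The point is that each branch is a recursive call of the same search procedure on a simplified subproblem, exactly the pattern described above.

    import Control.Applicative ((<|>))

    type Lit     = Int        -- positive = a variable, negative = its negation
    type Clause  = [Lit]
    type Formula = [Clause]   -- conjunction of disjunctions (CNF)

    -- Set a literal true: satisfied clauses disappear, and the opposite
    -- literal is removed from the clauses that remain.
    assign :: Lit -> Formula -> Formula
    assign l = map (filter (/= negate l)) . filter (l `notElem`)

    -- Bare-bones DPLL: each branch recursively invokes the same search.
    dpll :: Formula -> Maybe [Lit]
    dpll [] = Just []                    -- no clauses left: satisfiable
    dpll f
      | [] `elem` f = Nothing            -- an empty clause: contradiction
      | otherwise   = branch l <|> branch (negate l)
      where
        l        = head (head f)
        branch x = (x :) <$> dpll (assign x f)

For example, dpll [[1,2],[-1],[-2,3]] returns Just [-1,2,3].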
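On what constitutes the same problem: one possible approach, sketched here with a deliberately tiny hypothetical expression type, is to memoize on a canonical form. The rewrite rules below handle exactly the non-differences mentioned above (variable names, A+B vs B+A, A+0 vs A); full alpha-invariance for a richer language would need considerably more care.

    import qualified Data.Map as Map
    import Data.List (sort)

    data Expr = Var Int | Lit Int | Add Expr Expr
      deriving (Eq, Ord, Show)

    -- Flatten nested sums and drop zero terms, so A+0 collapses to A.
    summands :: Expr -> [Expr]
    summands (Add a b) = summands a ++ summands b
    summands (Lit 0)   = []
    summands e         = [e]

    -- Sort the summands, so A+B and B+A rebuild to the same expression.
    canonical :: Expr -> Expr
    canonical e = case sort (summands e) of
      []    -> Lit 0
      terms -> foldr1 Add terms

    -- Relabel variables in order of first appearance, so the key ignores
    -- the particular names chosen. (A simplification: combined with the
    -- sort above, this is not fully canonical for arbitrary expressions.)
    rename :: Expr -> Expr
    rename e0 = fst (go e0 Map.empty)
      where
        go (Var v)   m = case Map.lookup v m of
          Just i  -> (Var i, m)
          Nothing -> let i = Map.size m in (Var i, Map.insert v i m)
        go (Lit n)   m = (Lit n, m)
        go (Add a b) m = let (a', m1) = go a m
                             (b', m2) = go b m1
                         in  (Add a' b', m2)

    -- The memo key: canonicalize, then rename.
    memoKey :: Expr -> Expr
    memoKey = rename . canonical

Here memoKey (Add (Var 9) (Add (Var 4) (Lit 0))) and memoKey (Add (Var 1) (Var 2)) both give Add (Var 0) (Var 1), so both problems would hit the same memo entry.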
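The budget-splitting rule can also be made concrete. In this hypothetical sketch, each goal node delegates stated fractions of its budget to its subgoals and keeps the remainder for its own internal computation; a scheduler would then grant CPU time in proportion to each node's share.

    -- A goal node: a label plus (fraction of budget, subgoal) pairs. The
    -- children's fractions should sum to at most 1; whatever is not
    -- delegated is the node's own compute share.
    data Goal = Goal String [(Double, Goal)]

    -- (label, total budget, own compute share) for every node. A node's
    -- total bounds the spending on it and all of its subgoals combined.
    budgets :: Double -> Goal -> [(String, Double, Double)]
    budgets b (Goal l cs) =
      (l, b, b * (1 - sum (map fst cs)))
        : concatMap (\(f, g) -> budgets (b * f) g) cs

    -- The scenario from the text: budgets 1.0 example gives (up to
    -- floating point) [("A",1.0,0.2), ("B",0.4,0.4), ("C",0.4,0.4)].
    example :: Goal
    example = Goal "A" [(0.4, Goal "B" []), (0.4, Goal "C" [])]

Running budgets 1.0 example reproduces the numbers above: A keeps 0.2 units for itself while B and C get 0.4 each, and CPU time follows those shares.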
Now we have something like a usable specification (cf. e.g. Holland's bucket brigade algorithm). Memory is trickier than CPU time; more on this later. Comments so far?
