Marcus Hutter's video is very well done. Now I see what he means. ~PM
=====================

Marcus Hutter: "What is intelligence? AIXI and induction" [18:56]
http://www.youtube.com/watch?v=F2bQ5TSB-cE

Real-world intelligence is resource-bounded. But it's hard to define. So we take another road:

Phase 1: Define the problem ("intelligence") first, starting with the unbounded version (non-computable). Once we're sure to have this solved:

Phase 2: Try to approximate it and make a computational theory out of it.

(Phase 3: Now you can still try to create a theory of resource-bounded intelligence if you want.)

(Like with universal Turing machines: unbounded space and time resources.)

AIXI is theoretical computer science applied to theoretical general intelligence. It gives us a model of the capabilities and limitations of intelligent agents in relation to their environments.

Hutter: "Or the short answer may be I am not smart enough to come up with a resource-bounded theory of intelligence, therefore I only developed one without resource constraints."

Hutter: "(...) informal definition that intelligence is an agent's ability to succeed or achieve goals in a wide range of environments."

Hutter: "Or Universal AI is the general field theory and AIXI is the particular agent which acts optimally in this sense."

Planning component, learning component.
The AIXI agent starts blank (no data/knowledge). It acquires data/knowledge of the world and builds its own model from those data.
How to learn a model from data -> roots: Kolmogorov complexity, algorithmic information theory.

* Look for the simplest model that describes your data sufficiently well. (learning part)
* Take this knowledge and think about the best possible outcomes of all possible actions, where "best" is evaluated according to a utility function (value function) -> rewards. (prediction part)
* Maximize the reward over its lifetime.
(Planning part)

AIXI: it's a mathematical theory of intelligence; one can prove properties (and one can prove that it's the most intelligent system possible).
Downside: it's incomputable (needs infinite computational resources). There's the need to approximate it. One of those approximations: Pac-Man:

Pacman via AIXI Approximation [5:42]
http://www.youtube.com/watch?v=RhQTWidQQ8U

Playing Pacman using AIXI Approximation [1:52]
http://www.youtube.com/watch?v=yfsMHtmGDKE

(It starts blank, then gains knowledge by interacting with its environment. A value function is given beforehand to compute positive and negative rewards.)

What's so cool about it is that it's not tailored to any particular application (like only playing chess or Go): interface it with any problem, and it could (theoretically) learn to solve this problem optimally. There's no built-in Pac-Man knowledge, only the value function. Given feedback, it learns everything else by itself.

In approximations:
For the learning part: standard compressors / data compressors.
For the planning part: standard Monte Carlo random search.

Monte Carlo algorithms: to search through enormous trees. If one could search exhaustively through those huge trees, one would arrive at an optimal solution, but in reality that's computationally infeasible; Monte Carlo methods give approximations/heuristics (stochastic search).
Here: Upper Confidence bounds for Trees (UCT, a Monte Carlo tree search algorithm) -> a very balanced way of trading off exploration and exploitation: you search where you think things are good, or where you have very little knowledge and maybe there's a gold nugget. Fundamental problem: stay where you believe things are good, or explore.
(Nice to have: only one parameter to control, whereas other methods like neural networks sometimes have several thousand.)

Essential part of the AI problem: get induction right -> derive models from data.
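The exploration/exploitation rule at the heart of UCT can be sketched in its simplest (one-level, bandit) special case. This is an illustrative toy, not the talk's Pac-Man code; the function names and arm probabilities are invented. The single constant c is the one tunable parameter the notes mention.

```python
import math
import random

def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB score: average reward so far (exploitation) plus a bonus that
    grows for rarely-tried options (exploration). c trades the two off."""
    if visits == 0:
        return float("inf")  # always try an untried option first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)

def uct_bandit(arm_means, pulls=2000, c=math.sqrt(2), rng=None):
    """One-level UCT: repeatedly pick the arm with the highest UCB score,
    sample a stochastic reward, and update that arm's statistics."""
    rng = rng or random.Random(0)        # fixed seed for reproducibility
    n = [0] * len(arm_means)             # visit counts per arm
    w = [0.0] * len(arm_means)           # summed rewards per arm
    for t in range(1, pulls + 1):
        i = max(range(len(arm_means)), key=lambda a: ucb1(w[a], n[a], t, c))
        reward = 1.0 if rng.random() < arm_means[i] else 0.0  # Bernoulli arm
        n[i] += 1
        w[i] += reward
    return n

counts = uct_bandit([0.2, 0.5, 0.8])
print(counts)  # the 0.8 arm is pulled most; the others still get occasional visits
```

In a full UCT search the same score is applied recursively at every tree node, with random playouts supplying the reward estimates.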
Use Occam's razor (take the simplest theory consistent with your data), which has been formalized and quantified -> Kolmogorov complexity (a quantification of what complexity or simplicity means).
-> Universal theory of induction/prediction: take the past data stream and ask "what comes next?"
A universal predictor that works in any kind of situation. (It's incomputable, but beautiful; later you approximate it.)

Bayesian reasoning is built into AIXI.

Sequential decision theory.

=====================

Tim Tyler: On AIXI
http://www.youtube.com/watch?v=xDMN4zi7wb4

Problem 1: Has no representation of self. It's not embedded in its environment. But that's not a serious flaw.

Problem 2: The wirehead problem -> hacking its own reward feedback, which will endanger its long-term survival.

Problem 3: The world is parallel, but the AIXI agent is a serial agent modeled by a Turing machine. While parallelism can be modeled sequentially, the reward model is also serial and thus unsuitable for a parallel world. (Not a serious problem.)

Problem 4: Solomonoff induction is a formalized version of Occam's razor using Kolmogorov complexity (not a serious problem) -> it's not language-independent, and it's not known whether there exists an optimal description of Occam's razor.

Ben Goertzel: AIXI shows that AGI is a problem of resource restrictions; if there were no space and time constraints, it'd be a trivial program.

From the video comments: AIXI has no access to its own reasoning. That's true of reinforcement learning that treats the brain as a black box; thus it can't explain its own reasoning.

=====================

Marcus Hutter - AI, the Scientific Method & Philosophy
http://www.youtube.com/watch?v=slTuDZIJqkQ

Science is very much about induction: get data, derive models. -> Solomonoff induction.
It's also about decision making and planning (that's the active part).
-> AIXI.

You can always ask "why", but to prevent an infinite regress you have to stop somewhere, declare some things to be the axioms, and ask about their consequences. When they are useful, you can stop questioning (you could go on, but for practical reasons you stop somewhere and proceed with what you have).
When you repeat this process (ask "why, why, why, ...") often enough, you arrive at the Occam's razor principle. It seems to be necessary and sufficient for science: that's what defines science, and Occam's razor is about the scientific method. There might be better principles than Occam's razor, but currently it's the best we have. Just use it until someone finds something better.

Issues of free will:
A closed system can be predicted from outside.
An open system (here: one whose predictions are fed back into it): put yourself into the closed system and everything's fine again.

=====================

A computational approximation to the AIXI model (AGI 2008)
http://www.youtube.com/watch?v=SpgXXfRqNAk

AIXI: control theory (expectation maximization) + universal induction (Solomonoff induction) -> optimal behavior.

Problem: Find a computationally efficient (if not optimal) approximation to the optimal but incomputable AIXI theory.

Universal induction solves the problem of choosing a prior to achieve optimal inductive inference.

=====================

Marcus Hutter: Foundations of Intelligent Agents
http://www.youtube.com/watch?v=x8btbKaRfoc

Informal working definition: Intelligence measures an agent's ability to perform well in a wide range of environments.
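This informal working definition has a formal counterpart, Legg and Hutter's universal intelligence measure (not spelled out in the talks; added here for reference): the intelligence of a policy \pi is its complexity-weighted expected reward across all computable environments.

```latex
\Upsilon(\pi) \;:=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi
```

Here E is the class of computable environments, K(\mu) is the Kolmogorov complexity of environment \mu, and V_\mu^\pi is the expected total reward policy \pi earns in \mu. Simple environments (low K) dominate the sum, which is Ockham's razor again.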
Design of Artificial Intelligent Systems from first principles:

* Logic/language based: expert/reasoning/proving/cognitive systems
* Economics inspired: utility, sequential decisions, game theory
* Cybernetics: adaptive dynamic control
* Machine learning: reinforcement learning
* Information processing: data compression -> intelligence

Separately these are too limited for AGI, but jointly very powerful.

Foundations of Universal Artificial Intelligence:

* Philosophy: Ockham, Epicurus, induction
* Mathematics: information, complexity, Bayesian & algorithmic probability, Solomonoff induction, sequential decisions
* Frameworks: rational agents (in known and unknown environments)
* Computation: universal search and feature reinforcement learning

Science is about induction (Ockham's razor): take the simplest hypothesis consistent with the data.
Induction: go from one observation to the next:
1. Construct the set of possible nexts.
2. Choose one next.
It is the most important principle in science and machine learning.

Problem: Quantification of simplicity/complexity (because a machine has to apply Ockham's razor).
-> By Turing's thesis, everything computable by a human using a fixed procedure can also be computed by a (universal) Turing machine U.
-> Measure of complexity: Kolmogorov complexity, algorithmic information theory => the Kolmogorov complexity of a string is the length of the shortest program on U describing the string:

K(s) := min_p {Length(p) : U(p) = s}   // among all programs p that make U output s, take the shortest

-> Bayesian probability theory: update the prior degree of belief in hypothesis H, given new observations D, to the posterior belief in H:

Pr(H|D) \propto Pr(D|H) Pr(H)

Algorithmic information theory: how to initialize beliefs.
Bayes: how to update beliefs.

-> Algorithmic probability
Epicurus: if more than one theory (= hypothesis = model) is consistent with the observations, keep them all.
Refinement with Ockham: give simpler theories higher a-priori weight.
Quantitatively: Pr(H) := 2^{-K(H)}
i.e., keep them all, but weight them.

=> Universal induction (Solomonoff): combines Ockham, Epicurus, Bayes, and Turing into one formal theory of sequential prediction.

Universal a-priori probability: M(x) := the probability that U fed with random noise outputs x, i.e., the probability that randomness produces x.
M(x_{t+1} | x_1, ..., x_t) best predicts x_{t+1} from x_1, ..., x_t.

=> Sequential decision theory (optimal control theory):

For t = 1, 2, ..., given the sequence x_1, x_2, ..., x_{t-1}:
1) make decision y_t
2) observe x_t
3) suffer Loss(x_t, y_t)
4) t -> t+1, goto 1)
Goal: minimize the expected loss.
Problem: the true probability is unknown.
Solution: use Solomonoff's M(x).

=> Agent model (extremely general): the agent interacts with its environment in cycles t, t+1, ... and receives positive/negative reinforcement feedback.

AIXI = AI + the Greek letter Xi:
* universally optimal rational agent
* ultimate super intelligence
* computationally intractable
* could serve as a gold standard for AGI

=> Towards practical universal AI (efficient general-purpose intelligent agents).
Additional ingredients:
* universal search (Schmidhuber)
* learning: mostly reinforcement learning
* information: minimum description length principle
* complexity/similarity
* optimization, esp. Monte Carlo

Feature reinforcement learning: reduce a real-world problem to a (tractable) Markov decision process by learning the relevant features.
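The Ockham-weighted Bayesian mixture behind Solomonoff's M(x) can be sketched with a tiny finite model class standing in for the class of all programs. Everything here is an invented toy (the four models and their "complexities" play the role of hypotheses H and K(H)); the point is the mechanics: prior Pr(H) = 2^{-K(H)}, posterior re-weighting by likelihood, prediction by the weighted mixture.

```python
# Toy model class: each model gives Pr(next bit = 1 | history).
# The integer is its "complexity", standing in for K(H).
MODELS = [
    ("always-0",  lambda hist: 0.1, 1),  # bits are almost surely 0
    ("always-1",  lambda hist: 0.9, 1),  # bits are almost surely 1
    ("alternate", lambda hist: 0.9 if (hist and hist[-1] == 0) else 0.1, 2),
    ("uniform",   lambda hist: 0.5, 1),  # fair coin
]

def mixture_predict(history, weights):
    """Mixture probability of next bit = 1: sum over H of w(H) * Pr(1 | H)."""
    return sum(w * p1(history) for (_, p1, _), w in zip(MODELS, weights))

def bayes_update(history, bit, weights):
    """Posterior re-weighting: w(H) <- w(H) * Pr(observed bit | H), normalized."""
    probs = [p1(history) if bit == 1 else 1 - p1(history) for _, p1, _ in MODELS]
    new = [w * p for w, p in zip(weights, probs)]
    z = sum(new)
    return [w / z for w in new]

# Prior: 2^{-complexity}, normalized over this finite class.
weights = [2.0 ** -c for _, _, c in MODELS]
z = sum(weights)
weights = [w / z for w in weights]

history = []
for bit in [0, 1, 0, 1, 0, 1, 0, 1]:     # an alternating observation stream
    weights = bayes_update(history, bit, weights)
    history.append(bit)

best = MODELS[weights.index(max(weights))][0]
print(best)                               # prints "alternate"
```

Despite its higher complexity penalty, the alternating model's far better fit to the data dominates the posterior after a few bits, and the mixture then predicts the next bit (a 0) with high confidence; that simplicity-versus-fit tradeoff is the quantified Ockham's razor the notes describe.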
=====================

-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/24379807-f5817f28
Modify Your Subscription: https://www.listbox.com/member/?&
Powered by Listbox: http://www.listbox.com
-------------------------------------------
