Re: [agi] Complexity of environment of agi agent
> By the way: I'm going to use BPTT (backpropagation through time) to train
> the recurrent neural network. According to Schmidhuber at IDSIA there is a
> better recurrent neural network system, Long Short-Term Memory (LSTM). This
> network + training technique is claimed to be able to look further back
> (>1000 steps instead of just 10 for BPTT). Is anyone familiar with this?
> Can it be put in the same layering architecture that I am aiming at, i.e.
> does it have an abstraction state?

From what I've read about prediction with recurrent neural networks (and similar systems), it looks to me like only systems that predict detail information are investigated. The main AI strategy there is to make systems that predict detail states as far as possible: 100 steps in the future, 1000 steps... But I don't think that will ever go so far as to make long-term prediction and planning possible. In a realistic environment a system will never be able to predict all the detail states it will be in over the range of a year or so. A system will at most be able to predict in which class the detail information will fall. And the farther into the future the prediction, the larger such a class will tend to be, i.e. the prediction will be in more general terms.

There are symbolic reasoning systems that try to plan globally in general terms. But with symbolic computation the problem is that the symbols lack semantic content. There are no, or only bad, connections to perception and action and to different layers of abstraction. These systems just reason in their own private syntactic Language of Thought, in an autistic way. Or they can only act successfully in simple 'toy' worlds.

A reason I can think of for not trying to predict the internal state of a recurrent neural network is that such a prediction is about an internal construction, not something external, and furthermore this prediction has no further effects (for prediction). It stays internal.
The prediction of the internal state of the recurrent network is seen as a goal on its own. It is not related to predicting the detail level better. This is counterintuitive: if it doesn't help at the detail level, then it seems useless. However, the internal state is determined by the external detail level (the perceived sequence). And therefore, indirectly, a prediction of the internal state contains information about the future of the external environment.

When there are multiple recursive layers of recurrent neural networks, each predicting the internal state of the layer below, the built-up knowledge, concept formation and abstraction can be the input for action systems (also neural networks) at each layer. The actions at the higher levels are abstract plans or goals (formulated in the representation of the layer below) spanning a long range of time (the typical time range of the predictions of that layer). The action system passes such a plan or goal to the layer below, where it is the input for the action system there. The goals/plans of the lower levels become more concrete and short term. At the bottom the action is really concrete: acting in the environment through effectors ('muscles'). (All action systems try to maximise an external reinforcement signal through their actions.) So this is how the high-level abstract representation states have their effect on the detail level: through action systems, the results of which are passed down, level by level, to the detail level.

Arnoud

---
To unsubscribe, change your address, or temporarily deactivate your subscription, please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
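The layered scheme described above can be sketched in code. This is my own hypothetical illustration, not code from this thread: contexts flow bottom-up through Abstraction, and goals flow top-down through the action systems. All names and the `Layer`/`act` structure are assumptions for the sake of the sketch; the real components would be trained networks, not the stand-in callables used here.

```python
# Hypothetical sketch of the layered architecture: each layer holds an
# Abstraction net, a Prediction net, and an action system (stand-in
# callables here, not trained networks).

class Layer:
    def __init__(self, abstraction, prediction, action):
        self.abstraction = abstraction  # (context, input) -> new context
        self.prediction = prediction    # (context, input) -> predicted input
        self.action = action            # (context, goal)  -> more concrete goal
        self.context = None

def act(layers, percept, top_goal):
    """One cycle: update every layer's context bottom-up, then refine the
    top-level goal downward until it is a concrete motor act."""
    inp = percept
    for layer in layers:                 # bottom-up pass
        layer.context = layer.abstraction(layer.context, inp)
        inp = layer.context              # the layer above sees this layer's state
    goal = top_goal
    for layer in reversed(layers):       # top-down pass
        goal = layer.action(layer.context, goal)
    return goal                          # at the bottom: a concrete action
```

The point of the sketch is only the data flow: perception climbs the stack as increasingly abstract state, and an abstract goal descends the stack becoming increasingly concrete.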
Re: [agi] Complexity of environment of agi agent
> A more natural way seems to me: predict what state the PRS is in the next
> time a prediction failure (of the base level) happens. The PRS can be seen
> as containing all the information about the future (also future PRSes)
> until a prediction failure. Apparently the environment then switches to
> some new unforeseen behaviour. Until then the environment did not contain
> any new information; it can be compressed to the PRS (and the Abstraction
> and Prediction neural networks, which are fixed during run time).

A correction: the Prediction neural network computes the predicted next input on the basis of the most recent input as well (Pr: C, I -> Ipred). Therefore, even if the prediction succeeds, it can be that the PRS does not contain all the information needed to construct the future input. The sequence until a prediction failure is compressible to the PRS plus the present input. Because of this the PRS can be viewed as, and operate as, an abstract classification without detail information (which is stored in the present input I), although the PRS can also hold the detail information in very simple environments. So it's not an unimportant point.

Second: the PRS that is predicted is the PRS (= context C) after the prediction failure.

By the way: I'm going to use BPTT (backpropagation through time) to train the recurrent neural network. According to Schmidhuber at IDSIA there is a better recurrent neural network system, Long Short-Term Memory (LSTM). This network + training technique is claimed to be able to look further back (>1000 steps instead of just 10 for BPTT). Is anyone familiar with this? Can it be put in the same layering architecture that I am aiming at, i.e. does it have an abstraction state?

Bye,
Arnoud
Re: [agi] Complexity of environment of agi agent
> For example you could have some sequence where the computation time of the
> n-th bit takes n^1000 computation cycles. There is only one pattern and
> it's highly compressible as it has a pretty short algorithm, however there
> is no way you'll ever learn what the pattern is.

To say a bit more / correct myself / repeat what you said / repeat myself: every long sequence where the number of computation cycles needed for computing the nth element is not bounded by a constant C is not practically compressible. The amount of information that the nth element depends on is then also limited by some constant. If computation time grows like 2^n, n^1000, n or log(n), computation will become unfeasible in the long run. For log(n) that might take a very long time, but it still eventually becomes impossible. This seems to me to be a more useful notion of compressibility than Kolmogorov's: feasible compressibility.

If the nth element just depends on the near past (say the last 5 elements or so) and the number of computations is bounded by some not very high constant, finding a good prediction function is easy (with neural networks at least). But that is not the case in general. There may have been cardinal events in the past that need to be remembered in order to predict well. In that case computation can also be feasible. Of course, if the determining events are too far in the past it's impossible to find the causal connection. The trick is to identify those cardinal events, and then remember them, completely or just in part. For this I use a recurrent neural network system (Abstraction/Prediction). It is trained on prediction success. There is always a fixed number of calculations based on a fixed amount of information.

Abstraction (input, context) -> new context
Prediction (input, context) -> predicted new input

Alas, a known problem with recurrent networks is that with backpropagation of errors the contributions of the far past to the error gradient go to zero too fast.
(As I remember it; I may have stated this problem wrongly.) A module in my system on its own can therefore only predict well on the basis of the near past. That near past and the near future must be feasibly compressible; whether the rest is does not matter for this method.

A method to help solve (live with) this problem: the state of the recurrent neural network (context) can be seen as an abstract description of the state the (perceived) sequence is in. The recent past is compressed into it (it can be said to simulate the Markov property); all relevant information is kept because of the training on prediction, and irrelevant information is filtered out. It can be seen as a classification of the near-past sequence (and the expected near future), which I mentioned in earlier mails. Sutton calls such state descriptions that help to predict "predictive representations of state", PRS.

These states can themselves be predicted, by a higher-level module. To get a grip on a farther past than the base level can, this module must run 'slower', i.e. not try to predict every PRS every time. It must predict the state of the PRS every 2, 10 or so cycles of the base level. Otherwise it will have the same time horizon as the base level. A more natural way seems to me: predict what state the PRS is in the next time a prediction failure (of the base level) happens. The PRS can be seen as containing all the information about the future (also future PRSes) until a prediction failure. Apparently the environment then switches to some new unforeseen behaviour. Until then the environment did not contain any new information; it can be compressed to the PRS (and the Abstraction and Prediction neural networks, which are fixed during run time). The higher-level module predicts the new PRS of the base level. It thus predicts in which abstract class the sequence will fall when it 'switches' to a new behaviour. If it is successful it can be said to have a farther predictive horizon than the base module.
The drawback is that the predictions are stated less precisely, in terms of a class of sequences, the PRS (if the environment is so complex that information had to be filtered out).

Will it work? Only if the environment (and therefore the sequence) is simple enough. The perceived sequence must be feasibly compressible in the abstract representations of the PRS (of the base level), in the near past and the near future. The horizons are farther away than for the base module. After that, a third module can be built on the second to predict the PRS of the second module, and so on. If the environment allows it, the future will be predicted further and further, but also vaguer and vaguer, or maybe better said, in more general terms (in complex but still feasible environments). The input of the top level will be the cause of many different sequences. This method seems to me to make long-term planning, in general terms, possible in feasibly compressible environments.
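The failure-driven scheme above can be sketched as a toy program. This is my reading of the proposal, not the author's code: a base module predicts the next symbol, and the higher module is consulted only at a base-level prediction failure, where its outstanding prediction of the post-failure PRS is checked and a fresh one is made. The function names and the pair-collecting interface are assumptions.

```python
# Toy sketch (hypothetical; the real modules are trained recurrent nets):
# base_predict(C, I) plays Pr: C, I -> Ipred, base_update(C, I) plays
# Abstraction: C, I -> new C, and high_predict(C) is the higher module's
# guess at the PRS that will hold after the next prediction failure.

def hierarchical_run(base_predict, base_update, high_predict, seq):
    """Return (predicted PRS, actual PRS) pairs, one per base-level
    failure after the first."""
    C = None
    prev = None
    pending = None                  # higher module's outstanding prediction
    pairs = []
    for x in seq:
        failed = prev is not None and base_predict(C, prev) != x
        new_C = base_update(C, x)
        if failed:                  # environment switched behaviour
            if pending is not None:
                pairs.append((pending, new_C))
            pending = high_predict(C)   # predict PRS at the *next* failure
        C = new_C
        prev = x
    return pairs
```

In the toy test below the PRS is just the last symbol and the base module predicts repetition, so failures occur exactly at symbol changes; the higher module's job reduces to guessing which symbol comes after each change.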
Re: [agi] Complexity of environment of agi agent
Hi Shane, how's the barn holding?

> Perhaps my pattern wasn't clear enough:
>
> 1
> 2
> 3
> 4
> .
> .
> .
> 00099
> 00100
> 00101
> .
> .
> .
> 0
> 1
> .
> .
> .
> 8
> 9
>
> then repeat from the start again. However each character is
> part of the sequence. So the agent sees 10002300...

OK, now I see what you mean.

> So the "whole pattern" in some sense is 100,000 "numbers"
> each of 5 characters giving a 500,000 character pattern
> of digits from 0 to 9. A human can learn this reasonably
> easily but your AI won't. It would take something more
> like a megabyte to store the pattern. Actually with
> the overhead of all the rules it would be much bigger.

My agent has these patterns for breakfast! I certainly hope so, at least. Well, there is a very simple rule here, namely just add 1 arithmetically to the last 5 inputs, and then you successfully predict the next five inputs. Can my system represent that rule? I think it can. If I simplify my system so that it does not act, just perceives and predicts, there are 2 neural networks in one module (and I only need one):

Abstraction(C, I) -> new C
Prediction(C, I) -> predicted I

with C being the context vector (of bits), and I the input vector (of bits). Abstraction must make sure that C has all the relevant information stored in it. C may contain the last 10 inputs. Or the five of the last block, and a counter for where it is in the new block. Prediction must then perform the operation of adding 1 and giving as output the value at the counter place plus 1. E.g. 00012 was the last block, the counter is 4. This is stored in C. Prediction calculates 00013 out of C and takes the (4+1)th character, being '3'.

Abstraction({00012,3},{1}) -> {00012,4}
Prediction({00012,4},{1}) -> {3}

That is about how it will work, I suppose. If you see an error (or you have other patterns to think about) please say so. Of course Prediction and Abstraction will only work this way after a lot of training (prediction error minimisation).
(A nice deus ex machina I can always fall back on ;-) Maybe I've misled you with my recent mails by talking about storing patterns. In a way it does that, but not explicitly. It stores a mechanism for how to extend a perceived sequence, i.e. predict the next step.

Hoi,
Arnoud
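The add-1 rule discussed in this message can be hand-coded as a sanity check. This is only an illustration of what the trained Abstraction/Prediction nets would have to compute, not the nets themselves; the context encoding `(last complete block, characters seen of the current block)` is my assumption, with the counter represented implicitly as `len(seen)`.

```python
# Hand-coded version of the counting rule. C = (last complete 5-digit
# block, characters of the current block seen so far).

def abstraction(C, I):
    prev_block, seen = C
    seen = seen + I
    if len(seen) == 5:            # current block complete: it becomes
        return (seen, "")         # the new reference block
    return (prev_block, seen)

def prediction(C, I):
    # I is unused here because C already carries everything needed.
    prev_block, seen = C
    nxt = str((int(prev_block) + 1) % 100000).zfill(5)
    return nxt[len(seen)]         # the 'counter' is len(seen)
```

With last block 00012 and counter 4, `prediction` computes 00013 and returns its fifth character, '3', matching the worked example in the mail.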
Re: [agi] Complexity of environment of agi agent
Ciao Arnoud,

Perhaps my pattern wasn't clear enough:

1
2
3
4
.
.
.
00099
00100
00101
.
.
.
0
1
.
.
.
8
9

then repeat from the start again. However each character is part of the sequence. So the agent sees 10002300...

So the "whole pattern" in some sense is 100,000 "numbers" each of 5 characters, giving a 500,000 character pattern of digits from 0 to 9. A human can learn this reasonably easily but your AI won't. It would take something more like a megabyte to store the pattern. Actually with the overhead of all the rules it would be much bigger.

Shane
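The 500,000-character pattern Shane describes has a very short program, which is exactly his point about compressibility versus rote storage. I'm assuming (my reading of the post) zero-padded decimal numbers 00000..99999 concatenated; the exact starting value isn't fully clear from the email.

```python
# One cycle of the (assumed) sequence: 100,000 numbers, each padded to
# 5 digits, concatenated into a 500,000-character string.

def shanes_sequence():
    return "".join(str(n).zfill(5) for n in range(100000))
```

An agent that stores the raw cycle needs 500,000 characters; the generator above is a few dozen bytes of rule.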
Re: [agi] Complexity of environment of agi agent
Hoi Shane,

> > Do I have to see it like something where the value of the nth bit is a
> > (complex) function of all the former bits? Then it makes sense to me.
> > After some length l of the pattern computation becomes unfeasible.
> > But this is not the way I intend my system to handle patterns. It learns
> > the pattern after a lot of repeated occurrences of it (in perception). And
> > then it just stores the whole pattern ;-) No compression there. But since
> > the environment is made out of smaller patterns, the pattern can be
> > formulated in those smaller patterns, and thus save memory space.
>
> This is ok, but it does limit the sorts of things that your system
> is able to do. I actually suspect that humans do a lot of very
> simple pattern matching like you suggest and in some sense "fake"
> being able to work out complex looking patterns. It's just that
> we have seen so many patterns in the past and that we are very
> good at doing fast and sometimes slightly abstract pattern matching
> on a huge database of experience. Nevertheless you need to be a
> little careful because some very "simple" patterns that don't
> repeat in a very explicit way could totally confuse your system:
>
> 123.9123

My system sees that '9' as noise. As you know I use neural networks to filter that out. The '9' would make it uncertain as its predictions fail; it would begin to forget in what pattern it was and start hypothesising, based on past experience, which pattern it is in now. But when the pattern continues in the same way again after '9', the state it was beginning to forget starts to be reactivated again. Of course neural networks are black boxes; you don't really know how to translate their operations to high-level cognitive terms. What my system stores in the neural networks are not explicit patterns, but just causal rules: given some time sequence, e.g. 010010010110010, it can calculate the next bit (or actually, a vector of bits), being its prediction.
So this talk of storing 1, 2, 3 or n patterns is just convenient and should be taken with a grain of salt; it's unnatural and too rigid.

> Your system, if I understand correctly, would not see the pattern
> until it had seen the whole cycle several times.

Then it would begin to see it as a pattern that should be remembered, not as just some noise. What counts is repetition of the pattern. It is then likely that the pattern will occur again in the future, and that is what my agent wants to get a grip on.

> Something like
> 5*100,000*2 = 1,000,000 characters

You mean here something like 5 repetitions of 100,000 characters, and that then repeated twice?

> into the sequence and even then
> it would need to remember 100,000 characters of information. A
> human would see the pattern after just a few characters with
> perhaps some uncertainty as to what will happen after the 9.
> The total storage required for the pattern with a human would be
> far less than the 100,000 characters your system would need too.

I didn't quite understand what you meant. But as I said, my system stores rules (in a fuzzy, neural network way) on how to continue a sequence. It does not have to store sequences explicitly. And if there are regularities in the sequence, the rules that can build up that sequence take less space than the sequence itself. (I hope this is something of an answer.)

> > Yes, but in general you don't know the complexity of the simplest
> > solution of the problem in advance. It's more likely that you get to know
> > first what the complexity of the environment is.
>
> In general an agent doesn't know the complexity of its
> environment either.

OK OK, but I think it's more likely. In the beginning, it is exposed to its environment and doesn't know how to solve the problem; it doesn't know how to discriminate the relevant phenomena yet. If it has to estimate the complexity of the solution of the problem then, it would be based on all the phenomena, not just the relevant ones.
> > The strategy I'm proposing is: ignore everything that is too complex.
> > Just forget about it and hope you can, otherwise it's just bad luck. Of
> > course you want to do the very best to solve the problem, and that
> > entails that some complex phenomenon that can be handled must not be
> > ignored a priori; it must only be ignored if there is evidence that
> > understanding that phenomenon does not help solving your problem.
> > In order for this strategy to work you need to know what the maximum
> > complexity is an agent can handle, as a function of the resources of the
> > agent: Cmax(R). And it would be very helpful for making design decisions
> > to know Cmax(R) in advance. You can then build in that everything above
> > Cmax(R) should be ignored; 'vette pech' ("tough luck") as we say in Dutch
> > if you then are not able to solve the problem.
>
> Why not just do this dynamically? Try to look at how much of
> the ag
Re: [agi] Complexity of environment of agi agent
Arnoud,

> > I'm not sure if this makes much sense. An "ideal" agent is not going
> > to be a "realistic" agent. The bigger your computer and the better
> > your software the more complexity your agent will be able to deal with.
>
> With an ideal realistic agent I meant the best software we can make on
> the best hardware we can make.

In which case I think the question is pretty much impossible to answer. Who knows what the best hardware we can make is? Who knows what the best software we can make is?

> Do I have to see it like something where the value of the nth bit is a
> (complex) function of all the former bits? Then it makes sense to me.
> After some length l of the pattern computation becomes unfeasible.
> But this is not the way I intend my system to handle patterns. It learns
> the pattern after a lot of repeated occurrences of it (in perception). And
> then it just stores the whole pattern ;-) No compression there. But since
> the environment is made out of smaller patterns, the pattern can be
> formulated in those smaller patterns, and thus save memory space.

This is ok, but it does limit the sorts of things that your system is able to do. I actually suspect that humans do a lot of very simple pattern matching like you suggest and in some sense "fake" being able to work out complex looking patterns. It's just that we have seen so many patterns in the past and that we are very good at doing fast and sometimes slightly abstract pattern matching on a huge database of experience. Nevertheless you need to be a little careful because some very "simple" patterns that don't repeat in a very explicit way could totally confuse your system:

123.9123

Your system, if I understand correctly, would not see the pattern until it had seen the whole cycle several times. Something like 5*100,000*2 = 1,000,000 characters into the sequence, and even then it would need to remember 100,000 characters of information.
A human would see the pattern after just a few characters, with perhaps some uncertainty as to what will happen after the 9. The total storage required for the pattern with a human would be far less than the 100,000 characters your system would need too.

> Yes, but in general you don't know the complexity of the simplest
> solution of the problem in advance. It's more likely that you get to know
> first what the complexity of the environment is.

In general an agent doesn't know the complexity of its environment either.

> The strategy I'm proposing is: ignore everything that is too complex.
> Just forget about it and hope you can, otherwise it's just bad luck. Of
> course you want to do the very best to solve the problem, and that
> entails that some complex phenomenon that can be handled must not be
> ignored a priori; it must only be ignored if there is evidence that
> understanding that phenomenon does not help solving your problem.
> In order for this strategy to work you need to know what the maximum
> complexity is an agent can handle, as a function of the resources of the
> agent: Cmax(R). And it would be very helpful for making design decisions
> to know Cmax(R) in advance. You can then build in that everything above
> Cmax(R) should be ignored; 'vette pech' ("tough luck") as we say in Dutch
> if you then are not able to solve the problem.

Why not just do this dynamically? Try to look at how much of the agent's resources are being used for something and how much benefit the agent is getting from this. If something else comes along that seems to have a better ratio of benefit to resource usage, then throw away some of the older stuff to free up resources for this new thing.

Shane
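Shane's dynamic suggestion can be sketched as a benefit-per-cost eviction rule. This is my own illustration of the idea; the function name, the dict-based store, and the ratio test are all assumptions, not anything from the post.

```python
# Hypothetical sketch: keep learned items in a bounded store and evict the
# item with the worst benefit-to-resource-cost ratio when something more
# valuable comes along.

def maybe_admit(store, capacity, item, benefit, cost):
    """store: dict item -> (benefit, cost). Admit item if there is room,
    or if its benefit/cost ratio beats the current worst item's ratio."""
    ratio = benefit / cost
    if len(store) < capacity:
        store[item] = (benefit, cost)
        return True
    worst = min(store, key=lambda k: store[k][0] / store[k][1])
    if ratio > store[worst][0] / store[worst][1]:
        del store[worst]                 # free resources for the new thing
        store[item] = (benefit, cost)
        return True
    return False
```

The point is that no Cmax(R) threshold needs to be fixed in advance: what gets ignored is simply whatever currently pays off least per unit of resource.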
Re: [agi] Complexity of environment of agi agent
Hi Shane,

On Friday 19 September 2003 02:58, Shane Legg wrote:
> arnoud wrote:
> > How large can those constants be? How complex may the environment be
> > maximally for an ideal, but still realistic, agi agent (thus not a
> > Solomonoff or AIXI agent) to still be successful? Does somebody know how
> > to calculate (and formalise) this?
>
> I'm not sure if this makes much sense. An "ideal" agent is not going
> to be a "realistic" agent. The bigger your computer and the better
> your software the more complexity your agent will be able to deal with.

With an ideal realistic agent I meant the best software we can make on the best hardware we can make.

> The only way I could see that it would make sense would be if you
> could come up with an algorithm and prove that it made the best
> possible usage of time and space in terms of achieving its goals.
> Then the constants you are talking about would be set by this
> algorithm and the size of the biggest computer you could get.

Yes, but then the agent is made already. I think some estimates of the constants would help me to make design decisions. But if the constants can only be determined afterwards, they are of no use to me.

> > Not even an educated guess?
> >
> > But I think some things can be said:
> > Suppose perception of the environment is just a bit at a time:
> > ...010100010010010111010101010...
> >
> > In the random case: for any sequence of length l the number of possible
> > patterns is 2^l. Completely hopeless, unless prediction precision need
> > decreases also exponentially with l. But that is not realistic. You then
> > know nothing, but you want nothing also.
>
> Yes, this defines the limiting case for Solomonoff Induction...
>
> > In the logarithmic case: the number of possible patterns of length l
> > increases logarithmically with l: #p < constant * log(l). If the constant
> > is not too high this environment can be learned easily. There is no need
> > for vagueness.
>
> Not true.
> Just because the sequence is very compressible in a
> Kolmogorov sense doesn't imply that it's easy to learn. For example
> you could have some sequence where the computation time of the n-th
> bit takes n^1000 computation cycles. There is only one pattern and
> it's highly compressible as it has a pretty short algorithm, however
> there is no way you'll ever learn what the pattern is.

Do I have to see it like something where the value of the nth bit is a (complex) function of all the former bits? Then it makes sense to me. After some length l of the pattern, computation becomes unfeasible. But this is not the way I intend my system to handle patterns. It learns the pattern after a lot of repeated occurrences of it (in perception). And then it just stores the whole pattern ;-) No compression there. But since the environment is made out of smaller patterns, the pattern can be formulated in those smaller patterns, and thus save memory space.

In the logarithmic case: say there are 2 patterns of length 100, then there are 3 patterns of length 1000. Let's say the 2 patterns of l = 100 are primitive and are stored bit by bit. The 3 patterns however can be stored using 10 bits each. The 4 patterns of length 10^4 can be stored using 16 bits each, etc. It isn't really different in the linear case, except that the number of patterns that can be found in the environment grows linearly with l, and there's need for abstraction (i.e. storing as classes of sequences, lossy data compression).

> > I suppose the point I'm trying to make is that complexity of the
> > environment is not all. It is also important to know how much of the
> > complexity can be ignored.
>
> Yes. The real measure of how difficult an environment is is not
> the complexity of the environment, but rather the complexity of
> the simplest solution to the problem that you need to solve in
> that environment.
Yes, but in general you don't know the complexity of the simplest solution of the problem in advance. It's more likely that you get to know first what the complexity of the environment is. The strategy I'm proposing is: ignore everything that is too complex. Just forget about it and hope you can, otherwise it's just bad luck. Of course you want to do the very best to solve the problem, and that entails that some complex phenomenon that can be handled must not be ignored a priori; it must only be ignored if there is evidence that understanding that phenomenon does not help solving your problem. In order for this strategy to work you need to know what the maximum complexity is an agent can handle, as a function of the resources of the agent: Cmax(R). And it would be very helpful for making design decisions to know Cmax(R) in advance. You can then build in that everything above Cmax(R) should be ignored; 'vette pech' ("tough luck") as we say in Dutch if you then are not able to solve the problem.

> Shane
>
> P.S. one of these days I'm going to get around to replying to your
>
Re: [agi] Complexity of environment of agi agent
arnoud wrote:
> How large can those constants be? How complex may the environment be
> maximally for an ideal, but still realistic, agi agent (thus not a
> Solomonoff or AIXI agent) to still be successful? Does somebody know how
> to calculate (and formalise) this?

I'm not sure if this makes much sense. An "ideal" agent is not going to be a "realistic" agent. The bigger your computer and the better your software the more complexity your agent will be able to deal with.

The only way I could see that it would make sense would be if you could come up with an algorithm and prove that it made the best possible usage of time and space in terms of achieving its goals. Then the constants you are talking about would be set by this algorithm and the size of the biggest computer you could get.

> Not even an educated guess?
>
> But I think some things can be said:
> Suppose perception of the environment is just a bit at a time:
> ...010100010010010111010101010...
>
> In the random case: for any sequence of length l the number of possible
> patterns is 2^l. Completely hopeless, unless prediction precision need
> decreases also exponentially with l. But that is not realistic. You then
> know nothing, but you want nothing also.

Yes, this defines the limiting case for Solomonoff Induction...

> In the logarithmic case: the number of possible patterns of length l
> increases logarithmically with l: #p < constant * log(l). If the constant
> is not too high this environment can be learned easily. There is no need
> for vagueness.

Not true. Just because the sequence is very compressible in a Kolmogorov sense doesn't imply that it's easy to learn. For example you could have some sequence where the computation time of the n-th bit takes n^1000 computation cycles. There is only one pattern and it's highly compressible as it has a pretty short algorithm, however there is no way you'll ever learn what the pattern is.

> I suppose the point I'm trying to make is that complexity of the
> environment is not all.
> It is also important to know how much of the complexity can be ignored.

Yes. The real measure of how difficult an environment is is not the complexity of the environment, but rather the complexity of the simplest solution to the problem that you need to solve in that environment.

Shane

P.S. one of these days I'm going to get around to replying to your other emails to me!! sorry about the delay!
Re: [agi] Complexity of environment of agi agent
On Thursday 18 September 2003 13:29, Ben Goertzel wrote:
> There are two different questions here, and I'm not sure which one you mean.
>
> Given a set of computational resources R, we can ask either
>
> 1)
> what is a complexity level C so that, for any environment E of complexity
> level C, there is *some* AI system running on R that can predict events in
> E reasonably well (i.e. assuming that R is specialized for E)?

Yes, that is what I mean. To make it more complete, that 'reasonably well' is also a variable, or two: prediction success ratio (0 - 100%), and precision/detailedness of prediction divided by prediction time. The minimum prediction success is also a given ('Given a set of computational resources R and required minimum prediction success M...'). The precision of the prediction divided by the timescale is a measure of the amount of complexity C the AI system can ignore. This depends not only on the environment, but also on the goals the AI system must achieve.

> or
>
> 2)
> what is a complexity level C so that, for some AI system X running on R,
> and for *any* environment E of complexity C, X can predict events in E
> reasonably well (i.e. assuming that R is not specialized for E)?

That is not quite what I meant. I'm busy with tabula rasa systems that learn about the environment from scratch. That makes 1) and 2) more or less equivalent for me.

> The human brain for instance is highly specialized to certain environments,
> and cannot predict equally intelligently in other environments of
> comparable "a priori complexity" to the environments for which it's
> specialized.
>
> Note that my computational resources R include both space and time
> resources. Unless the time resources are unrealistically ample, this rules
> out answering 2) by using an AI system X that does an AIXItl-style search
> through all programs that could run on resources (R - epsilon) and finding
> the optimal one for the given environment E.
> Of course, contemporary mathematics does not give us answers to either of
> these questions. Nobody knows

Not even an educated guess? But I think some things can be said. Suppose
perception of the environment is just a bit at a time:
...010100010010010111010101010...

In the random case: for any sequence of length l the number of possible
patterns is 2^l. Completely hopeless, unless the required prediction
precision also decreases exponentially with l. But that is not realistic:
you then know nothing, but you also want nothing.

In the logarithmic case: the number of possible patterns of length l
increases logarithmically with l: #p < constant * log(l). If the constant
is not too high, this environment can be learned easily. There is no need
for vagueness in prediction (because of limited resources), or for
prediction in only general terms (maybe just a very slow, logarithmic
decrease of precision). But this is not realistic; real environments are
more complex. In the very long (infinite) term I suppose the environment
has to be this way for the agent to be able to predict, or there is even
no increase in the number of patterns after some critical l (just
repetition). Or the environment consists of (super)patterns that can also
be found on lower time levels (the environment has a recursive structure).

In the linear case: the number of possible patterns of length l increases
linearly with l: #p < constant * l. Given R, this environment cannot be
predicted in the long term (after some l) if prediction has to stay
precise. The required prediction precision also has to decrease linearly.
This is, I think, the most realistic scenario; I think this is about the
world we live in. Near-future predictions need to be very precise for an
agent to act well, and that precision can decrease roughly linearly as the
time horizon recedes. (The same can be said for action and planning.) I
don't need to know where exactly my foot will be in a year, or at exactly
what time of the day.
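The three cases above can be made concrete with a small experiment (my own
sketch, not part of the original mail): count the distinct windows of
length l that actually occur in a bit string. A random string illustrates
the exponential case; a repeating string illustrates the "no increase
after some critical l" case.

```python
import random

def distinct_patterns(bits, l):
    """Count the distinct length-l windows occurring in a bit string."""
    return len({bits[i:i + l] for i in range(len(bits) - l + 1)})

random.seed(0)
random_bits = ''.join(random.choice('01') for _ in range(4096))
periodic_bits = '0110' * 1024  # pure repetition, period 4

for l in (2, 4, 8, 16):
    print(l, distinct_patterns(random_bits, l),
          distinct_patterns(periodic_bits, l))
```

For the random string the count grows like min(2^l, number of windows),
i.e. exponentially until it saturates; for the periodic string it never
exceeds the period, so after a critical l nothing new appears and the
environment is fully learnable.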
I just need to know in very general terms what my situation will be like:
in a year I will have a better job; I predict that in a week my neighbours
will pay me a visit; I predict that in a second my hands will hit the key
'.'. This is not just something about me (or people in general); it is
also a property of the environment. Our world (and the goals we must
achieve to survive in it) is such that we don't need very precise and
detailed descriptions of what will happen in the far future. In this world
you can manage without that, though we cannot do without it entirely.

I suppose the point I'm trying to make is that the complexity of the
environment is not all; it is also important to know how much of that
complexity can be ignored. The hardest problem for AI is successful long
term planning. It is needed, but nobody knows how to build an AI system
that does it. I think that a key element in the solution is to assume that
the precision of prediction can drop linearly with time. The search
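One toy formalisation of that linear drop (my own sketch; the numbers are
invented for illustration): let the number of outcome classes the agent
must still distinguish fall off linearly with the prediction horizon, from
full detail now down to a single vague class at the planning horizon.

```python
def required_classes(total_classes, horizon, max_horizon):
    """Precision drops linearly: distinguish every class at horizon 0,
    a single coarse class ('some situation or other') at max_horizon."""
    frac = min(horizon / max_horizon, 1.0)
    return max(1, round(total_classes * (1.0 - frac)))

# e.g. 1000 distinguishable detail states, planning up to a year ahead
for days in (0, 1, 7, 180, 365):
    print(days, required_classes(1000, days, 365))
```

At horizon 0 the agent must pin down the exact detail state (where my
hands hit the keyboard this second); at a year it only needs the one broad
class "I have a better job".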
Re: [agi] Complexity of environment of agi agent
Hi Arnoud,

Two points:

1. Most real environments are open, non-linear systems, which means they
generally have some characteristic time horizon beyond which prediction is
no better than a guess based on historical statistics, no matter how much
computation is used. For the weather, for example, predictions beyond two
or three weeks will never be much better than just using climatological
averages.

2. How well an agent has to do is often relative to other competing
agents; when there is no competition, all an agent can expect is "as good
as possible". For example, in predicting the stock market an agent only
has to be better than competing agents to make money. And in predicting
the weather, there is a real limit on how well an agent can do.

Cheers,
Bill
--
Bill Hibbard, SSEC, 1225 W. Dayton St., Madison, WI 53706
[EMAIL PROTECTED]  608-263-4427  fax: 608-263-6738
http://www.ssec.wisc.edu/~billh/vis.html

On Thu, 18 Sep 2003, arnoud wrote:
> Dear AGI,
>
> What is the maximal complexity of an environment in which prediction of
> future events in the environment is still computationally feasible?
>
> In most 'realistic' environments prediction of the near future needs to
> be very precise and prediction of the far future can be vaguer, i.e. a
> large class of event types satisfies the prediction, in order for an agi
> agent to achieve its goals. (If prediction on the very long term also
> needs to be detailed and precise, the environment is impossible for any
> agent, no?)
>
> In a formula:
>
> (maximal vagueness of prediction that is allowed by the agi agent, in
> order to plan and act successfully to achieve goals) / (time scale of
> horizon of prediction) = some constant.
>
> Vagueness of prediction is the number of perception event types that
> satisfy the prediction (count the perception event types at bit level).
> A somewhat equivalent formula:
>
> (number of patterns of about time length l that occur in the
> environment) / (about l) = some constant
>
> A pattern here is a class of noisy variations on a pattern (10% noise?
> 20% noise? The smaller the constant gets, the higher the noise ratio can
> be.)
>
> How large can those constants be? How complex may the environment
> maximally be for an ideal, but still realistic, agi agent (thus not a
> Solomonoff or AIXI agent) to still be successful? Does somebody know how
> to calculate (and formalise) this?
>
> Bye,
> Arnoud

---
To unsubscribe, change your address, or temporarily deactivate your
subscription, please go to http://v2.listbox.com/member/[EMAIL PROTECTED]
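Bill's first point, the characteristic horizon of a non-linear system, can
be illustrated with the logistic map, a standard chaotic toy system (my
own sketch, not from the thread): shrinking the initial measurement error
by a factor of a million buys only a fixed number of extra prediction
steps, because the error grows roughly exponentially.

```python
def logistic(x):
    """Fully chaotic logistic map x -> 4x(1-x)."""
    return 4.0 * x * (1.0 - x)

def steps_until_decorrelated(x0, eps, threshold=0.5, max_steps=200):
    """Steps until two trajectories started eps apart differ by threshold,
    i.e. until the forecast is no better than a historical guess."""
    x, y = x0, x0 + eps
    for step in range(1, max_steps + 1):
        x, y = logistic(x), logistic(y)
        if abs(x - y) > threshold:
            return step
    return None

print(steps_until_decorrelated(0.4, 1e-6))   # modest horizon
print(steps_until_decorrelated(0.4, 1e-12))  # a million times more precise
```

Since the separation roughly doubles per step on average, a 10^6-fold
better initial measurement extends the horizon by only about 20 steps;
beyond the horizon, no amount of computation helps.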
RE: [agi] Complexity of environment of agi agent
> How complex may the environment maximally be for an ideal, but still
> realistic, agi agent (thus not a Solomonoff or AIXI agent) to still be
> successful? Does somebody know how to calculate (and formalise) this?
>
> Bye,
> Arnoud

There are two different questions here, and I'm not sure which one you
mean.

Given a set of computational resources R, we can ask either

1) what is a complexity level C so that, for any environment E of
complexity level C, there is *some* AI system running on R that can
predict events in E reasonably well (i.e. assuming that R is specialized
for E)?

or

2) what is a complexity level C so that, for some AI system X running on
R, and for *any* environment E of complexity C, X can predict events in E
reasonably well (i.e. assuming that R is not specialized for E)?

The human brain for instance is highly specialized to certain
environments, and cannot predict equally intelligently in other
environments of comparable "a priori complexity" to the environments for
which it's specialized.

Note that my computational resources R include both space and time
resources. Unless the time resources are unrealistically ample, this
rules out answering 2) by using an AI system X that does an AIXItl-style
search through all programs that could run on resources (R - epsilon) and
finding the optimal one for the given environment E.

Of course, contemporary mathematics does not give us answers to either of
these questions. Nobody knows.

-- Ben G
[agi] Complexity of environment of agi agent
Dear AGI,

What is the maximal complexity of an environment in which prediction of
future events in the environment is still computationally feasible?

In most 'realistic' environments prediction of the near future needs to be
very precise and prediction of the far future can be vaguer, i.e. a large
class of event types satisfies the prediction, in order for an agi agent
to achieve its goals. (If prediction on the very long term also needs to
be detailed and precise, the environment is impossible for any agent, no?)

In a formula:

(maximal vagueness of prediction that is allowed by the agi agent, in
order to plan and act successfully to achieve goals) / (time scale of
horizon of prediction) = some constant.

Vagueness of prediction is the number of perception event types that
satisfy the prediction (count the perception event types at bit level).

A somewhat equivalent formula:

(number of patterns of about time length l that occur in the environment)
/ (about l) = some constant

A pattern here is a class of noisy variations on a pattern (10% noise?
20% noise? The smaller the constant gets, the higher the noise ratio can
be.)

How large can those constants be? How complex may the environment
maximally be for an ideal, but still realistic, agi agent (thus not a
Solomonoff or AIXI agent) to still be successful? Does somebody know how
to calculate (and formalise) this?

Bye,
Arnoud
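One concrete reading of "vagueness counted at bit level" (my own sketch;
the '?' notation for an unpredicted bit is an invented convention, not
Arnoud's): a prediction that fixes only some bits of an l-bit perception
event is satisfied by 2^(number of free bits) event types.

```python
def vagueness(prediction):
    """Number of concrete bit-level perception events satisfying a
    partial prediction; '?' marks a bit the agent does not predict."""
    return 2 ** prediction.count('?')

print(vagueness('01101010'))  # fully precise near-term prediction: 1
print(vagueness('0110????'))  # vaguer: 16 event types satisfy it
print(vagueness('????????'))  # says nothing: all 256 events satisfy it
```

In this reading, the proposed constant is this count divided by the time
scale of the prediction horizon: predictions further out may leave
exponentially many more bits unconstrained.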