Steve, dp/dt methods do not fundamentally change the space of possible models (if your initial mathematical claim of equivalence is true). What I am saying is that that model space is *far* too small. Perhaps you know some grammar theory? Markov models are not even as expressive as regular grammars. Hidden Markov models are. But there is a long way to go from there, since that is just the first level of the hierarchy.
> By "Markov" you are referring to successive computation processes, e.g.
> layers of neurons, each feeding the next?

For sequential data, an Nth-order Markov model is a model that predicts the next item in the sequence from the last N items. These can be built by making an N-dimensional table, and running through the data to count which item appears after each occurrence of each N-item subsequence. Equivalently, an Nth-order Markov model might store the probability (/frequency) of each possible sequence of length N+1; in that case we've got to do some extra calculations to get predictions out of the model, but mathematically speaking, we've got the same information in our hands.

Markov models can be extended to spatial data by counting the probabilities of (all possible) squares of some fixed size. (Circles would work fine too.)

>> Markov models are highly prone to overmatching the
>> dataset when they become high-order.
>
> Only because the principal components haven't been accurately sorted out by
> dp/dt methods?

The reason that overmatching becomes a problem is that the size of the table grows exponentially with N. There is simply not enough data to fill the table properly.

Let's see... where normal methods would give a variable the values 1 or 0, derivatives would allow 1, 0, and -1 (positive change, no change, negative change). So for discrete data, dp/dt will actually make the tables bigger. This could improve discrimination for low-order models (similar to the effect of increasing the order), but it will make overmatching worse for higher-order models (again, similar to the effect of increasing the order). Of course, there is an added bonus if the data's regularity really is represented better by the derivatives.

Come to think of it, it shouldn't be surprising that working in derivative space is like increasing the order... each unit of derivative-data represents (the difference between) two units of normal data.
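A rough sketch of the counting-table construction described above, plus the {1, 0, -1} derivative encoding and why it enlarges the table. The function names are my own, purely for illustration:

```python
# Nth-order Markov model as a counting table: for each N-item context,
# count which item follows it, then predict the most frequent successor.
from collections import Counter, defaultdict

def train_markov(sequence, n):
    """Count successors of every n-item context in the sequence."""
    table = defaultdict(Counter)
    for i in range(len(sequence) - n):
        context = tuple(sequence[i:i + n])
        table[context][sequence[i + n]] += 1
    return table

def predict(table, context):
    """Most frequent item seen after this context (None if unseen)."""
    counts = table.get(tuple(context))
    return counts.most_common(1)[0][0] if counts else None

def derivative_encode(sequence):
    """dp/dt-style encoding: sign of the change between adjacent items.
    A binary {0, 1} sequence becomes a ternary {-1, 0, +1} one, so the
    number of possible contexts grows from 2**n to 3**n -- the point
    about derivatives making the tables bigger."""
    return [(b > a) - (b < a) for a, b in zip(sequence, sequence[1:])]

data = [0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0]
model = train_markov(data, 2)
print(predict(model, [1, 1]))           # every (1, 1) was followed by 0
print(derivative_encode([0, 1, 1, 0]))  # [1, 0, -1]
```

Note that the table only stores contexts that actually occur; the exponential blow-up shows up as the fraction of possible contexts that the data never fills in.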
--Abram

On Wed, Jan 7, 2009 at 1:40 PM, Steve Richfield <[email protected]> wrote:
> Abram,
>
> On 1/6/09, Abram Demski <[email protected]> wrote:
>>
>> Well, I *still* think you are wasting your time with "flat"
>> (propositional) learning.
>
> I'm not at all sure that I understand what you are saying here, so some
> elaboration is probably in order.
>
>> I'm not saying there isn't still progress to
>> be made in this area, but I just don't see it as an area where
>> progress is critical.
>
> My guess is that the poor performance of non-dp/dt methods is depressing, so
> everyone wants to look elsewhere. Damn that yellow stuff, I'm looking for
> SILVER. My hope/expectation is that this field can be supercharged with
> dp/dt methods.
>
>> The main thing that we can do with propositional
>> models when we're dealing with relational data is construct
>> Markov models.
>
> By "Markov" you are referring to successive computation processes, e.g.
> layers of neurons, each feeding the next?
>
>> Markov models are highly prone to overmatching the
>> dataset when they become high-order.
>
> Only because the principal components haven't been accurately sorted out by
> dp/dt methods?
>
>> So far as I am aware,
>> improvements to propositional models mainly improve performance for
>> large numbers of variables, since there isn't much to gain with only a
>> few variables.
>
> Again, hoping that enough redundancy can deal with the overlapping effects
> of things that occur together, a problem generally eliminated by dp/dt
> methods.
>
>> (FYI, I don't have much evidence to back up that
>> claim.)
>
> When I finally get this all wrung out, I'll move on to using Eddie's NN
> platform, which ties into web cams and other complex software or input. Then
> we should have lots of real-world testing. BTW, with really fast learning,
> MUCH larger models can be simulated on the same computers.
>> So, I don't think progress on the propositional front directly
>> translates to progress on the relational front, except in cases where
>> we have astronomical amounts of data to prevent overmatching.
>
> In a sense, dp/dt provides another dimension to sort things out. I am
> hoping/expecting that LESS dp/dt data is needed this way than with other
> competing methods.
>
>> Moreover, we need something more than just Markov models!
>
> The BIG question is: Can we characterize what is needed?
>
>> The transition to hidden Markov model is not too difficult if we take
>> the approach of hierarchical temporal memory; but this is still very
>> simplistic.
>
> Most, though certainly not all, elegant solutions are simple. Is dp/dt (and
> corollary methods) "it" or not? THAT is the question.
>
>> Any thoughts about dealing with this?
>
> Here, I am hung up on "this". Rather than respond in excruciating detail
> with a presumption of "this", I'll make the following simplistic statement
> to get this process started.
>
> Simple learning methods have not worked well for the reasons you mentioned
> above. The question here is whether dp/dt methods blow past those
> limitations in general, and whether epineuronal methods blow past them best
> in particular.
>
> Are we on the same page here?
>
> Steve Richfield
>
>> On Mon, Jan 5, 2009 at 12:42 PM, Steve Richfield
>> <[email protected]> wrote:
>> > Thanks everyone for helping me "wring out" the whole dp/dt thing. Now
>> > for the next part of "Steve's Theory..."
>> >
>> > If we look at learning as extracting information from a noisy channel,
>> > in which the S/N ratio is usually <<1, but where the S/N ratio is
>> > sometimes very high, the WRONG thing to do is to engage in some sort of
>> > slow averaging process as present slow-learning processes do. This is
>> > especially so when dp/dt-based methods can occasionally completely
>> > separate (in time) the "signal" from the "noise".
>> >
>> > Instead, it would appear that the best/fastest/cleanest (from an
>> > information theory viewpoint) way to extract the "signal" would be to
>> > wait for a nearly-perfect low-noise opportunity and simply "latch on"
>> > to the "principal component" therein.
>> >
>> > Of course there will still be some noise present, regardless of how
>> > good the opportunity, so some sort of successive refinement process
>> > using future "opportunities" could further trim NN synapses, edit AGI
>> > terms, etc. In short, I see that TWO entirely different learning
>> > mechanisms are needed: one to initially latch onto an approximate
>> > principal component, and a second to refine that component.
>> >
>> > Processes like this have their obvious hazards, like initially failing
>> > to incorporate a critical synapse/term, and in the process dooming
>> > their functionality regardless of refinement. Neurons, principal
>> > components, equations, etc., that turn out to be worthless, or which
>> > are "refined" into nothingness, would simply trigger another
>> > epineuronal reprogramming to yet another principal component, when a
>> > lack of lateral inhibition or other AGI-equivalent process detects
>> > that something is happening that nothing else recognizes.
>> >
>> > In short, I am proposing abandoning the sorts of slow learning
>> > processes typical of machine learning, except for use in gradual
>> > refinement of opportunistic, instantly-recognized principal components.
>> >
>> > Any thoughts?
>> >
>> > Steve Richfield

--
Abram Demski
Public address: [email protected]
Public archive: http://groups.google.com/group/abram-demski
Private address: [email protected]

-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=123753653-47f84b
Powered by Listbox: http://www.listbox.com
