Steve,

Well, I *still* think you are wasting your time with "flat" (propositional) learning. I'm not saying there isn't still progress to be made in this area, but I just don't see it as an area where progress is critical. The main thing we can do with propositional models when we're dealing with relational data is construct Markov models, and Markov models are highly prone to overfitting the dataset once they become high-order. So far as I am aware, improvements to propositional models mainly improve performance for large numbers of variables, since there isn't much to gain with only a few variables. (FYI, I don't have much evidence to back up that claim.) So I don't think progress on the propositional front translates directly into progress on the relational front, except in cases where we have astronomical amounts of data to prevent overfitting.
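To make the overfitting worry concrete: an order-n Markov model over k symbols has k^n contexts, each with its own next-symbol distribution, so the parameter count explodes while the fraction of contexts actually observed in a finite dataset collapses. A minimal sketch (my own illustration, not anything from the thread; the alphabet size and data size are arbitrary):

```python
# Why high-order Markov models overfit small datasets: an order-n model
# over an alphabet of size k has k**n contexts, each needing a next-symbol
# distribution with (k - 1) free parameters. With modest data, most
# high-order contexts are never even seen once.

import random

def num_parameters(k, n):
    """Free parameters of an order-n Markov model over k symbols."""
    return k ** n * (k - 1)

def observed_contexts(seq, n):
    """Distinct length-n contexts actually present in the data."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n)}

random.seed(0)
k = 10                                        # alphabet size (arbitrary)
data = [random.randrange(k) for _ in range(1000)]  # 1000 samples (arbitrary)

for n in (1, 2, 3, 4):
    seen = len(observed_contexts(data, n))
    total = k ** n
    print(f"order {n}: {num_parameters(k, n):>6} params, "
          f"{seen}/{total} contexts observed")
```

By order 3 or 4 the model already has more contexts than data points, so most transition distributions are estimated from zero or one sample, which is the overfitting problem in miniature.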
Moreover, we need something more than just Markov models! The transition to hidden Markov models is not too difficult if we take the approach of hierarchical temporal memory; but this is still very simplistic. Any thoughts about dealing with this?

--Abram Demski

On Mon, Jan 5, 2009 at 12:42 PM, Steve Richfield <[email protected]> wrote:
> Thanks everyone for helping me "wring out" the whole dp/dt thing. Now for
> the next part of "Steve's Theory..."
>
> If we look at learning as extracting information from a noisy channel, in
> which the S/N ratio is usually <<1, but where the S/N ratio is sometimes
> very high, the WRONG thing to do is to engage in some sort of slow
> averaging process as present slow-learning processes do. This is
> especially true when dp/dt-based methods can occasionally completely
> separate (in time) the "signal" from the "noise".
>
> Instead, it would appear that the best/fastest/cleanest (from an
> information theory viewpoint) way to extract the "signal" would be to wait
> for a nearly-perfect low-noise opportunity and simply "latch on" to the
> "principal component" therein.
>
> Of course there will still be some noise present, regardless of how good
> the opportunity, so some sort of successive refinement process using
> future "opportunities" could further trim NN synapses, edit AGI terms,
> etc. In short, I see that TWO entirely different learning mechanisms are
> needed, one to initially latch onto an approximate principal component,
> and a second to refine that component.
>
> Processes like this have their obvious hazards, like initially failing to
> incorporate a critical synapse/term, and in the process dooming their
> functionality regardless of refinement.
> Neurons, principal components, equations, etc., that turn out to be
> worthless, or which are "refined" into nothingness, would simply trigger
> another epineuronal reprogramming to yet another principal component,
> when a lack of lateral inhibition or other AGI-equivalent process detects
> that something is happening that nothing else recognizes.
>
> In short, I am proposing abandoning the sorts of slow learning processes
> typical of machine learning, except for use in gradual refinement of
> opportunistic instantly-recognized principal components.
>
> Any thoughts?
>
> Steve Richfield

--
Abram Demski
Public address: [email protected]
Public archive: http://groups.google.com/group/abram-demski
Private address: [email protected]
