> It's a little funny when a paper on defining simplicity is a highly complex read... :)
I was holding off saying it to let others say it first.
Below I have summarized (!) his paper after reading most of it. I group things into 3 throughout, not that 3 is special; I am just saying so to make it easier to read.
As I read it, his paper says Occam's Razor starts from the ugly set of "any possible thing you could compute on a computer: any physics, object, or event". Then 1) we use the razor to rank the possibilities so that shorter ones are more likely to be the answer. Then you say, hey, 2) we can get rid of some short possibilities too: physics that we don't have! Because computers can compute different physics, we must tell them to ignore those candidates. Then 3) among all the things you can find in our universe there are patterns, so we can again get rid of even more short possibilities! We can skip storing/using "cats seek food" and just store the representation "animals desire survival". I could keep going, narrowing down the choices: 4) 5) 6)... For example, recent or related strings are more likely to be said again. Etc.
But now this is just how intelligence skips brute force. So it is not one razor; as I was just saying, there are many razors that make up a brain, and each helps chop off unlikely answers. I know 10 mechanisms for AGI that do this, and they all use FREQUENCIES of features in data. If no atomic structure, law, or word/sentence of text ever re-occurred in physics, the universe would be random and unpredictable. So the original razor in Occam's Razor is 'simple is better', and maybe it also covers things that occur in our physics, and representations. And, well, more than 3 razors, as said...
So, is Occam's Razor unique, and why does it work? That was our original question. No, it's just AGI/physics; let me explain. We model the data that we see, so we get the representations and the answers that match correct physics (we do actually see new physics in video games, but we mostly ignore that).
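
To make razors 1) and 2) concrete, here is a minimal Python sketch, not the paper's method. It assumes candidates are toy bit strings, weights each by 2^-length (a common way to formalize "shorter is more likely"), and then drops candidates that fail a made-up "is this our physics?" test. The names `length_prior`, `apply_razor`, `in_our_physics`, and the candidate strings are all hypothetical.

```python
from fractions import Fraction


def length_prior(candidates):
    """Razor 1: weight each candidate by 2**-len, then normalize to sum to 1."""
    raw = {c: Fraction(1, 2 ** len(c)) for c in candidates}
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}


def apply_razor(weights, keep):
    """Any further razor: drop candidates failing `keep`, renormalize the rest."""
    kept = {c: w for c, w in weights.items() if keep(c)}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("razor removed every candidate")
    return {c: w / total for c, w in kept.items()}


def in_our_physics(candidate):
    """Hypothetical stand-in for razor 2; the rule itself is arbitrary."""
    return candidate.endswith("1")


# Toy candidate descriptions (assumed for illustration only).
CANDIDATES = ["0", "01", "011", "0110"]

prior = length_prior(CANDIDATES)
posterior = apply_razor(prior, in_our_physics)

if __name__ == "__main__":
    for c in CANDIDATES:
        print(c, prior[c], posterior.get(c, 0))
```

Each extra razor is just another `apply_razor` call with a different predicate, so the candidate set shrinks step by step instead of being searched by brute force.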
Then, as for the 3rd thing left now, "simple & shorter & faster is better": well, more-frequent features are seen, used, and work with each other more than both larger ones and smaller ones, so the middle zone, not too complex but not too simple either, is key to attention. The whole universe causes this. Short execution time and low code complexity are just the sweet spot. It's all data based: based on frequencies and combinational heterarchy and hierarchy, as Ben said.
So, to say it again, the original razor "simple but not too simple" is really "short/small and fast, but not too short/small and fast", and what it picks out is the middle zone. If you take a hierarchy/heterarchy and look at text (or 3D atomic) structures, the relational context and the frequency/cooperation happen most in the lower layers, but not the lowest. For example, you'll rarely if ever see an identical 400-word sentence twice, so it can't gather enough context or be "used" with others, or even exist! Meanwhile, if we look at low layers, like how many times "a" or "the" occurs in text, they occur tons and have lots of context, but there are so few laws at this level: only 26 letters, 10 digits, etc.
The issue is that larger structures DO occur and have relationships that we need to account for in our analysis, but at some higher layer the relationships cease to exist. So, because we need to watch those higher layers, we see, oh crap, there are so many features in layers 2 to 30, even though it all ceases by layer ~30. So most of our life problems and solutions are in that middle zone, where a wide but finite range of same-layer structures exists at higher complexities. Most problems are not so simple that all you need is to move atom A to location B. Driving a car or building hard drives, for example, is much harder! It requires many small structures working together to "feel" simple.
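
The layer argument above can be checked on text with a small Python sketch. It treats "layer n" as word n-grams of length n, which is my assumption and not something the paper defines, and for each n it reports how many distinct n-grams exist and how many of those occur more than once. The toy sentence and the name `ngram_stats` are made up for illustration.

```python
from collections import Counter


def ngram_stats(tokens, n):
    """Return (distinct n-grams, n-grams seen more than once) for word n-grams."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    distinct = len(grams)
    repeated = sum(1 for count in grams.values() if count > 1)
    return distinct, repeated


# Toy corpus (assumed); swap in real text to see where repeats die out.
TOKENS = "the cat sat on the mat the cat ate on the mat".split()

if __name__ == "__main__":
    for n in range(1, 5):
        distinct, repeated = ngram_stats(TOKENS, n)
        print(f"n={n}: distinct={distinct} repeated={repeated}")
```

On this tiny sample the distinct count grows with n while the repeated count falls to zero by n=4, which is the "long sentences never recur" half of the argument. Whether a real corpus shows a sweet spot in the middle layers depends on the data, so treat this as a way to measure it, not as proof.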
So... this "simple but not too simple" razor is drawing our attention to middle-layer features/complexity. The reason? Physics has just a few laws (an alphabet), they interact, larger laws evolve out of them, and most of those sit in the middle layers, in a range that is finite. Whew.
BTW, I was thinking recently about throwing more at AGI: more compute, more accelerated compute (neural computers), more data, more razors. Of course they're all razors. But it's interesting that it is the data behind all of the razors that razes off search space. Accelerated chips make nets run faster; that is not data driven in itself, but you get more data processed, so I would still call it "more data" or "data driven".
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T37756381803ac879-Me466161aafe25933f6b7601d
Delivery options: https://agi.topicbox.com/groups/agi/subscription
