I believe the deeper question is why Occam's Razor works so well across every branch of science. What makes Solomonoff induction the core principle of machine learning? Why is it so successful at prediction in every domain? Why are theories that can be described with fewer words or symbols more likely to be correct? We only know empirically that it works.
I believe the theoretical reason is that every probability distribution over an infinite set of strings that assigns p > 0 to each string must favor shorter strings over longer ones. Because the probabilities must sum to 1, for any theory or model encoded as a string there can be only a finite number of strings that are more likely, but there must be an infinite number that are less likely, and those less likely strings are, on the whole, the longer ones.

On Thu, Nov 21, 2019, 7:40 PM James Bowery <[email protected]> wrote:

> I will agree however that if there is a principled way to include both
> time and space as constraints in a model selection criterion, it makes
> pragmatic sense at the very least, because what one is trying to do is
> predict, which by definition is in time.
>
> On Thursday, November 21, 2019, James Bowery <[email protected]> wrote:
>
>> If I can spawn a finite but unlimited number of parallel processes in
>> "space", I can compute AIXItl, for example. So let's say the generating
>> space is projected down into 3D space + time -- it is approximated by
>> time, correct? In other words, once you admit "space" as a computation
>> dimension, don't you beg the question?
>>
>> On Thu, Nov 21, 2019 at 6:06 PM TimTyler <[email protected]> wrote:
>>
>>> On 2019-11-21 11:46 AM, James Bowery wrote:
>>> > The point of my conjecture is that there is a very good reason to
>>> > select "the smallest executable archive of the data" as your
>>> > information criterion over the other information criteria -- and it
>>> > has to do with the weakness of "lossy compression" as model selection.
>>>
>>> That, along with a number of other entries in the list, is a
>>> "space-only" criterion. It seems reasonable that runtime duration, as
>>> well as program complexity, is a factor for most real-world data. As
>>> well as being generated by a small system, observed data was probably
>>> generated in a limited time. Space-time metrics are clearly needed. I
>>> think we can reject any alleged superiority of any space-only metric.
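The counting argument above can be checked numerically. The distribution below is my own illustrative choice, not anything from the thread: assign every binary string s the probability p(s) = 4^-|s| / 2, which sums to 1 over the infinite set and gives every string positive probability. For any fixed "theory" string, the number of strings more likely than it stops growing once the enumeration passes its length (a finite set, all shorter), while the number of less likely strings keeps growing without bound.

```python
from itertools import product

# Illustrative distribution (an assumption for this sketch): every binary
# string s gets p(s) = 4**-len(s) / 2. There are 2**n strings of length n,
# so the total mass is sum_n 2**n * 4**-n / 2 = 1, and every string has p > 0.
def p(s):
    return 4.0 ** -len(s) / 2.0

def strings_up_to(max_len):
    """Enumerate all binary strings of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

target = "0110"  # any fixed string encoding a "theory"

# Strings strictly more likely than the target: under this distribution these
# are exactly the strings shorter than it, so the count stabilizes at
# 1 + 2 + 4 + 8 = 15 no matter how far we enumerate.
more_likely = [s for s in strings_up_to(12) if p(s) > p(target)]

# Strings strictly less likely: the count keeps growing as the enumeration
# bound grows, i.e. there are infinitely many of them.
less_likely_8 = sum(1 for s in strings_up_to(8) if p(s) < p(target))
less_likely_12 = sum(1 for s in strings_up_to(12) if p(s) < p(target))

print(len(more_likely))                # finite: 15
print(max(len(s) for s in more_likely) < len(target))  # True: all shorter
print(less_likely_8 < less_likely_12)  # True: count grows with the bound
```

Any normalizable distribution with p > 0 everywhere forces the same pattern; this particular p(s) just makes the counts easy to verify by hand.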
