From the Hume's Guillotine README
<https://github.com/jabowery/HumesGuillotine>:

The reason you keep all "errors in measurement" -- the reason you avoid
lossy compression -- is to avoid what is known as "confirmation bias" or,
what might be called "Ockham's Chainsaw Massacre". Almost all criticisms of
Ockham's Razor boil down to mischaracterizing it as Ockham's Chainsaw
Massacre. The remaining criticisms of Ockham's Razor boil down to the claim
that those selecting the data never include data that doesn't fit their
preconceptions. That critique may be reasonable, but it is not an argument
against the Algorithmic Information Criterion, which applies only to a
given dataset. Models and data are different things; therefore model
selection criteria are qualitatively different from data selection criteria.

Yes, people can and will argue over what data to include or exclude -- but
the Algorithmic Information Criterion traps the intellectually dishonest by
making their job much harder: they must include exponentially more data
biased towards their particular agenda in order to wash out the data
coherence (and interdisciplinary consilience) in the rest of the
dataset. The ever-increasing diversity of data sources identifies the
sources of bias -- and then starts predicting the behavior of data sources
in terms of their bias, as bias. Trap sprung! This is much the same
argument as that leveled against conspiracy theories: At some point it
becomes simply impractical to hide a lie against the increasing diversity
of observations and perspectives.
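
To make the two-part logic concrete, here is a hedged sketch of model
selection by total description length -- a computable stand-in for the
Algorithmic Information Criterion. Everything in it (the synthetic dataset,
the function names, the 32-bits-per-coefficient convention) is an
illustrative assumption, not code from the HumesGuillotine repo:

```python
import math
import random

# Hypothetical two-part (MDL-style) model selection:
# total cost = bits to describe the model
#            + bits to encode every residual ("error in measurement")
#              losslessly under that model.
# Keeping all residuals is what blocks "Ockham's Chainsaw Massacre":
# a model cannot win by quietly discarding inconvenient data.

random.seed(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.3) for x in xs]  # noisy line

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (no numpy)."""
    n = degree + 1
    # Build the normal-equation system A c = b.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = s / A[r][r]
    return coeffs

def description_length(xs, ys, degree, bits_per_param=32):
    """Model bits plus residual bits (Gaussian code length, in bits)."""
    coeffs = fit_poly(xs, ys, degree)
    residuals = [y - sum(c * x ** i for i, c in enumerate(coeffs))
                 for x, y in zip(xs, ys)]
    n = len(ys)
    sigma2 = max(sum(r * r for r in residuals) / n, 1e-12)
    data_bits = 0.5 * n * math.log2(2 * math.pi * math.e * sigma2)
    model_bits = (degree + 1) * bits_per_param
    return model_bits + data_bits

# The true generating process is degree 1, so that is what should win.
best = min(range(8), key=lambda d: description_length(xs, ys, d))
```

Underfitting pays in residual bits; overfitting pays in model bits; and
because every residual must be encoded losslessly, no model gets credit for
throwing away the data that doesn't fit its preconceptions.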

On Tue, Nov 25, 2025 at 9:39 PM James Bowery <[email protected]> wrote:

>
>
> On Mon, Nov 24, 2025 at 9:05 AM Matt Mahoney <[email protected]>
> wrote:
>
>> ...
>> Which raises the even bigger problem that as you mentioned, motivation,
>> ego, and money drive science. Scientists who should know better still want
>> to prove themselves right...
>>
>
> This holds also for scientists who want to prove that it is hopeless to
> hold them to account with an objective model selection criterion.
>
> Not only is that motivation enormous, it requires almost no motivation at
> all, since those in power can't be held to account by those without power --
> so, even if they are so foolish as to engage the powerless in argument,
> they can make BS arguments and respond to any counter-argument with more BS.
> This is being automated with LLMs on a mass scale now that Turing's BS test
> has been passed.
>
>
>> Suppose you want to answer the question of whether covid-19 vaccines are
>> safe and effective...
>>
>
> That's not what large models are for.  Models either answer an
> enormous range of questions effectively because they have an effective
> world model, or they are narrow, pre-programmed small models that do
> simulations based on human expert specifications -- merely encoding prior
> expert knowledge in simulation algorithms.
>
> The data set is huge.
>
> As I said, there is a huge difference between the data that go into a
> climate model and the data that go into macrosocial psychology models such
> as those upon which you base your argument in the OP.
>
>
>> ...Do you trust the US CDC? Do you trust the Chinese CDC? Do you trust
>> Turkmenistan, the only country to report zero cases throughout the
>> pandemic? Who gets to decide which data to include?
>>
>
> Data and models are in different categories therefore data selection
> criteria and model selection criteria are in different categories.  I
> addressed this in the README at
> https://github.com/jabowery/HumesGuillotine
>
>
>> How do you convince people who believe that the moon landing was fake?
>>
>
> You don't.  What you do is convince decisionmakers to take information
> criteria for model selection seriously enough to apply algorithmic
> information theory.
>
> As to the uncomputability of proving one has found the best possible
> scientific model for a given dataset leading to a potentially bottomless
> pit of resources being poured down the science rat hole:  Precisely!
> That's why funding authorities need criteria that hold those receiving
> science funding objectively accountable, and in such a manner that they
> don't have to worry about leaked evaluation datasets.
>
>> -- Matt Mahoney, [email protected]
>>
>> On Sun, Nov 23, 2025, 10:30 AM James Bowery <[email protected]> wrote:
>>
>>> There are, of course, an infinite number of "arguments" one can come up
>>> with to expand what Nick Szabo calls the "Argument Surface" and that is
>>> where the real "problem for statistics about people" arises -- not in the
>>> choice of language ambiguity.  People who are not motivated to get rid of
>>> motivated reasoning will not be motivated to solve problems like the choice
>>> of language ambiguity -- as just one example of many.  I will grant,
>>> however, that particular redoubt is only for the elect who, like you and
>>> me, have been involved with judging the Hutter Prize.  IIRC, even Shane
>>> Legg sets forth that argument as a reason to avoid the Algorithmic
>>> Information Criterion -- and you can't get much more authoritative than
>>> that unless you
>>> go to Hutter himself or, in the hypothetical case, Solomonoff.  I did
>>> express concern to Marcus at one time, when Solomonoff was still living and
>>> shortly after the Hutter Prize had been announced, that Solomonoff might
>>> "torpedo" the Hutter Prize with that argument (if I recall the exact
>>> wording).  Marcus reassured me that Solomonoff would do no such thing.
>>> IIRC shortly thereafter Solomonoff posted something like that argument to his
>>> blog.  IIRC Marcus objected to using the ALIC for global warming despite
>>> the Biden administration setting the value of addressing that issue at
>>> around $10T/year -- and I can see merit in that objection given the scale
>>> of the data.
>>>
>>> But it all comes down to "incentives" when we are addressing the
>>> "motivated reasoning" problem and that's why I posted my Congressional
>>> testimony about the "incentives" regarding rocket technology -- which you
>>> commented on but did not seem to get the point I was trying to make about
>>> incentives.
>>>
>>> Once we're in the realm of macrosocial psychological dynamical models,
>>> the incentives are so great as to beggar the imagination.  This is far
>>> greater even than Biden's rNPV of $10T/year and the macrosocial psychology
>>> data is many orders of magnitude smaller than climate data.  That said,
>>> there is room for your concern about choice of language in conjunction
>>> with the identification of "noise" -- regarding which, as I've often
>>> pointed out: "one man's noise is another man's cyphertext".
>>>
>>> So we have two "argument surfaces" here:
>>>
>>> How much of the macrosocial dataset is "*noise*" as opposed to
>>> inadequately motivated forensic epistemology "decyphering" that noise?
>>>
>>> How much of the wiggle room for *choice of language* can be squeezed
>>> out by forensic epistemology motivated by an rNPV of $10T/year, ie: well in
>>> excess of $100T, with let's say only 1% of that amount going to ALIC
>>> research: >$1T?
>>>
>>> First of all, recognize that the exploit you regard as decisive
>>> is minuscule compared to the argument surface presently not only tolerated
>>> but exploited by the academy, think tanks and punditry.  At present there
>>> is virtually nothing BUT macrosocial psychological "argument surface", e.g.
>>> arguments such as the one to which you appealed for normative alignment of
>>> young men to be optimistic lest their pessimism be a self-fulfilling
>>> prophecy.
>>>
>>> Secondly, forensic epistemology is precisely about *presuming* criminal
>>> behavior such as that to which you appeal as a reason for despair.  With
>>> >$1T at stake there will be enormous motivation to suss out issues
>>> regarding "language choice" and I can easily demonstrate that none of the
>>> existing authorities have been sufficiently motivated to reduce that aspect
>>> of the argument surface:
>>>
>>> As I've pointed out before, not only is there an entirely different
>>> theoretical basis for addressing that reason (really excuse) to support
>>> avoidance of  scientific accountability by our policy makers (ie: NiNOR
>>> Complexity), but there are obvious, at-hand, techniques to reduce that
>>> argument surface.   For example, a GPU provides an "instruction set", ie
>>> "language", that is radically different from a CPU.  So are we to now throw
>>> up our hands in despair and let those in power get away with "Well gee who
>>> could have KNOWN???" when things don't go "according to projections"?
>>> Really?  Why am I the ONLY person to have addressed the *obvious* fact
>>> that a GPU's "instruction set" is describable as a relatively tiny
>>> procedure in a canonical instruction set and that procedure's algorithmic
>>> length should be used?
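>>>
>>> In standard notation, that is just the invariance theorem of algorithmic
>>> information theory: for any two universal machines U and V,
>>>
>>>     K_U(x) <= K_V(x) + c_UV
>>>
>>> where the constant c_UV is the length of a V-interpreter (e.g. a GPU
>>> "instruction set" emulator) written for U -- independent of x.  Charging
>>> each model for that interpreter's length bounds the "choice of language"
>>> wiggle room by a fixed constant.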
>>>
>>> Could it be that, perhaps, I'm the only sufficiently MOTIVATED person
>>> among those who have been taking information criteria remotely seriously?
>>>
>>>
>>> On Thu, Nov 20, 2025 at 5:27 PM Matt Mahoney <[email protected]>
>>> wrote:
>>>
>>>> On Thu, Nov 20, 2025, 10:11 AM James Bowery <[email protected]> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, Nov 19, 2025 at 11:19 AM Matt Mahoney <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Algorithmic information or compression is great for evaluating
>>>>>> language models but not for everything....
>>>>>>
>>>>>> I could try compressing world population data by fitting it to a
>>>>>> polynomial,
>>>>>>
>>>>>
>>>>> Do you understand the difference between statistics and dynamics?
>>>>>
>>>>
>>>> No, it's the difference between compressing text and compressing video.
>>>> You can't accurately measure the compression of a tiny signal in a sea of
>>>> noise.
>>>>
>>>> This becomes a problem for statistics about people. It only takes a few
>>>> bits of Kolmogorov complexity for social scientists to construct models
>>>> that favor one group over another, and those bits can be hidden in the
>>>> choice of language ambiguity.
>>>>
>>>> I think it would be great if we could answer political questions
>>>> objectively. So how would you solve the problem?
>>>>
>>>>
>>>>> <https://agi.topicbox.com/groups/agi/T504adacb23f3c455-Md49fd5f054dbc9f5d8062388>
>>>>>
>>>> -- Matt Mahoney, [email protected]
>>>>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T504adacb23f3c455-Ma8b3587b16f635f4070446a8
Delivery options: https://agi.topicbox.com/groups/agi/subscription
