We establish causation by controlled experiments. To test whether X causes
Y, you vary X and observe Y while keeping everything else the same. The two
problems with analyzing data sets by compression are that the other
conditions are not held the same, and that conditions affecting Y may be
missing from the data set entirely.
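
To make the second problem concrete, here is a minimal simulation sketch.
Everything in it is an assumption for illustration: a hidden condition Z
that is not in the data set drives both X and Y, X has no effect on Y at
all, and random assignment of X stands in for "keeping everything else the
same" on average. The names pearson and simulate are hypothetical helpers.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, randomized, seed=0):
    """Return corr(X, Y) when X has NO causal effect on Y.

    Z is a hidden condition that never appears in the recorded data.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)              # hidden condition
        if randomized:
            x = rng.gauss(0, 1)          # experimenter sets X independently of Z
        else:
            x = z + rng.gauss(0, 1)      # observational: Z also influences X
        y = z + rng.gauss(0, 1)          # Y depends on Z only, never on X
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


observational = simulate(20000, randomized=False)
experimental = simulate(20000, randomized=True)
print(f"observational corr(X, Y): {observational:.3f}")
print(f"experimental  corr(X, Y): {experimental:.3f}")
```

Under these assumptions the observational data shows a clear correlation
(about 0.5 in expectation) even though X does nothing to Y, and the
correlation goes to roughly zero once X is assigned at random. A compressor
looking only at the recorded X and Y columns would see the first case.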

We do not know why the US has had an epidemic of obesity and diabetes since
the 1980s. First we were told to avoid fats. Then we were told to avoid
carbs. Neither worked. Could it be because fewer people smoke? In China and
Eastern Europe, everyone smokes and nobody is overweight. Doesn't nicotine
suppress appetite? Or maybe it's something else. What does your data set
say?

We do not know why skin cancer rates have been rising since the 1980s,
about the time that sunscreens were introduced. Could sunscreens cause
cancer (by increasing exposure to UVA and total UV by blocking the tanning
effect of UVB)? I don't think that dermatologists would deliberately lie to
us. All the research is public. What does your data set say?

Ray Kurzweil was at one point taking 100 life extension supplements at a
cost of $1 million per year so he could live to see the singularity at 100
and become immortal. But there are exactly zero supplements shown to extend
life. How would you test them? Randomly assign babies to take either an
experimental drug or a placebo every day of their lives and wait 75 years?
It's now illegal to do these tests even on chimpanzees, and other primates
are next.

And why are we still debating adding fluoride to drinking water after 70
years? Why are we still debating vaccine safety? I suppose there is no help
for people who would rather get their data from right-wing conspiracy
videos on YouTube than from algorithmic information theory. But that's an AI
problem too. We train AI to tell us what we want to hear, and it obliges.

So yeah, I agree it can be done, but there are a lot of practical obstacles.

-- Matt Mahoney, [email protected]

On Sun, Dec 21, 2025, 5:54 PM James Bowery <[email protected]> wrote:

>
>
> On Sun, Dec 21, 2025 at 3:59 PM Matt Mahoney <[email protected]>
> wrote:
>
>> On Sun, Dec 21, 2025, 3:05 PM James Bowery <[email protected]> wrote:
>>
>>>
>>> We're almost there, again, Matt.  Ask not what I would do with this
>>> information, ask why we don't have this information in the first place.
>>>
>>
>> Because the information we want is causation, and compression only tells
>> you about correlation.
>>
>
> Every high school physics student knows that even systems as simple as
> 3-body gravitational interaction cannot be described by correlation.  It
> requires going beyond Shannon or Rissanen or any other noise from the statistics
> world.  It requires feedback.  Although some might claim that all it
> requires in a discrete and finite universe is a finite state machine, not a
> UTM, it does at least require that much.
>
> There's a lot of work going on in the area of dynamical systems
> identification from measurement data.
>
> But I hear you about "you can't know what causes what".  This is *always*
> the argument trotted out when people in power start losing their ability to
> impose their theories of causality on others and start being challenged by
> scientists.
>
> Back in the days of the 30 Years War it was all about which theocracy's
> "miracles" were permitted to vitiate causal laws.  Nowadays, it may not be
> so much about "miracles" as simple truth claims about the futility of
> resistance to impersonal forces that are completely impervious to agency.
> People in power and those who identify with them like to trot that one out
> whenever there is an argument about policy interventions.
>
> Like I said, we're there again only on a global scale with powers that
> dwarf those available at the dawn of artillery.  I'd really like to avoid
> having to go through that again.
>
>
>
>>
>> We can easily compress a table of global statistics to find a negative
>> correlation between economic development and fertility. But that doesn't
>> say which causes the other.
>>
>> The problem with using AI is that people upvote answers they agree with,
>> rather than the correct answers. I'm not ready to outsource my brain yet.
>>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T6cf3be509c7cd2f2-M73835b4eb85bc92c0aa4603f
Delivery options: https://agi.topicbox.com/groups/agi/subscription
