Being able to derive differential equations from incomplete and noisy data
is certainly useful for compression. To use a simple example, suppose I
observe a mass bouncing on a spring at position x and derive the equation
x'' = -x, whose solution is a sinusoid. From this, I can predict all the
past and future values of x from just 2 observations. But how does this
tell me that the spring causes the mass to move when it doesn't even tell
me there is a spring?
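
Here is a minimal sketch of what I mean, in Python (the two observation
times and values below are made up). The general solution of x'' = -x is
x(t) = A cos(t) + B sin(t), so two observations at known times pin down A
and B and therefore every other value of x:

import numpy as np

def fit_sinusoid(t1, x1, t2, x2):
    # Solve [cos t, sin t] [A, B]^T = x at the two observed points.
    # (Assumes t2 - t1 is not a multiple of pi, so the system is solvable.)
    M = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    A, B = np.linalg.solve(M, [x1, x2])
    return lambda t: A * np.cos(t) + B * np.sin(t)

x = fit_sinusoid(0.0, 1.0, 0.5, 0.9)  # two hypothetical noise-free samples
print(x(10.0), x(-3.0))               # "predict" a future and a past position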

Or to use your example, we observe a correlation between urbanization and
deforestation. How do I know which causes the other? And does it matter for
compression?
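
My hunch is that, for compression at least, it doesn't matter: by the chain
rule, H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y), so coding urbanization first
and then deforestation given urbanization costs exactly as much as coding
them in the other order. A toy sketch in Python (the joint distribution
below is made up):

import numpy as np

# Hypothetical joint distribution of two correlated binary variables,
# say X = urbanized, Y = deforested.
p = np.array([[0.40, 0.10],
              [0.10, 0.40]])
px = p.sum(axis=1)  # P(X)
py = p.sum(axis=0)  # P(Y)

H = lambda q: -(q[q > 0] * np.log2(q[q > 0])).sum()  # entropy in bits

# Conditional entropies computed directly from the conditional distributions.
HY_given_X = -(p * np.log2(p / px[:, None])).sum()
HX_given_Y = -(p * np.log2(p / py[None, :])).sum()

print(H(px) + HY_given_X)  # code X first, then Y given X
print(H(py) + HX_given_Y)  # code Y first, then X given Y
print(H(p.ravel()))        # joint entropy H(X,Y); all three are equal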

On Wed, Aug 9, 2023, 2:15 PM James Bowery <[email protected]> wrote:

> Aside from the fact that the Ref.zip metadata shows the years associated
> with the column identifiers, so the contestant may include that temporal
> information in the compressed representation if doing so lowers its size,
> consider the case where longitudinal measurements (i.e., time-sequence
> data) are presented without any metadata at all, let alone metadata that
> specifies a temporal dimension to any of the measurements.
>
> If these data are from a dynamical system, application of dynamical
> system identification <https://www.nature.com/articles/s41467-021-26434-1>
> will minimize the size of the compressed representation by specifying the
> boundary condition and system of differential equations.  This is not
> because there is a "time" dimension anywhere, except in the implicit
> dimension across which differentials are identified.
>
> Let's further take the utterly atemporal case where a single-year
> snapshot is taken across a wide range of counties (or other geographic
> statistical areas) on a wide range of measures.  It may still make sense
> to identify a dynamical system whose processes, at work across time,
> result in spatial structures at different stages of progression of that
> system.  Urbanization is one such obvious case.  Deforestation is
> another.  There will be covariates of these measures that may be
> interpreted as caused by them in the sense of a latent temporal dimension.
>
> On Tue, Aug 8, 2023 at 5:23 PM Matt Mahoney <[email protected]>
> wrote:
>
>> ...
>> I see that BMLiNGAM is based on the LiNGAM model of causality, so I
>> found the paper on LiNGAM by Shimizu. It extends Pearl's covariance
>> matrix model of causality to non-Gaussian data. But it assumes (like
>> Pearl) that you still know which variables are dependent and which are
>> independent.
>>
>> But a table of numbers like LaboratoryOfTheCounties doesn't tell you
>> this. We can assume that causality is directional from past to future,
>> so, to use an example from the data, an increase in 1990 population
>> causes an increase in 2000 population. But knowing this doesn't help
>> compression. I can just as easily predict the 1990 population from the
>> 2000 population as the other way around.
>>
>> As a more general example, suppose I have the following data over 3
>> variables:
>>
>> A B C
>> 0 0 0
>> 0 1 0
>> 1 0 1
>> 1 1 1
>>
>> I can see there is a correlation between A and C but not with B. I can
>> compress just as well by eliminating column A or C, since they are
>> identical. This does not tell us whether A causes C, or C causes A, or both
>> are caused by some other variable.
>>
>> What would be an example of determining causality with generic labels?
>>
