Being able to derive differential equations from incomplete and noisy data is certainly useful for compression. To use a simple example, suppose I observe a mass bouncing on a spring at position x and derive the equation x'' = -x, whose solution is a sinusoid. From this, I can predict all the past and future values of x from just 2 observations. But how does this tell me that the spring causes the mass to move when it doesn't even tell me there is a spring?
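To make the prediction claim concrete (a minimal sketch of my own, assuming noiseless data and known observation times): once x'' = -x is adopted as the model, the general solution x(t) = A cos(t) + B sin(t) has only two free constants, so two samples determine the entire trajectory:

import numpy as np

def fit_sho(t_obs, x_obs):
    # Solve for (A, B) in x(t) = A*cos(t) + B*sin(t) from two observations.
    M = np.column_stack([np.cos(t_obs), np.sin(t_obs)])
    return np.linalg.solve(M, x_obs)

def predict(t, A, B):
    return A * np.cos(t) + B * np.sin(t)

# Hypothetical motion: x(t) = 2*sin(t + 0.3), sampled at just two times.
t_obs = np.array([0.0, 1.0])
x_obs = 2 * np.sin(t_obs + 0.3)
A, B = fit_sho(t_obs, x_obs)

t_any = np.linspace(-5.0, 5.0, 11)          # past and future alike
print(np.allclose(predict(t_any, A, B), 2 * np.sin(t_any + 0.3)))  # True

Nothing in that fit refers to a spring at all; it is just a compact code for the trajectory.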
Or to use your example, we observe a correlation between urbanization and deforestation. How do I know which causes the other? And does it matter for compression?

On Wed, Aug 9, 2023, 2:15 PM James Bowery <[email protected]> wrote:

> Aside from the fact that the Ref.zip metadata shows the years associated
> with the column identifiers, and the contestant may therefore include that
> temporal information in the compressed representation if it lowers the
> compressed size, consider the case where longitudinal measurements (i.e.,
> time-sequence data) are presented without any metadata at all, let alone
> metadata that specifies a temporal dimension to any of the measurements.
>
> If these data are from a dynamical system, applying dynamical system
> identification <https://www.nature.com/articles/s41467-021-26434-1> will
> minimize the size of the compressed representation by specifying the
> boundary condition and the system of differential equations. This is not
> because there is a "time" dimension anywhere, except in the implicit
> dimension across which the differentials are identified.
>
> Let's further take the utterly atemporal case where a single-year snapshot
> is taken across a wide range of counties (or other geographic statistical
> areas) on a wide range of measures. It may still make sense to identify a
> dynamical system where processes are at work across time that result in
> spatial structures at different stages of progression of that system.
> Urbanization is one such obvious case. Deforestation is another. There
> will be covariates of these measures that may be interpreted as caused by
> them in the sense of a latent temporal dimension.
>
> On Tue, Aug 8, 2023 at 5:23 PM Matt Mahoney <[email protected]> wrote:
>
>> ...
>> I see that BMLiNGAM is based on the LiNGAM model of causality, so I found
>> the paper on LiNGAM by Shimizu. It extends Pearl's covariance-matrix model
>> of causality to non-Gaussian data. But it assumes (like Pearl) that you
>> still know which variables are dependent and which are independent.
>>
>> But a table of numbers like LaboratoryOfTheCounties doesn't tell you
>> this. We can assume that causality is directional from past to future,
>> so, using an example from the data, increasing 1990 population causes
>> 2000 population to increase as well. But knowing this doesn't help
>> compression. I can just as easily predict 1990 population from 2000
>> population as the other way around.
>>
>> As a more general example, suppose I have the following data over 3
>> variables:
>>
>> A B C
>> 0 0 0
>> 0 1 0
>> 1 0 1
>> 1 1 1
>>
>> I can see there is a correlation between A and C but not B. I can
>> compress just as well by eliminating column A or column C, since they are
>> identical. This does not tell us whether A causes C, or C causes A, or
>> both are caused by some other variable.
>>
>> What would be an example of determining causality with generic labels?
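Coming back to the A/B/C table above, here is a minimal sketch (my own illustration, not anything from the quoted posts) of why the compression gain is symmetric: in this table, coding C given A is exactly as cheap as coding A given C under the same empirical conditional model, so the saving from the correlation cannot orient a causal arrow between them.

from collections import Counter
from math import log2

rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]  # columns A, B, C

def conditional_bits(target, given):
    # Total bits to code `target` using the empirical model P(target | given).
    joint = Counter(zip(given, target))
    marginal = Counter(given)
    return sum(-log2(joint[(g, t)] / marginal[g]) for g, t in zip(given, target))

A = [r[0] for r in rows]
B = [r[1] for r in rows]
C = [r[2] for r in rows]

print(conditional_bits(C, A))  # 0.0 bits: C is free once A is known
print(conditional_bits(A, C))  # 0.0 bits: and vice versa, by symmetry
print(conditional_bits(C, B))  # 4.0 bits: B tells us nothing about C

These are in-sample code lengths under the empirical model, ignoring the cost of the model itself, but they make the point: the saving is the same whichever direction you condition in.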
