On Wed, Aug 9, 2023 at 3:02 PM Matt Mahoney <[email protected]> wrote:
> Being able to derive differential equations from incomplete and noisy data
> is certainly useful for compression. To use a simple example, suppose I
> observe a mass bouncing on a spring at position x and derive the equation
> x'' = -x, whose solution is a sinusoid. From this, I can predict all the
> past and future values of x from just 2 observations. But how does this
> tell me that the spring causes the mass to move when it doesn't even tell
> me there is a spring?

Although I can answer this with three evasions (1: posit that the spring has
non-negligible mass, so its characteristics must be imputed for more accurate
predictions; 2: classical laws of motion are time-symmetric, so "causality"
is meaningless; 3: who cares about the spring so long as we're predicting
what we care about?), dynamical systems (such as reality) are not immune to
entropy increase -- which is a good operational definition of information,
hence the arrow of time, hence causality. Although there are familiar
processes that appear to exhibit time-reversed entropy (such as those
described in my late colleague Tom Etter's so-named paper
<https://web.archive.org/web/20051226123820/http://www.boundaryinstitute.org/articles/ProcRevTime_1960.pdf>),
these are not so common as to confound the very idea of causality.

BTW: Tom and Solomonoff apparently both arrived early at the 1956 Dartmouth
AI summer workshop -- but he never mentioned Solomonoff to me. He did,
however, work with me on the notion of causality being latent in the data
without explicit temporality. Indeed, it was my interest in finding a formal
foundation for programming languages that dealt with time in a principled
manner that got me to hire him at HP for the Internet Chapter 2 project
there. You might find his paper with SLAC physicist Pierre Noyes of
interest: <https://arxiv.org/abs/quant-ph/9808011>

> Or to use your example, we observe a correlation between urbanization and
> deforestation. How do I know which causes the other? And does it matter for
> compression?

The social sciences are, like the rest of the environmental sciences,
riddled with such riddles. The answer lies in the degree to which one set of
measurements provides information about another set of measurements,
compared with vice versa. Consider conditional compressibility. Why isn't
this obvious?
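To make "conditional compressibility" a bit more concrete, here is a rough
sketch of the measurement I have in mind, with zlib standing in for a real
model-based compressor and two made-up byte strings standing in for
serialized county columns (the names "urbanization" and "deforestation" and
all the values are purely illustrative):

# Rough sketch: estimate "conditional compressibility" with an off-the-shelf
# compressor.  C(y | x) is approximated as C(x + y) - C(x): the extra bytes
# needed for y once x is already in hand.
import zlib

def C(b: bytes) -> int:
    """Compressed size in bytes (a crude upper bound on complexity)."""
    return len(zlib.compress(b, 9))

def cond(y: bytes, x: bytes) -> int:
    """Approximate C(y | x) as C(x concatenated with y) minus C(x)."""
    return C(x + y) - C(x)

def gain(x: bytes, y: bytes) -> int:
    """Bytes saved on y by having seen x first: C(y) - C(y | x)."""
    return C(y) - cond(y, x)

# Hypothetical serialized columns; the values are invented for illustration.
urbanization = b"0.12,0.15,0.31,0.44,0.47,0.52,0.58,0.61,0.70,0.83"
deforestation = b"0.10,0.14,0.29,0.41,0.45,0.50,0.55,0.60,0.68,0.80"

print("gain(urbanization -> deforestation):", gain(urbanization, deforestation))
print("gain(deforestation -> urbanization):", gain(deforestation, urbanization))

A real measurement would use the contest's compressor rather than zlib, and
real columns rather than toy strings, but the shape of the comparison -- how
much one set of measurements buys you about the other, versus the reverse --
is the same.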
> On Wed, Aug 9, 2023, 2:15 PM James Bowery <[email protected]> wrote:
>
>> Aside from the fact that the Ref.zip metadata shows the years associated
>> with the column identifiers -- so the contestant may include that temporal
>> information in the compressed representation if doing so lowers its size --
>> consider the case where longitudinal measurements (i.e., time-sequence
>> data) are presented without any metadata at all, let alone metadata that
>> specifies a temporal dimension to any of the measurements.
>>
>> If these data are from a dynamical system, application of dynamical
>> system identification
>> <https://www.nature.com/articles/s41467-021-26434-1> will minimize the
>> size of the compressed representation by specifying the boundary condition
>> and system of differential equations. This is not because there is an
>> explicit "time" dimension anywhere; the only such dimension is the
>> implicit one across which the differentials are identified.
>>
>> Let's further take the utterly atemporal case where a single-year
>> snapshot is taken across a wide range of counties (or other geographic
>> statistical areas) on a wide range of measures. It may still make sense to
>> identify a dynamical system where processes are at work across time that
>> result in spatial structures at different stages of progression of that
>> system. Urbanization is one such obvious case. Deforestation is another.
>> There will be covariates of these measures that may be interpreted as
>> caused by them in the sense of a latent temporal dimension.
>>
>> On Tue, Aug 8, 2023 at 5:23 PM Matt Mahoney <[email protected]>
>> wrote:
>>
>>> ...
>>> I see that BMLiNGAM is based on the LiNGAM model of causality, so I
>>> found the paper on LiNGAM by Shimizu. It extends Pearl's covariance matrix
>>> model of causality to non-Gaussian data. But it assumes (like Pearl) that
>>> you still know which variables are dependent and which are independent.
>>>
>>> But a table of numbers like LaboratoryOfTheCounties doesn't tell you
>>> this. We can assume that causality is directional from past to future, so
>>> using an example from the data, increasing 1990 population causes 2000
>>> population to increase as well. But knowing this doesn't help compression.
>>> I can just as easily predict 1990 population from 2000 population as the
>>> other way around.
>>>
>>> As a more general example, suppose I have the following data over 3
>>> variables:
>>>
>>> A B C
>>> 0 0 0
>>> 0 1 0
>>> 1 0 1
>>> 1 1 1
>>>
>>> I can see there is a correlation between A and C but not B. I can
>>> compress just as well by eliminating column A or C, since they are
>>> identical. This does not tell us whether A causes C, or C causes A, or
>>> both are caused by some other variable.
>>>
>>> What would be an example of determining causality with generic labels?
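P.S. Tying the spring example to the dynamical system identification
reference above: below is a purely illustrative sketch of recovering
something close to x'' = -x from nothing but noisy samples of x. Central
finite differences plus ordinary least squares over a small library of
candidate terms stand in for the sparse-regression machinery of the linked
Nature paper; numpy is assumed and every parameter is arbitrary.

# Illustrative sketch: identify a dynamical system from noisy samples of
# Matt's bouncing mass, with no mention of a spring anywhere.
import numpy as np

rng = np.random.default_rng(0)

# "Observed" data: a sinusoid plus a little measurement noise.  (Twice
# finite-differencing amplifies noise badly, so a real method would smooth
# first or use the Bayesian machinery of the linked paper.)
dt = 0.1
t = np.arange(0.0, 100.0, dt)
x = np.cos(t) + 0.0005 * rng.standard_normal(t.size)

# Estimate x'' at the interior sample points by central differences.
xdd = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / dt**2
xi = x[1:-1]  # x aligned with those interior points

# Candidate term library [1, x, x^2, x^3]; fit xdd = Theta @ coeffs.
Theta = np.column_stack([np.ones_like(xi), xi, xi**2, xi**3])
coeffs, *_ = np.linalg.lstsq(Theta, xdd, rcond=None)

# Crude sparsification: drop terms with tiny coefficients.
coeffs[np.abs(coeffs) < 0.1] = 0.0

terms = ["1", "x", "x^2", "x^3"]
model = " + ".join("%.2f*%s" % (c, n) for c, n in zip(coeffs, terms) if c != 0.0)
print("identified model: x'' =", model or "0")
# Typically prints something very close to:  identified model: x'' = -1.00*x

The identified system plus two boundary values is a far shorter description
of the samples than the samples themselves, whether or not anything in it is
labeled "time" -- which is the sense in which any of this bears on
compression, spring or no spring.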
