I would say a statistical inference is less generative than a theory. A theory in some sense asserts how things really work. Data mining may stumble across the crucial aspects of a mechanism (whether it is physical, sociological, etc.), but it may also just be seeing some quantity derived from other hidden variables. Perhaps there _is_ a reason why tying shoes one way or another is related to some mode of cognitive processing that is more efficient? Or maybe it arises because, in some parts of the country, folks tend to have that habit, and those parts of the country happen to have cleaner water or less air pollution or better schools, or have social constraints in their communities that lead individuals to navigate authoritarianism better than others?
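To make the hidden-variable case concrete, here is a toy simulation, offered only as a sketch. Everything in it is made up for illustration: the `region` variable, the probabilities, and the function names `simulate` and `success_gap` are assumptions, not anything from real Marine data. Success depends only on region, never on the shoe-tying habit, yet the habit "predicts" success in a discovery sample and again in a fresh holdout sample.

```python
import random

def simulate(n, rng):
    """Return (region, habit, success) rows; region is the hidden variable."""
    rows = []
    for _ in range(n):
        region = rng.random() < 0.5          # hypothetical hidden variable
        p_habit = 0.8 if region else 0.2     # assumed: the habit is regional
        p_success = 0.7 if region else 0.3   # assumed: so is success
        habit = rng.random() < p_habit
        success = rng.random() < p_success   # note: never looks at habit
        rows.append((int(region), int(habit), int(success)))
    return rows

def success_gap(rows):
    """Success rate among habit=1 minus success rate among habit=0."""
    with_habit = [s for _, h, s in rows if h == 1]
    without = [s for _, h, s in rows if h == 0]
    return sum(with_habit) / len(with_habit) - sum(without) / len(without)

rng = random.Random(0)
discovery = simulate(20000, rng)   # sample the pattern is mined from
holdout = simulate(20000, rng)     # later sample used to test the "prediction"

print("gap in discovery sample:", round(success_gap(discovery), 3))
print("gap in holdout sample:  ", round(success_gap(holdout), 3))

# If the miner could see the hidden variable, the effect would wash out:
one_region = [r for r in discovery if r[0] == 1]
print("gap within one region:  ", round(success_gap(one_region), 3))
```

With these assumed probabilities the gap should come out near 0.24 in both samples and near zero within a single region. So the mined relationship validates fine as a prediction, while saying nothing true about how shoe tying works.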
I think that data mining could be elaborated (and automated) to begin to create theories. For example, if a regression had an especially simple form that was also predictive, one could describe the variables with some ontology that says why they ought to relate in a deterministic fashion. Instead of just “the weather will be rainy tomorrow”, report “the weather will be rainy tomorrow because there is a low pressure system coming in from the west”, and then reference mathematical models for how weather systems behave, etc.

From: Friam [mailto:[email protected]] On Behalf Of Eric Charles
Sent: Friday, September 09, 2016 9:31 AM
To: The Friday Morning Applied Complexity Coffee Group <[email protected]>
Subject: Re: [FRIAM] speaking of analytics

Marcus,

That's an interesting distinction. Is it the case that by "theory" Nick was referring to something verbal and explicitly metaphorical, or would the results of data mining, which one sought to validate on a different sample, count as a "theory"? So, for example, if my data mining of Marine data found that tying shoes left-to-right predicted success at Officer Candidate School, and I then went to test for that "prediction" in a later sample of incoming officer candidates, to what extent is my prediction based on "a theory"?

Of course, "data mining will be a useful way to uncover patterns" is itself a theory, applicable in some domains but not others (i.e., not all domains of inquiry will contain the sought-after patterns in a long-term stable form).

Eric

-----------
Eric P. Charles, Ph.D.
Supervisory Survey Statistician
U.S. Marine Corps

On Fri, Sep 9, 2016 at 10:51 AM, Marcus Daniels <[email protected]<mailto:[email protected]>> wrote:

“I know that theories are really useful for making predictions, but can one actually make a prediction without one?”

Yes, that’s what data mining is: Take a large corpus of data, find some statistically rare relationships, and then test for their predictive value on another large corpus of data.
In this way one can predict things without really having any kind of theory or even domain knowledge.

Marcus

============================================================
FRIAM Applied Complexity Group listserv
Meets Fridays 9a-11:30 at cafe at St. John's College
to unsubscribe http://redfish.com/mailman/listinfo/friam_redfish.com
