By Alex A. Freitas (Computing Laboratory, University of Kent)
This paper is a critical review of the literature on discovering
comprehensible, interesting knowledge (or patterns) from data. The
motivation for this review is that the majority of the literature
focuses only on the problem of maximizing the accuracy of the
discovered patterns, ignoring other important pattern-quality
criteria that are user-oriented, such as comprehensibility and
interestingness. The word "interesting" has been used with several
different meanings in the data mining literature. In this paper,
"interesting" essentially means novel or surprising. Although
comprehensibility and interestingness are considerably harder to
measure in a formal way than accuracy, they seem very relevant
criteria to be considered if we are serious about discovering
knowledge that is not only accurate, but also useful for human
decision making. The paper discusses both data-driven methods (based
mainly on statistical properties of the patterns) and user-driven
methods (which take into account the user's background knowledge or
beliefs) for discovering interesting knowledge. Data-driven methods
are discussed in more detail because they are more common in the
literature and are more controversial. The paper also suggests future
research directions in the discovery of interesting knowledge.
Link: <http://www.pantaneto.co.uk/issue30/Freitas.htm>
--
Posted by johannes to monochrom
<http://www.monochrom.at/english/2008/05/are-we-really-discovering-interesting.htm>
at 5/21/2008 03:06:00 PM