A paper that will interest you:

(preliminary version)
http://citeseer.nj.nec.com/cache/papers/cs/15339/http:zSzzSzwww.cs.arizona.eduzSzpeoplezSztodszSzacceptedzSz2000zSzParsonsEmancipating.pdf/parsons00emancipating.pdf
(published)
http://portal.acm.org/citation.cfm?id=357778&coll=portal&dl=ACM&CFID=2131136&CFTOKEN=70981949

Abstract: "Database design commonly assumes, explicitly or implicitly, that instances must belong to classes. This can be termed the assumption of inherent classification. We argue that the extent and complexity of problems in schema integration, schema evolution, and interoperability are, to a large extent, consequences of inherent classification. Furthermore, we make the case that the assumption of inherent classification violates philosophical and cognitive guidelines on classification and is, therefore, inappropriate in view of the role of data modeling in representing knowledge about application domains."

Also, a search for 'semantic interoperability' should return some interesting hits.

To tell the difference between two (or three) sequences of bytes is not too difficult; comparing two sequences A and B to determine their longest common subsequence (LCS) or the edit distance between them has been much studied. GNU diff is based on an algorithm published by Eugene W. Myers in 1986.

To tell the difference (distance) between two semantic structures is difficult in a very fundamental way.

Kind regards
Peter Ring

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Glew, Andy
Sent: 13. maj 2002 19:32
To: [EMAIL PROTECTED]; Glew, Andy
Cc: Gary Bisaga
Subject: RE: merge mode for XML

> > Motivation: schema changes in most existing relational databases are
> > onerous.
>
> For very good reason.

And what is that reason?

OK, I admit that some RDBMS applications in production need stability - just like some systems software applications (the kind Greg seems to work on, the kind I used to work on) value stability above all else, and actively want to make it hard to change things.

However, there are other application domains - in programming, the domains attacked by agile methodologies like XP (eXtreme Programming).
{Donning asbestos underwear, expecting Greg to flame.}

An application area that I frequently work in nowadays is experimental databases - databases for experimental data. I want to archive all of my experimental data in a form that allows me to do arbitrary SQL-like queries over it. Problem is, as I continue my research, the format of my records is continually changing.

For example, a few years ago I might have recorded CPU MHz and Cache Size as configuration parameters - now I have to record at least 3 different cache sizes, as well as multiple clock domain frequencies. Not to mention that the observations that I record are constantly changing.

Rather than continually reformatting my database, adding new fields which are "Unknown" or "Null" on old data, I find it easier to add records containing fields that were not known earlier.

<snip />

_______________________________________________
Info-cvs mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/info-cvs
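
To make the "add records containing fields that were not known earlier" idea concrete, here is a minimal sketch in Python. It is not the setup described in the message: the record layout, the field names (cpu_mhz, l2_kb and so on), the values, and the select() helper are all hypothetical, chosen only to show old and new records living side by side without back-filling "Null" columns.

```python
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]

def select(records: List[Record],
           where: Callable[[Record], bool],
           fields: Optional[List[str]] = None) -> List[Record]:
    """SQL-like SELECT over records that need not share a schema.

    `where` is a predicate on one record. If `fields` is given, each
    matching record is projected onto those fields; a field the record
    never had comes back as None instead of being stored as a NULL column.
    """
    result = []
    for rec in records:
        if where(rec):
            if fields is None:
                result.append(dict(rec))
            else:
                result.append({f: rec.get(f) for f in fields})
    return result

# Hypothetical data: an old-style run and a newer run with more parameters.
runs = [
    {"run": 1, "cpu_mhz": 500, "cache_kb": 512, "ipc": 0.9},
    {"run": 2, "core_mhz": 2000, "bus_mhz": 400,
     "l1_kb": 32, "l2_kb": 256, "l3_kb": 2048, "ipc": 1.4},
]

# Only records that actually recorded an L2 size.
with_l2 = select(runs, lambda r: "l2_kb" in r, fields=["run", "l2_kb"])

# A query over a field every record has, projecting one that only old runs have.
fast = select(runs, lambda r: r.get("ipc", 0) > 1.0, fields=["run", "cpu_mhz"])
```

The old record is never rewritten when a new field appears; the cost is that every query has to decide what an absent field means, which is exactly the question a fixed schema answers up front.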

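
On the remark near the top of this message that comparing two byte sequences is well studied: the sketch below computes the length of the longest common subsequence (LCS) and the insert/delete-only edit distance derived from it. It is the textbook quadratic dynamic program, not the Myers 1986 algorithm that GNU diff is based on; the function names are mine.

```python
def lcs_length(a: bytes, b: bytes) -> int:
    """Length of the longest common subsequence of a and b.

    Classic dynamic program, O(len(a) * len(b)) time, keeping only
    the previous row of the table.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def indel_distance(a: bytes, b: bytes) -> int:
    """Edit distance when only insertions and deletions are allowed.

    Everything outside the LCS must be deleted from a or inserted from b.
    """
    return len(a) + len(b) - 2 * lcs_length(a, b)
```

Nothing comparable exists for two semantic structures, because there is no agreed notion of which elements "match" in the first place.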