Randolph Neall wrote:

> Can I assume that what Thomas here advocates ("relational databases can be used very effectively as a low-level store of blobs keyed by path") is how the Ocean persistence layer actually works? Beyond this, Thomas apparently has little use for the capacities of SQL-type RDBMS systems to handle clinical information. Does the Ocean system ultimately amount to blobs keyed by paths (presumably string paths)? If so, what kind of blobs: XML blobs, or some other structured text system?

To clarify: yes, more or less. If you lock the relational schema, or even an object schema (i.e. an object model expressed as classes and/or as an ODB schema, say in ODL), directly to a model of the real-world phenomena your system deals with (e.g. patient visits, path results, GP notes, physical examinations, referrals etc), then there will be permanent problems of maintainability. This has been borne out for as long as I have known anything about computers (let's say 20 years working + 5 years university, when Edition 2 of Sommerville was our idea of 'software engineering').
I have kept wondering why software engineering talks about the problem of maintenance and having to continually throw away and rebuild systems, the problems of drifting away from requirements and so on, as if they were being solved. But they are mostly not being solved. The technical bookshelves are full of books teaching the same old thing (count the number of information system books using airline booking or hotel or conference booking case studies), implying that it works. But in real life it doesn't. We don't seem to have any sustainable information systems - we have to keep fiddling with them. (Note: I am talking about 'information' systems here - there are many other computational systems whose information is more or less static and which mostly do number crunching or visualisation or some other job.)

The root problem in my view is this: if you build an information system in such a way that its business logic and database encode the facts of the domain (as gathered last week, by you and a few colleagues, perhaps following 'use case' analysis or some such idea), the database and logic of the system are tied directly to the reality being modelled. However, since reality keeps changing, along with our idea of it (and hence our modelling of it), the system is never correct; we have to keep modifying it. This might be easy with a small system, but with large distributed systems, billions of records and numerous changing requirements, it doesn't work well at all (although we often delude ourselves that it is ok by restricting our work practices to those that fit the software).

In other words, directly modelling the aspects of reality we are concerned with as first-order concepts in the software and database is a recipe for costly and permanent maintenance. I believe we have to instead model the reality as a second-order concept, with the first-order concepts in the system being stable models of the classes of things found in the reality of the domain.
Hence in openEHR we model only things like Composition ('recording'), Section ('heading'), Party, various kinds of Entry like Observation and so on. We don't model any medical or clinical thing directly. Everything modelled as a first-order concept is domain-invariant - it has the same meaning right across the entire domain of application. Of course we can debate whether we have gotten it right or not, but this is the intention.

We are not the first to do something like this of course - there are hundreds of solutions in a similar vein, various kinds of business rule modelling languages and so on. What we do in openEHR is just one (fairly comprehensive, we think) approach to solving the problem of unmaintainable software systems. It also happens to help solve the problem of getting the requirements from the domain experts - far better than 'use case analysis a la Jacobson' does. It now seems quite clear to me that building any real-world concepts directly into the software infrastructure of an information system is a mistake. We can do far better than that, and we should always aim to do so.

Having said all this, there is the obvious question: well, what do we use the database for? My view on that (and there are far better experts than I) is this: once you clear your head of any idea of trying to model patients, visits, pathology results, prescriptions, medications etc in the database, you have an incredible freedom to use the power of these systems - and most modern databases are extremely powerful. We just use them in the wrong way a lot of the time. Using a relational database for medicine in my view makes sense as long as you build a schema that has no first-order domain concepts in it, and instead encodes information in a generic way. That could be blobs, paths, indexable columns, or some other method. There are many variations available, and I doubt if we have really started to understand them.
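To make the 'generic schema' idea concrete, here is a minimal sketch of a store of blobs keyed by path, using Python and SQLite. It is an illustration only, not a description of the Ocean persistence layer: the table and column names, the path strings, the `ehr_id` key and the choice of JSON as the blob format are all assumptions made for the example.

```python
import json
import sqlite3


def open_store(db_path=":memory:"):
    """Open a store whose schema contains no clinical concepts at all."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS node (
            ehr_id  TEXT NOT NULL,  -- hypothetical record identifier
            path    TEXT NOT NULL,  -- hypothetical path string into the record
            rm_type TEXT NOT NULL,  -- generic class name, e.g. 'OBSERVATION'
            content TEXT NOT NULL,  -- the blob (JSON here, purely an assumption)
            PRIMARY KEY (ehr_id, path)
        )
        """
    )
    # An indexable column: generic, not tied to any clinical concept.
    conn.execute("CREATE INDEX IF NOT EXISTS node_rm_type ON node (rm_type)")
    return conn


def put(conn, ehr_id, path, rm_type, content):
    """Store (or overwrite) the blob found at one path of one record."""
    conn.execute(
        "INSERT OR REPLACE INTO node VALUES (?, ?, ?, ?)",
        (ehr_id, path, rm_type, json.dumps(content)),
    )
    conn.commit()


def get(conn, ehr_id, path):
    """Return the decoded blob at an exact path, or None if absent."""
    row = conn.execute(
        "SELECT content FROM node WHERE ehr_id = ? AND path = ?",
        (ehr_id, path),
    ).fetchone()
    return json.loads(row[0]) if row else None


def find_under(conn, ehr_id, prefix):
    """Return (path, blob) pairs whose path starts with the given prefix."""
    rows = conn.execute(
        "SELECT path, content FROM node "
        "WHERE ehr_id = ? AND substr(path, 1, ?) = ? ORDER BY path",
        (ehr_id, len(prefix), prefix),
    )
    return [(p, json.loads(c)) for p, c in rows]


if __name__ == "__main__":
    store = open_store()
    # Made-up paths; what is clinical lives in the data, not in the schema.
    put(store, "ehr1", "/composition[1]/section[vitals]/observation[bp]",
        "OBSERVATION", {"systolic": 120, "diastolic": 80})
    put(store, "ehr1", "/composition[1]/section[vitals]/observation[pulse]",
        "OBSERVATION", {"rate": 72})
    for path, blob in find_under(store, "ehr1", "/composition[1]/section[vitals]"):
        print(path, blob)
```

Note that adding a new kind of clinical recording (say, a referral) means writing new paths and blobs, not altering the table definition, which is the point of keeping first-order domain concepts out of the schema.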
All these comments also apply to object models - encode the real phenomena your system processes directly into the software classes of your system, and you will have a never-ending game of catch-up on your hands.

- thomas beale