Randolph Neall wrote:
>
> Can I assume that what Thomas here advocates, ("relational databases can be
> used very effectively as a low-level store of blobs keyed by path") is how
> the ocean persistence layer actually works? Beyond this, Thomas apparently
> has little use for the capacities of SQL-type RDBMS systems to handle clinical
> information. Does the Ocean system ultimately amount to blobs keyed by paths
> (presumably string paths)? If so, what kind of blobs, XML blobs, or some other
> structured text system?
>
To clarify: yes, more or less. If you lock the relational schema, or even an
object schema (i.e. an object model expressed as classes and/or as an ODB
schema, say in ODL) directly to a model of the real world phenomena your
system deals with (e.g. patient visits, path results, GP notes, physical
examinations, referrals etc) then there will be permanent problems of
maintainability. This has been borne out for as long as I have known anything
about computers (let's say 20 years working + 5 years university, when Edition
2 of Sommerville was our idea of 'software engineering').

I have kept wondering why software engineering talks about the problem of
maintenance and having to continually throw away and rebuild systems, the
problems of drifting away from requirements and so on, as if they were being
solved. But they are mostly not being solved. The technical book shelves are
full of books teaching the same old thing (count the number of information
system books using airline booking or hotel or conference booking case
studies), implying that it works. But in real life it doesn't. We don't seem
to have any sustainable information systems - we have to keep fiddling with
them. (Note: I am talking about 'information' systems here - there are many
other computational systems whose information is more or less static and which
mostly do number crunching or visualisation or some other job).

The root problem in my view is that if you build an information system in such
a way that its business logic and database encode the facts of the domain (as
gathered last week, by you and a few colleagues, perhaps following 'use case'
analysis or some such idea), the database and logic of the system are
connected to the reality being modelled. However, since reality keeps
changing, along with our idea of it (and hence our modelling of it), the
system is never correct; we have to keep modifying it. This might be easy with
a small system, but with large distributed systems and billions of records,
and numerous changing requirements, it doesn't perform well at all (although we
often delude ourselves that it is ok by restricting our work practices to
those that fit the software).

In other words, directly modelling the aspects of reality we are concerned
with as first-order concepts in the software and database is a recipe for
costly and permanent maintenance. I believe we have to instead model the
reality as a second order concept, with first order concepts in the system
being stable models of the classes of things found in the reality of the
domain. Hence in openEHR we model only things like Composition ('recording'),
Section ('heading'), Party, various kinds of Entry like Observation and so on.
We don't model any medical or clinical thing directly. Everything modelled as
a first order concept is domain-invariant - it has the same meaning right
across the entire domain of application. Of course we can debate whether we
have gotten it right or not, but this is the intention.
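
As a rough illustration of this two-level idea, here is a minimal Python
sketch. It is not the openEHR reference model: the class fields, the CONCEPTS
table and the validate helper are all assumptions made up for this example.
The classes play the role of the stable first-order concepts, while the
clinical content (blood pressure, body weight) exists only as data.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Entry:
    """Generic clinical statement; what it records is named by data, not by a class."""
    concept: str                                   # second-order concept, e.g. "blood_pressure"
    data: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Observation(Entry):
    """One kind of Entry; it still knows nothing about any particular clinical concept."""


@dataclass
class Section:
    """A 'heading' that groups entries."""
    heading: str
    items: List[Entry] = field(default_factory=list)


@dataclass
class Composition:
    """A 'recording' made up of sections."""
    name: str
    sections: List[Section] = field(default_factory=list)


# Hypothetical concept definitions, held as data so they can change
# without touching the classes above.
CONCEPTS: Dict[str, Set[str]] = {
    "blood_pressure": {"systolic", "diastolic"},
}


def validate(entry: Entry) -> bool:
    """True if the entry names a known concept and uses only its allowed fields."""
    allowed = CONCEPTS.get(entry.concept)
    return allowed is not None and set(entry.data) <= allowed


bp = Observation("blood_pressure", {"systolic": 120, "diastolic": 80})
visit = Composition("encounter", [Section("vital signs", [bp])])

# A new requirement arrives: add a definition, not a class or a schema change.
CONCEPTS["body_weight"] = {"weight_kg"}
weight = Observation("body_weight", {"weight_kg": 71.5})
visit.sections[0].items.append(weight)
```

In this sketch, supporting a new clinical concept means adding an entry to
CONCEPTS; the class definitions stay as they were.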

We are not the first to do something like this of course - there are hundreds
of solutions in a similar vein, various kinds of business rule modelling
languages and so on. What we do in openEHR is just one (fairly comprehensive,
we think) approach to solving the problem of unmaintainable software systems.
It also happens to help solve the problem of getting the requirements from the
domain experts - far better than 'use case analysis a la Jacobson' does.

It now seems quite clear to me that building any real world concepts directly
into the software infrastructure of an information system is a mistake. We can
do far better than that, and we always should aim to do so.

Having said all this, there is the obvious question: well, what do we use
the database for? My view on that (and there are far better experts than I) is
this: once you clear your head of any idea of trying to model patients,
visits, pathology results, prescriptions, medications, etc. in the
database, you have an incredible freedom to use the power of these systems -
and most modern databases are extremely powerful. We just use them in the
wrong way a lot of the time. Using a relational database for medicine in my
view makes sense as long as you build a schema that has no first-order domain
concepts in it, and instead encodes information in a generic way. That could
be blobs, paths, indexable columns, or some other method. There are many
variations available, and I doubt if we have really started to understand them.
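
One possible shape for such a generic schema, sketched with Python's built-in
sqlite3 module. This is an illustration under assumptions, not a description
of the Ocean persistence layer: the node table, its columns, the path strings
and the put/get/by_path helpers are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema contains no domain concepts: only record ids, paths and blobs.
conn.execute(
    """
    CREATE TABLE node (
        record_id TEXT NOT NULL,   -- which record the value belongs to
        path      TEXT NOT NULL,   -- string path locating the value in the record
        value     BLOB NOT NULL,   -- serialised content (JSON here; could be XML)
        PRIMARY KEY (record_id, path)
    )
    """
)
# An index on path supports queries across records for the same item.
conn.execute("CREATE INDEX node_path ON node (path)")


def put(record_id: str, path: str, value) -> None:
    """Store a value as a JSON blob under (record_id, path)."""
    blob = json.dumps(value).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO node (record_id, path, value) VALUES (?, ?, ?)",
        (record_id, path, blob),
    )


def get(record_id: str, path: str):
    """Return the stored value, or None if nothing is stored at that path."""
    row = conn.execute(
        "SELECT value FROM node WHERE record_id = ? AND path = ?",
        (record_id, path),
    ).fetchone()
    return None if row is None else json.loads(row[0].decode("utf-8"))


def by_path(path: str) -> dict:
    """Map record_id to value for every record that has this path."""
    rows = conn.execute(
        "SELECT record_id, value FROM node WHERE path = ?", (path,)
    ).fetchall()
    return {rid: json.loads(blob.decode("utf-8")) for rid, blob in rows}


# Made-up path for illustration; not real openEHR path syntax.
SYSTOLIC = "/encounter/vital_signs/blood_pressure/systolic"
put("p1", SYSTOLIC, 120)
put("p2", SYSTOLIC, 135)
```

Here a new kind of clinical data needs only new path strings; the table
definition does not change.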

All these comments also apply to object models - encode the real phenomena
your system processes directly into the software classes of your system, and
you will have a never-ending game of catch-up on your hands.

- thomas beale

