Syan Tan wrote:
> Couldn't you file a request for an academic replication system, like a
> gossip architecture system?
Um, file a request with whom? Academics don't do anything without being
paid for it, these days.

> BTW, I'm not quite clear about why Lamport clocks as opposed to vector
> clocks are used; a Lamport clock is just one sequence number for one
> site, which is kept ordered whenever sites send messages to each
> other. Vector clocks are sequence numbers kept at every site about
> every site, so when messages are received, changes can be causally
> ordered between more than one other site. What sort of ordering is
> being aimed for in the NetEpi multi-site application, and why?

Sorry - I said "some variation on Lamport clocks" by which I meant a
vector or logical clock, as you describe - they all grew out of the
original Lamport idea, I believe. Causal ordering is the aim. Multiple
flu clinics during a flu pandemic - a person may present to more than
one clinic, and clinics may have intermittent or unreliable
connections.

Tim C

> On Sun Apr 30 9:06, Tim Churches sent:
>
> James Busser wrote:
> > On Apr 29, 2006, at 4:35 AM, Tim Churches wrote:
> >
> >> (I keep wondering whether we should have used an EAV pattern for
> >> storage
> >
> > Educated myself (just a bit) here
> >
> > http://www.health-itworld.com/newsitems/2006/march/03-22-06-news-hitw-dynamic-data
> > http://www.pubmedcentral.gov/articlerender.fcgi?artid=61439
> > https://tspace.library.utoronto.ca/handle/1807/4677
> > http://www.jamia.org/cgi/content/abstract/7/5/475
>
> Thanks - we have copies of the latter three papers but I hadn't seen
> the first article.
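[Editor's note: a minimal sketch of the vector-clock idea discussed
above, in Python. This is purely illustrative - the class and clinic
names are invented, and this is not NetEpi code. Each site keeps a
counter per site; comparing two vectors tells us whether one change
causally precedes another, or whether the two are concurrent.]

```python
# Hypothetical vector clock sketch -- not NetEpi code.
class VectorClock:
    def __init__(self, site_id, site_ids):
        self.site_id = site_id
        self.clock = {s: 0 for s in site_ids}

    def tick(self):
        # Local event, eg a patient presents at this clinic.
        self.clock[self.site_id] += 1

    def merge(self, other_clock):
        # On receiving an update: element-wise max, then count the receive.
        for site, count in other_clock.items():
            self.clock[site] = max(self.clock.get(site, 0), count)
        self.tick()

    def compare(self, other_clock):
        # Returns 'before', 'after', 'concurrent', or 'equal'.
        less = any(self.clock.get(s, 0) < c for s, c in other_clock.items())
        more = any(c > other_clock.get(s, 0) for s, c in self.clock.items())
        if less and more:
            return 'concurrent'
        if less:
            return 'before'
        if more:
            return 'after'
        return 'equal'

# A person presents at clinic A; A's update later reaches clinic B:
a = VectorClock('clinic_a', ['clinic_a', 'clinic_b'])
b = VectorClock('clinic_b', ['clinic_a', 'clinic_b'])
a.tick()            # clinic A records the presentation
b.merge(a.clock)    # clinic B receives A's update
```

Two updates that compare as 'concurrent' are exactly the conflicts an
asynchronous replication engine cannot order causally and must
reconcile at the application level.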
> Of course, PostgreSQL muddies the waters, because the way it works
> under the bonnet (hood, engine cover) is rather similar to (but not
> identical to) the EAV model - but all that is hidden behind the SQL
> interface, which is not easy to bypass.
>
> We really wanted to use openEHR when we started in 2003 - openEHR can
> be seen as a very sophisticated metadata layer which can be used with
> an EAV-like back-end storage schema - but no openEHR storage engines
> were available then, and when I asked again earlier this year, there
> were still none available (as open source, or closed source on a
> commercial basis) in a production-ready form.
>
> Anyway, plain old PostgreSQL tables work rather well, and are fast
> and reliable for large datasets - but we will need to build our own
> replication engine, I now think. What we really need is multi-master
> DB replication which can cope with slow and unreliable networks
> (hence it has to use asynchronous updates, not tightly-coupled
> synchronous updates such as multi-phase commits) and with frequent
> "network partitions". If we are funded to do that, then we'll write
> it in Python, probably using a stochastic "epidemic" model for the
> data propagation algorithm and some variation on Lamport logical
> clocks for data synchronisation. It also needs to propagate schema
> changes. Hopefully we can make it sufficiently general that it might
> have utility for GNUmed, eg when a copy of a clinic database is taken
> away on a laptop for use in the field, eg at a nursing home or a
> satellite clinic, and network connection and synchronisation only
> occur occasionally. However, we need the replication to scale to 200
> to 300 sites.
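[Editor's note: the stochastic "epidemic" propagation idea quoted above
can be sketched in Python, the language the message proposes. This is a
toy simulation with invented names, not a design for the actual engine:
each site periodically picks random peers and exchanges any updates the
other side lacks, so an update spreads through the whole population in
roughly O(log N) rounds without any site needing a reliable connection
to every other.]

```python
# Toy epidemic (anti-entropy gossip) propagation -- illustrative only.
import random

class Site:
    def __init__(self, name):
        self.name = name
        self.updates = {}  # update_id -> payload

    def local_update(self, update_id, payload):
        self.updates[update_id] = payload

    def gossip_with(self, peer):
        # Anti-entropy exchange: each side keeps anything it was missing.
        for uid, payload in list(self.updates.items()):
            peer.updates.setdefault(uid, payload)
        for uid, payload in list(peer.updates.items()):
            self.updates.setdefault(uid, payload)

def gossip_round(sites, fanout=1):
    # The stochastic step: each site contacts `fanout` random peers.
    for site in sites:
        peers = [s for s in sites if s is not site]
        for peer in random.sample(peers, fanout):
            site.gossip_with(peer)

# One clinic records a presentation; gossip until every site has it.
sites = [Site('clinic_%d' % i) for i in range(10)]
sites[0].local_update('rec-1', {'patient': 'X', 'seen_at': 'clinic_0'})
rounds = 0
while not all('rec-1' in s.updates for s in sites):
    gossip_round(sites)
    rounds += 1
```

In practice the exchange step would compare logical clocks rather than
whole update sets, so that only missing or conflicting changes cross
the (slow, intermittent) link.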
> Interestingly, most of the commercial multi-master database
> replication products just gloss over the issue of data integrity, or
> leave it up to the application - but research in the 1990s showed
> that that is not good enough in more complex situations with more
> than a few master DB instances.
>
> >> - Slony would have worked with that..).
>
> There is a Slony-2 project, being done here in Sydney, but it is
> focussing on multi-master synchronous updates, ie multiple servers in
> a single data centre, for load-balancing of write tasks as well as
> read tasks (for which Slony-1 can be used to facilitate
> load-balancing).
>
> Sorry to rave on, but don't let anyone tell you that there are no
> fundamental data management issues yet to be addressed by open source
> or commercial software.
>
> Tim C
>
> _______________________________________________
> Gnumed-devel mailing list
> [email protected]
> http://lists.gnu.org/mailman/listinfo/gnumed-devel
