On 26/01/2016 09:51, Bert Verhees wrote:
On 26-01-16 10:38, Jan-Marc Verlinden wrote:
# Our first version was Java-based with a Postgres DB, everything stored as path/values. Every query would take about a second. We did not even try complex queries..:-). Also the GUI side did not know what to do with the path/values.
Hi Jan-Marc,

There were some problems handling the path/values; most of them came down to giving a semantic meaning to the paths. Storing a path and its corresponding value is very, very quick. I asked database specialists, and they say this is the best way to go up to billions of records.
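A minimal sketch of what path/value storage can look like, assuming a single table with one row per leaf value and an index on the path column. The table name, column names, EHR ids and the example values are illustrative assumptions, not taken from any of the systems discussed in this thread; SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Illustrative leaf path: magnitude of a DV_QUANTITY under the systolic node.
MAGNITUDE = "/data[id2]/events[id7]/data[id4]/items[id5]/value/magnitude"
UNITS = "/data[id2]/events[id7]/data[id4]/items[id5]/value/units"

# Assumed schema: one row per (EHR, path, value), values stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE path_value (ehr_id TEXT, path TEXT, value TEXT)")
conn.execute("CREATE INDEX idx_path ON path_value (path)")

# Made-up sample data for two hypothetical EHRs.
rows = [
    ("ehr-1", MAGNITUDE, "120"),
    ("ehr-1", UNITS, "mm[Hg]"),
    ("ehr-2", MAGNITUDE, "135"),
]
conn.executemany("INSERT INTO path_value VALUES (?, ?, ?)", rows)


def values_at(path):
    """Return (ehr_id, value) pairs stored at exactly this path."""
    cur = conn.execute(
        "SELECT ehr_id, value FROM path_value WHERE path = ? ORDER BY ehr_id",
        (path,),
    )
    return cur.fetchall()
```

Writing a row is a plain insert and an exact-path lookup uses the index, which is the cheap part. The sketch does nothing about the harder part named above, giving the paths a semantic meaning.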

This is also what I would expect. Path-based storage does of course rely on very smart ways to match terms in a query to paths. There are some tricks to use here. For example, the path to the systolic BP DV_QUANTITY node from the archetype is

/data[id2|history|]/events[id7|any event|]/data[id4]/items[id5|Systolic|]/value

In the whole of CKM there are probably about 7,000 'interesting' leaf paths (if you assume that you crunch DATA_VALUE subtypes into little blobs). That's a tiny number. Assume that when everything in medicine has been modelled (outside of genomics and proteomics) we have 50,000 such 'paths of interest'. That's still a very small number. These paths can be mapped in smart ways to a 64-bit number space so that finding out if a specific query term is in some EHR is very quick. When you include a coded list of archetype ids in the mix, I think querying can be made extremely quick.
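One way to picture the 64-bit mapping, as a sketch only: strip the human-readable annotations from a path, hash the result down to 64 bits, and keep a set of those numbers per EHR. The hashing scheme (first 8 bytes of SHA-256), the function names, the per-EHR index and the second path are all assumptions made for illustration; the "smart ways" referred to above are not specified in this thread.

```python
import hashlib
import re
from typing import List

RAW_SYSTOLIC = (
    "/data[id2|history|]/events[id7|any event|]"
    "/data[id4]/items[id5|Systolic|]/value"
)


def normalize(path: str) -> str:
    """Drop the |text| annotations so only the id codes remain."""
    return re.sub(r"\|[^|\]]*\|", "", path)


def path_id(path: str) -> int:
    """Map a path to a 64-bit integer (assumed scheme: truncated SHA-256)."""
    digest = hashlib.sha256(normalize(path).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# Hypothetical second path, used only to populate the example index.
OTHER = "/data[id2]/events[id7]/data[id4]/items[id6]/value"

# Hypothetical index: for each EHR, the set of path ids occurring in it.
ehr_index = {
    "ehr-1": {path_id(RAW_SYSTOLIC), path_id(OTHER)},
    "ehr-2": {path_id(OTHER)},
}


def ehrs_containing(path: str) -> List[str]:
    """Return the EHR ids whose index contains the given path."""
    pid = path_id(path)
    return sorted(ehr for ehr, ids in ehr_index.items() if pid in ids)
```

With 50,000 paths in a 64-bit space, accidental hash collisions are very unlikely, and the membership test is a single integer set lookup per EHR. A real system would more likely assign ids from a registry of known paths than hash them.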

The devil is in the details. Various large DBs have used path-based approaches in the past; Informix was one.


It is also easy to migrate to another database, for clustering or other reasons.

But there are some problems to solve, which were harder to solve five years ago.

One problem is the GUI builders: they are looking at a database approach that is difficult to understand, easy to make errors in, and hard to debug.
They need JSON to write their datasets in.

The other problem is querying. As long as the queries are predefined, you can do anything, but then you are no different from an old monolithic system.
But writing new templates relies heavily on on-the-fly query building.

There has, however, been some technological progress, also in the open source domain.

The path/value storage could come to life again with the help of ANTLR, which can help to interpret AQL for this purpose. I even think this is promising.

Let engineers read The Definitive ANTLR 4 Reference by Terence Parr, and read it with path/values in the back of their minds. Both the GUI problem and the query problem can be solved.

It should be worth the time spent and the price of the book ;-)


It is.

- thomas
_______________________________________________
openEHR-technical mailing list
openEHR-technical@lists.openehr.org
http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org