Hi Thomas,

*this is also what I would expect. Path-based storage does rely on very smart ways to match terms in a query to paths of course.*

Did you test, or is this theoretical?

Jan-Marc

On Tue, 26 Jan 2016 at 11:36, Thomas Beale <[email protected]> wrote:

> On 26/01/2016 09:51, Bert Verhees wrote:
>
> > On 26-01-16 10:38, Jan-Marc Verlinden wrote:
> >
> > > - Our first version was Java based with a postgres DB, everything
> > >   stored as path/values. Every query would take about a second.
> > >   We did not even try complex queries..:-). Also the GUI side did
> > >   not know what to do with the path/values.
> >
> > Hi Jan-Marc,
> >
> > There were some problems handling the path/values; most problems
> > were based on giving a semantic meaning to the paths.
> > Storing a path and its corresponding value is very, very quick. I
> > asked database specialists, and they say this is the best way to go
> > until billions of records.
>
> this is also what I would expect. Path-based storage does rely on very
> smart ways to match terms in a query to paths of course. There are
> some tricks to use here. For example, the path to the systolic BP
> DV_QUANTITY node from the archetype is
>
> /data[id2|history|]/events[id7|any event|]/data[id4]/items[id5|Systolic|]/value
>
> In the whole of CKM there are probably about 7,000 'interesting' leaf
> paths (if you assume that you crunch DATA_VALUE subtypes into little
> blobs). That's a tiny number. Assume that when they've modelled
> everything in medicine (outside of genomics and proteomics) we have
> 50,000 such 'paths of interest'. That's a very small number. These
> paths can be mapped in smart ways to a 64-bit number space so that
> finding out if a specific query term is in some EHR is very quick.
> When you include a coded list of archetype ids in the mix, I think
> querying can be made extremely quick.
>
> The devil is in the details. Various large DBs used path-based
> approaches in the past, Informix was one.
>
> > Also easy to migrate to another database, for clustering or other
> > reasons.
> >
> > But there are some problems to solve, which were harder to solve
> > five years ago.
> >
> > One problem is the GUI builders: they are looking at a
> > difficult-to-understand database approach, which is also easy to
> > create errors in and hard to debug.
> > They need JSON to write their datasets in.
> >
> > The other problem is querying. As long as the queries are
> > predefined, you can do anything, but then you are no different from
> > an old monolithic system.
> > But writing new templates relies heavily on on-the-fly query
> > building.
> >
> > There are, however, some technological advances, also in the open
> > source domain.
> >
> > The path/value storage could come to a better life again with the
> > help of ANTLR, which can help to interpret AQL for this purpose. I
> > even think this is promising.
> >
> > Let engineers read The Definitive ANTLR 4 Reference by Terence Parr,
> > and read it with path/values in the back of the mind. Both the GUI
> > problem and the query problem can be solved.
> >
> > It should be worth the time spent and the price of the book ;-)
>
> It is.
>
> - thomas
> _______________________________________________
> openEHR-technical mailing list
> [email protected]
> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org

--
Jan-Marc Verlinden
MedVision (mobile)

--
*MedVision BV*
Aagje Dekenkade 71
2251 ZV, Voorschoten
www.medvision360.com

This e-mail message is intended exclusively for the addressee(s). Please inform us immediately if you are not the addressee.
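
For readers following the thread, Thomas's suggestion of mapping leaf paths into a 64-bit number space can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not something described in the thread: the choice of BLAKE2b as the hash, the names path_id and PathIndex, the in-memory dictionary standing in for a database, and the second example path. With only tens of thousands of distinct paths, accidental collisions in a 64-bit space are very unlikely, but a real system would still need to decide how to handle them.

```python
import hashlib

def path_id(path: str) -> int:
    """Map a path string to a 64-bit integer (hypothetical scheme)."""
    # An 8-byte BLAKE2b digest gives a stable value in [0, 2**64).
    digest = hashlib.blake2b(path.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

# The systolic BP path quoted in the thread.
SYSTOLIC = ("/data[id2|history|]/events[id7|any event|]"
            "/data[id4]/items[id5|Systolic|]/value")

class PathIndex:
    """Hypothetical in-memory index: EHR id -> set of 64-bit path ids."""

    def __init__(self):
        self._by_ehr = {}

    def add(self, ehr_id: str, path: str) -> None:
        # Record that this EHR contains data at the given path.
        self._by_ehr.setdefault(ehr_id, set()).add(path_id(path))

    def has_path(self, ehr_id: str, path: str) -> bool:
        # One hash plus one set lookup, independent of path length.
        return path_id(path) in self._by_ehr.get(ehr_id, set())

    def ehrs_with(self, path: str) -> list:
        # All EHRs that contain the given path, in sorted order.
        pid = path_id(path)
        return sorted(e for e, ids in self._by_ehr.items() if pid in ids)

if __name__ == "__main__":
    idx = PathIndex()
    idx.add("ehr-1", SYSTOLIC)
    idx.add("ehr-2", "/hypothetical/other/path")  # made-up path
    print(idx.has_path("ehr-1", SYSTOLIC))
    print(idx.ehrs_with(SYSTOLIC))
```

The sketch only answers "does this EHR contain this path"; matching AQL query terms to paths, which is the hard part discussed above, is not covered.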

