Dear all, I am a student of Prof Lu. I was working on the implementation of the ARM approach last year. I'd like to give some more information about our work.
As Ian McNicoll said, mapping archetype to relational database tables will generate sparse tables. In addition, archetypes can only specify slotted archetypes by wildcards, which may make it difficult to store data from slotted archetypes of templates since there are not tables were generated for slotted archetypes. Therefore, we use "storage templates", an openEHR template (OET) meant for storage purpose, to generate tables. By using the storage templates, columns that are not used will not be generated and slotted archetype can be specified too. As Prof Lu said, we do not convert RM into relational database tables. In fact, we convert RM into our mapping rules, i.e., DV_CODED_TEXT should be mapped to code_string, terminology_id and some other columns according to RM. We use these mapping rules to map templates into talbes. By doing this, we actually flatten a storage template into a one-dimensional structure (I'm not sure whether it could be described like this). The generated tables are quite similar to the tables current healthcare information system are using now. The mapping process: 1. Parse the storage template and archetypes that are used in the template to get a fixed archetype object model 2. Generate objects from the fixed archetype object model according to the mapping rules automatically 3. Use ORM framework to map the objects into relational database. It's true that the schema will grow when the number of storage templates grow. But the schema are generated automatically so that there is no need to maintain them. We only have to maintain the archetypes and templates. We also implement the AQL based on the ARM approach. Since we have flattened the template, the transformed SQL are quite few and much like what the current healthcare information system are using now. We have tested its performance. When the size of the return result is small, there is no big difference between with AQL query and SQL query. When the size of the return result grows, the gap between AQL query and SQL grows. It is because AQL query need to assemble the result into dADL format that costs a lot of time. We are working on it now to optimize the AQL performance. Best regards Jacky Lau From: openehr-technical-request Date: 2016-01-26 02:26 To: openehr-technical Subject: openEHR-technical Digest, Vol 47, Issue 26 Send openEHR-technical mailing list submissions to [email protected] To subscribe or unsubscribe via the World Wide Web, visit http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org or, via email, send a message with subject or body 'help' to [email protected] You can reach the person managing the list at [email protected] When replying, please edit your Subject line so it is more specific than "Re: Contents of openEHR-technical digest..." Today's Topics: 1. Re: Archetype relational mapping - a practical openEHR persistence solution (Bert Verhees) 2. Re: Archetype relational mapping - a practical openEHR persistence solution (Seref Arikan) 3. Re: Archetype relational mapping - a practical openEHR persistence solution (Jan-Marc Verlinden) ---------------------------------------------------------------------- Message: 1 Date: Tue, 26 Jan 2016 10:51:31 +0100 From: Bert Verhees <[email protected]> To: [email protected] Subject: Re: Archetype relational mapping - a practical openEHR persistence solution Message-ID: <[email protected]> Content-Type: text/plain; charset="windows-1252"; Format="flowed" On 26-01-16 10:38, Jan-Marc Verlinden wrote: > # Our first version was Java based with a postgres DB, everything stored > as path/values. > Every query would take about a second. We did not even try complex > queries..:-). Also the GUI side did not know what to do with the > pathvalues. Hi Jan-Marc, There where some problems handling the path/values, most problems were based on giving a semantic meaning to the paths. Storing path and an according a value is very, very quick. I asked database specialists, and they say this is the best way to go until billions of records. Also easy to migrate to another database, for clustering or other reasons. But there are some problems to solve, which were harder to solve five years ago. One problem is the GUI builders, they are looking at a difficult to understand database-approach, and also easy to create errors in, hard to debug. They need JSON to write their datasets in. The other problem is querying. As long as it are predefined queries, you can do anything, but then you are no different from an old monolithic system. But writing new templates heavily relies on on the fly query building There are however, some technological progresses, also in the open source domain. The path/value storage could come to a better life again with help of ANTLR, which can help to interpret AQL for this purpose. I even think this is promising. Let engineers read the Definitive ANTLR4 Reference by Terence Parr, and read it with path/values in the back of the mind. Both the GUI problem as the query problem can be solved. It should be worth the spent time and the price of the book ;-) best regards and good luck Bert Verhees -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.openehr.org/pipermail/openehr-technical_lists.openehr.org/attachments/20160126/403735cb/attachment-0001.html> ------------------------------ Message: 2 Date: Tue, 26 Jan 2016 09:52:49 +0000 From: Seref Arikan <[email protected]> To: For openEHR technical discussions <[email protected]> Subject: Re: Archetype relational mapping - a practical openEHR persistence solution Message-ID: <ca+4thdpj5sdty_n-ccn3acbvnz1yqa72-sewvrl2wa6kukp...@mail.gmail.com> Content-Type: text/plain; charset="utf-8" Hi Jan-Marc, To clarify: when you say huge: do you mean that the result set is huge, or the the amount of existing data is huge? Are you able to comment on how query performance changes/stays the same when the result set size begins to grow? On Tue, Jan 26, 2016 at 9:38 AM, Jan-Marc Verlinden < [email protected]> wrote: > I believe your paper is about performance of an openEHR based system with > a relational database. It does not argues if this is the right approach. > And specifically on the performance we could not agree more...:-). > > In the past year we worked on three different versions, all completely > openEHR compliant. We compared them, so I believe we are entitled to > discuss. Let's see: > > 1. Our first version was Java based with a postgres DB, everything > stored as path/values. > Every query would take about a second. We did not even try complex > queries..:-). Also the GUI side did not know what to do with the > pathvalues. > 2. Second version was completely XML based, lots of Java with an Exist > DB. This version can be found in GitHub. > Results: a single query over 900 records took 100ms. But the scary > part was that performance exploded linear, so 90000 would take about 10 > seconds, even for the most simple (already indexed and optimized) query. > 3. Our current stack converts archetypes to an Object model and > persists this model to postgres. Indeed this needs some fancy tricks > (@Thomas), but it's doable...:-). > Performance is comparable with the findings in the paper, even with > huge queries. Performance is steady at about 1 to 3ms per query. No > optimization done yet, is not yet needed but could/will make it even > faster. > > Cheers, Jan-Marc > > Op di 26 jan. 2016 om 09:55 schreef ??? <[email protected]>: > >> Dear all, >> >> I just found there are lots of comments on our recent paper. Thanks for >> all these comments, I will read them and reply later. >> >> @Shinji, Of course the slides could be shared. >> >> Regards >> Xudong >> >> 2016-01-26 10:44 GMT+08:00 Shinji KOBAYASHI <[email protected]>: >> >>> Hi Ian and all, >>> >>> We, openEHR Japan had an unconference with Dr Lu and he gave us a >>> presentation for us about this research. >>> I will ask him if the slides would be shareble. >>> >>> Shinji KOBAYASHI >>> >>> 2016-01-25 22:04 GMT+09:00 Ian McNicoll <[email protected]>: >>> >>>> Interesting paper from China >>>> >>>> >>>> http://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-015-0212-0 >>>> >>>> Ian >>>> >>>> Dr Ian McNicoll >>>> mobile +44 (0)775 209 7859 >>>> office +44 (0)1536 414994 >>>> skype: ianmcnicoll >>>> email: [email protected] >>>> twitter: @ianmcnicoll >>>> >>>> Co-Chair, openEHR Foundation [email protected] >>>> Director, freshEHR Clinical Informatics Ltd. >>>> Director, HANDIHealth CIC >>>> Hon. Senior Research Associate, CHIME, UCL >>>> >>>> _______________________________________________ >>>> openEHR-technical mailing list >>>> [email protected] >>>> >>>> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org >>>> >>> >>> >>> _______________________________________________ >>> openEHR-technical mailing list >>> [email protected] >>> >>> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org >>> >> >> _______________________________________________ >> openEHR-technical mailing list >> [email protected] >> >> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org > > -- > > Jan-Marc Verlinden > MedVision (mobile) > > *MedVision BV* > Aagje Dekenkade 71 > 2251 ZV, Voorschoten > www.medvision360.com > > This e-mail message is intended exclusively for the addressee(s). Please > inform us immediately if you are not the addressee. > > _______________________________________________ > openEHR-technical mailing list > [email protected] > > http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.openehr.org/pipermail/openehr-technical_lists.openehr.org/attachments/20160126/143aa6c1/attachment-0001.html> ------------------------------ Message: 3 Date: Tue, 26 Jan 2016 10:25:52 +0000 From: Jan-Marc Verlinden <[email protected]> To: For openEHR technical discussions <[email protected]> Subject: Re: Archetype relational mapping - a practical openEHR persistence solution Message-ID: <CAAwigtU7EfLNECGtaD=V4rSVt9gOZz9x+BxmqTW=yJr9=-d...@mail.gmail.com> Content-Type: text/plain; charset="utf-8" Interesting discussion..:-). Concerns for our current architecture: - Should be completely according openEHR IM - Accessible for external developers (front-end) through an API. - KISS; hide all openEHR complexity for GUI developers. So no pathvalues output, just simple GET and PUT. - Scalable, should be able to handle large queries and large datasets without any (huge) challenges. - Providing REST access (Json documents) - Access with Javadocs Some options we compared: - Exist DB is not ACID https://en.wikipedia.org/wiki/ACID. So in fact not usable at all. Also performance sucks.. (we tested). - Oracle rdbms is indeed ACID. You have to convert the archetype (ADL or XML) to XSD, that will be used to store the XML in a relational DB (for better performance). - Like Bert already stated; MarkLogic is somewhat pricey..;-), but -could- provide better performance. We never tested. So we came back to good old postgres and store the archetype Object model into a relational DB straight forward. None of the compared systems could even come close in matters of performance and scalability. In the end we -could- provide XML access alongside with current Json, but no GUI developer asked for it..:-). Other papers that show the same: - Performance of XML Databases for Epidemiological Queries in Archetype-Based EHRs. By Sergio Miranda Freire, Erik Sundvall, Daniel Karlsson and Patrick Lambrix - AN OPENEHR REPOSITORY BASED ON A NATIVE XML DATABASE. By Linda Velte, Tiago Pedrosa, Carlos Costa and Jos? Lu?s Oliveira. Thanks for all input, Jan-Marc Op di 26 jan. 2016 om 10:38 schreef Jan-Marc Verlinden < [email protected]>: > I believe your paper is about performance of an openEHR based system with > a relational database. It does not argues if this is the right approach. > And specifically on the performance we could not agree more...:-). > > In the past year we worked on three different versions, all completely > openEHR compliant. We compared them, so I believe we are entitled to > discuss. Let's see: > > 1. Our first version was Java based with a postgres DB, everything > stored as path/values. > Every query would take about a second. We did not even try complex > queries..:-). Also the GUI side did not know what to do with the > pathvalues. > 2. Second version was completely XML based, lots of Java with an Exist > DB. This version can be found in GitHub. > Results: a single query over 900 records took 100ms. But the scary > part was that performance exploded linear, so 90000 would take about 10 > seconds, even for the most simple (already indexed and optimized) query. > 3. Our current stack converts archetypes to an Object model and > persists this model to postgres. Indeed this needs some fancy tricks > (@Thomas), but it's doable...:-). > Performance is comparable with the findings in the paper, even with > huge queries. Performance is steady at about 1 to 3ms per query. No > optimization done yet, is not yet needed but could/will make it even > faster. > > Cheers, Jan-Marc > > Op di 26 jan. 2016 om 09:55 schreef ??? <[email protected]>: > >> Dear all, >> >> I just found there are lots of comments on our recent paper. Thanks for >> all these comments, I will read them and reply later. >> >> @Shinji, Of course the slides could be shared. >> >> Regards >> Xudong >> >> 2016-01-26 10:44 GMT+08:00 Shinji KOBAYASHI <[email protected]>: >> >>> Hi Ian and all, >>> >>> We, openEHR Japan had an unconference with Dr Lu and he gave us a >>> presentation for us about this research. >>> I will ask him if the slides would be shareble. >>> >>> Shinji KOBAYASHI >>> >>> 2016-01-25 22:04 GMT+09:00 Ian McNicoll <[email protected]>: >>> >>>> Interesting paper from China >>>> >>>> >>>> http://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-015-0212-0 >>>> >>>> Ian >>>> >>>> Dr Ian McNicoll >>>> mobile +44 (0)775 209 7859 >>>> office +44 (0)1536 414994 >>>> skype: ianmcnicoll >>>> email: [email protected] >>>> twitter: @ianmcnicoll >>>> >>>> Co-Chair, openEHR Foundation [email protected] >>>> Director, freshEHR Clinical Informatics Ltd. >>>> Director, HANDIHealth CIC >>>> Hon. Senior Research Associate, CHIME, UCL >>>> >>>> _______________________________________________ >>>> openEHR-technical mailing list >>>> [email protected] >>>> >>>> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org >>>> >>> >>> >>> _______________________________________________ >>> openEHR-technical mailing list >>> [email protected] >>> >>> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org >>> >> >> _______________________________________________ >> openEHR-technical mailing list >> [email protected] >> >> http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org > > -- > > Jan-Marc Verlinden > MedVision (mobile) > -- Jan-Marc Verlinden MedVision (mobile) -- *MedVision BV* Aagje Dekenkade 71 2251 ZV, Voorschoten www.medvision360.com This e-mail message is intended exclusively for the addressee(s). Please inform us immediately if you are not the addressee. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.openehr.org/pipermail/openehr-technical_lists.openehr.org/attachments/20160126/95a78791/attachment.html> ------------------------------ Subject: Digest Footer _______________________________________________ openEHR-technical mailing list [email protected] http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org ------------------------------ End of openEHR-technical Digest, Vol 47, Issue 26 *************************************************
_______________________________________________ openEHR-technical mailing list [email protected] http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org

