Hi All,

Today was "Stress Test Day" for my Python ADL 1.4 parser. The goal was to see how well I could parse ADL 1.4 into a data structure that can be used to build in-memory Python archetype objects from the reference model, so that they can be persisted in an archetype repository for local use.
I used almost all of the archetypes under SVN control on openEHR.org. The exceptions were the test ones involving automotive information and the duplicates between dev-uk-nhs, dev-uk-nhs-scotland and dev-nl-tno. The dev-uk-nhs tree took precedence. I'll probably do another test later on just the dev tree, depending on any feedback I get.

I think my findings are interesting for the following groups of people, hence the cross-posting:

1) Archetype tools developers
2) Archetype tools users
3) Clinical governance members
4) Archetype consumers
5) Python openEHR developers

Of the 755 ADL files tested, 216 failed parsing. These failures are a combination of:

a) missing required sections such as 'archetype', 'definition', 'language', etc.
b) missing structure closings such as '}' and '>' (these could very well be errors in the parser itself)
c) unexpected characters in certain locations (I still have to evaluate these to see whether they are legal in the spec)

The main reason I think this is interesting enough to cross-post is that several ADL files pass parsing in one version but not in another. My question is: were they edited with different versions of the same tool, with different tools, or was the output just somehow different? I propose that the description section of the archetype carry an attribute that identifies the tool and version number used for creation/editing.

While all of these ADL files are listed as being in the development tree, I hope that this information may inform the Clinical Governance group about differences between files, and especially about gross missing items.

I also note that some of the ADL files are intended to be tests. My feeling is that some of these have been around for some time and may not be truly 'test' files. Should they be removed or updated? I vote for updated. A set of really good test ADL files is essential and will really simplify the work of openEHR developers.
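As a rough illustration of failure categories (a) and (b) above, here is a minimal pre-screening sketch. It is not the logic in adl_1_4.py and it uses no Pyparsing; all function names are hypothetical. It assumes that section keywords start at column 0 on their own line, that strings are delimited by double quotes, and that '--' starts a comment. It only checks curly braces, because '<' and '>' can also appear as comparison operators, which makes a naive balance check on them unreliable.

```python
import re
from pathlib import Path

# Section keywords named in the post; extend this tuple as needed (assumption).
REQUIRED_SECTIONS = ("archetype", "definition", "language")


def missing_sections(text, required=REQUIRED_SECTIONS):
    """Return the required keywords that never start a line at column 0."""
    return [kw for kw in required
            if not re.search(rf"^{kw}\b", text, re.MULTILINE)]


def unclosed_braces(text):
    """Return (line, col) for each '{' that is never closed.

    Braces inside double-quoted strings and after '--' comments are ignored.
    A stray '}' with nothing open is silently skipped in this sketch.
    """
    stack = []
    in_string = False  # kept across lines so multi-line strings are skipped
    for lineno, line in enumerate(text.splitlines(), start=1):
        col = 0
        while col < len(line):
            ch = line[col]
            if in_string:
                if ch == "\\":
                    col += 1  # skip the escaped character
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
            elif line.startswith("--", col):
                break  # rest of the line is a comment
            elif ch == "{":
                stack.append((lineno, col + 1))
            elif ch == "}" and stack:
                stack.pop()
            col += 1
    return stack


def classify(text):
    """Return a list of human-readable problems; an empty list means 'passed'."""
    problems = [f"missing section '{kw}'" for kw in missing_sections(text)]
    problems += [f"unclosed '{{' at line {ln}, col {col}"
                 for ln, col in unclosed_braces(text)]
    return problems


def scan_tree(root):
    """Map each *.adl file under root to its list of problems."""
    return {path: classify(path.read_text(encoding="utf-8", errors="replace"))
            for path in sorted(Path(root).rglob("*.adl"))}


# Made-up fragment for illustration only; it is not one of the tested files.
sample = '''archetype (adl_version=1.4)
    openEHR-EHR-OBSERVATION.example.v1
concept
    [at0000]  -- a comment with a stray { brace
definition
    OBSERVATION[at0000] matches {
        data matches {
            HISTORY[at0001] matches {*}
        }
'''

for problem in classify(sample):
    print(problem)
```

A check like this cannot replace the parser, but running it alongside the parser might help separate genuinely malformed files from cases where the parser itself is at fault.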
I have uploaded (see link below) the following files for information:

ADLParserTest.txt -- shows the console output of all the files tested.
parser_errors.log-1st -- shows the files that failed, along with the reason and the line # and col #.
The *.osh file is mostly for Python developers. It shows the nested list structure of the parsed ADL.
adl_1_4.py -- the parser.
parsertester.py -- calls the parser and logs errors.
tested_adl_files.zip -- the 755 ADL files.

The dev-tree_adl_files.zip and parser_errors.log-2nd are the input/output from the ADL files in the dev tree (knowledge/archetypes/dev/*). 69 failed, and at first glance many are the same ones that failed before, and for the same reasons.

I also want to thank Paul McGuire for writing Pyparsing and for working so hard on this ADL parser.

**********IMPORTANT*****************************************************
The ADL files in this archive MUST NOT be used for any other purpose. They are now outside of the openEHR knowledge management framework and cannot be relied upon for any other use.
************************************************************************

Download the ParserTestResults package:
https://sourceforge.net/project/showfiles.php?group_id=152993

Sincerely,
Tim

--
Timothy Cook, MSc
Health Informatics Research & Development Services
LinkedIn Profile: http://www.linkedin.com/in/timothywaynecook
Skype ID == timothy.cook
**************************************************************
*You may get my Public GPG key from popular keyservers or    *
*from this link http://timothywayne.cook.googlepages.com/home*
**************************************************************

