Re: openEHR draft Expression spec

Pieter Bos Thu, 19 May 2016 08:21:35 -0700

Hello Thomas,

I had already noticed the expressions part and based my experimental 
implementation on that. This email got quite long, so let’s start with a 
summary:


Summary:
- The current spec is quite similar to XPath. We can keep it even closer by 
referring to the XPath specification in more places in our specification. That 
allows for tool reuse and resolves ambiguities in the specification.
- Some other problems/questions were found regarding the spec, including 
grammar ambiguities and how to handle them, and a question about node ids that 
exist in the AOM, but not always in the RM.

I have not implemented the full expression language yet, so I might find more, 
for example when I implement functions.

XPath and the relation to the expressions language:

Before I note my issues, I would like to point out I noticed the language is 
very similar to XPath. In fact, you can convert almost all of the expressions 
language to valid XPath 2.0-expressions with some simple steps:

  1.  Split into separate statements. For every statement:
  2.  Replace Apath shorthand notation with xpath: [id1] to [@archetype_node_id 
= ‘id1’], etc.
  3.  Replace symbolic form of operators with the textual form
  4.  Replace for_all … In … … with ‘every $var in /path satisfies …’
  5.  Replace implies with ‘if … then …’
  6.  Replace exists(expression) with  count(expression) > 0
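
A minimal sketch of steps 2 and 6 above, in Python. The helper name 
apath_to_xpath and both regular expressions are illustrative assumptions, not 
part of the spec: node ids are assumed to look like id1 or id1.2, and the 
argument of exists() is assumed to contain no nested parentheses. Straight 
quotes are used in place of the typographic quotes in the list above.

```python
import re

# Step 2: Apath shorthand [id1] -> XPath predicate [@archetype_node_id = 'id1'].
# Assumption: node ids are "id" followed by digits, optionally dot-separated.
_NODE_ID = re.compile(r"\[(id[0-9]+(?:\.[0-9]+)*)\]")

# Step 6: exists(expr) -> count(expr) > 0.
# Assumption: the argument contains no nested parentheses.
_EXISTS = re.compile(r"\bexists\s*\(([^()]*)\)")


def apath_to_xpath(expression):
    """Hypothetical helper: applies rewrite steps 2 and 6 only."""
    expression = _NODE_ID.sub(r"[@archetype_node_id = '\1']", expression)
    expression = _EXISTS.sub(r"count(\1) > 0", expression)
    return expression


print(apath_to_xpath("exists(/content[id3]/data)"))
```

The remaining steps (operators, for_all, implies) need a real parser rather 
than regular expressions, so they are left out of this sketch.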

Then, get an Xpath implementation that works on your reference model, or just 
convert to XML first. Then for every assertion, evaluate the expression to a 
boolean. For every variable declaration, evaluate the expression to the type 
given in the variable declaration and store it under the given name.
Then implement the standard functions and variables. Functions and variables 
are part of standard Xpath, and so is defining your own.

If you do this, you just implemented full assertion support with very little 
effort and code, and very little chance of mistakes!

(If all you have is XPath 1, the for_all and implies require manual handling. 
You might need to do a bit of extra work for some datatypes, especially 
terminology codes.)

Having noticed this, I’m strongly in favour of keeping the syntax as close to 
Xpath as possible. This means we can reuse tools. Or, if you have reasons to 
write your own (I do, unfortunately), at least you can validate your 
implementations easily by testing against a known implementation.

So I would argue strongly in favour of keeping the $var syntax, because it is 
the same as the xpath-standard.

Some constructions in the expressions have a valid reason why they are 
different than Xpath, for example, the shorthand notation for archetype node 
ids really helps. I would say this could include the exists operator, because 
it expresses something that is often needed and expressing it explicitly allows 
for some really nice features in user interfaces.

However, I think this does not apply to the for_all and implies statements. If 
they could be replaced with the corresponding Xpath-syntax, I would think that 
is a good idea.

Problems in the specification

Here are the problems I found in the spec so far:

Multiple-valued paths and type conversion:

  *   The spec does not say how to handle multiple-valued expressions outside 
for_all statements. We could just follow the XPath standard.
  *   The spec says nothing about type conversion. We could just follow the 
XPath standard.

Whitespace aware grammar

The current definition of the language needs a whitespace aware grammar. If 
not, the following is ambiguous:

$var:Integer ::= /path/to/value
/path/to/another/value > 3

There is no way to tell which part of 
/path/to/value/path/to/another/value belongs to the first or the second 
statement without considering whitespace in your parser. That’s fine in a 
lexer, but harder to do in a parser, although still possible. Alternatively, 
it’s easily solved by demarcating your assertions, for example by requiring a 
‘;’ after every assertion.
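
A small Python sketch of the ambiguity and of the ‘;’ fix. The function names 
are hypothetical, and the splitter assumes a ‘;’ never occurs inside a string 
literal or a predicate.

```python
def strip_whitespace(source):
    """What a parser sees if whitespace is thrown away before parsing."""
    return "".join(source.split())


def split_statements(source):
    """Hypothetical splitter: statements are terminated by ';'."""
    return [part.strip() for part in source.split(";") if part.strip()]


# Two different readings that differ only in where the line break falls.
reading_a = "$var:Integer ::= /path/to/value\n/path/to/another/value > 3"
reading_b = "$var:Integer ::= /path/to/value/path\n/to/another/value > 3"

# Without whitespace the two readings are the same character sequence.
print(strip_whitespace(reading_a) == strip_whitespace(reading_b))

# With an explicit terminator the statement boundary is unambiguous.
print(split_statements("$var:Integer ::= /path/to/value; /path/to/another/value > 3;"))
```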

The same problem happens in a second place:

for_all $var in /path /some/other/path > $var/subpath

This is actually a bit hard to read even for a human, because the space after 
/path is easily overlooked. Both the whitespace-awareness and the human 
readability could easily be solved by replacing for_all with the ‘every … in … 
satisfies …’ syntax of XPath.

Node ids in archetype/reference model objects

In archetypes, some nodes have node ids where the corresponding reference model 
object has none. This is tricky, because a valid path to 
an archetype node, converted to Xpath, is NOT a valid path to the corresponding 
reference model objects. For example, the context attribute of a Composition is 
an EVENT_CONTEXT. This does not have an archetype node id. But it always has 
one in the ADL/AOM. So if you write the path /context[id2], you can convert it 
to Xpath as /composition/context[@archetype_node_id = ‘id2’]. But this will 
result in an empty node set, because there is no matching attribute called 
archetype_node_id. Instead, you could just write /context, which works.
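
The effect can be reproduced with Python's xml.etree.ElementTree, which 
supports a limited subset of XPath. The XML below is a made-up, simplified 
serialisation for illustration only, not the official openEHR XML format. 
Because ElementTree evaluates paths relative to the root element, ./context 
here plays the role of /composition/context.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified serialisation: the composition is locatable and
# carries archetype_node_id, the EVENT_CONTEXT under <context> does not.
document = """
<composition archetype_node_id="id1">
  <context>
    <start_time>2016-05-19T14:38:00</start_time>
  </context>
</composition>
"""

root = ET.fromstring(document)

# Converted from /context[id2]: no <context> element has the attribute.
with_predicate = root.findall("./context[@archetype_node_id='id2']")

# Converted from /context: matches the single <context> element.
without_predicate = root.findall("./context")

print(len(with_predicate), len(without_predicate))
```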

So, there are several options to address this in the specification, for example:

  1.  Specify that paths to non-locatables should NOT have a [idx] predicate, 
even though the id in the archetype is present
  2.  Specify that paths to non-locatables can have a [idx] predicate, but it 
should be ignored in implementations

Option 2 is harder to implement, because you can no longer convert from Apath 
to Xpath without knowledge of the model. But as Apath expressions are not new, 
I’m thinking some other people will have an opinion on this :)
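
To illustrate why option 2 needs knowledge of the model, here is a rough 
Python sketch. The set NON_LOCATABLE_ATTRIBUTES and the function name are 
hypothetical, and keying on the attribute name alone is a simplification: a 
real implementation would look up the type at each path segment in the 
reference model.

```python
import re

# Hypothetical model knowledge: attributes whose values are not locatable.
NON_LOCATABLE_ATTRIBUTES = {"context"}

# One path segment: an attribute name with an optional [idN] predicate.
_SEGMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)(?:\[(id[0-9.]+)\])?$")


def apath_to_xpath_with_model(path):
    """Drop the node id predicate on segments the model says are non-locatable."""
    converted = []
    for segment in path.strip("/").split("/"):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError("unsupported path segment: " + segment)
        name, node_id = match.groups()
        if node_id is None or name in NON_LOCATABLE_ATTRIBUTES:
            converted.append(name)
        else:
            converted.append("%s[@archetype_node_id = '%s']" % (name, node_id))
    return "/" + "/".join(converted)


print(apath_to_xpath_with_model("/context[id2]/other_context[id3]"))
```

With option 1 this lookup is unnecessary, because the predicate is never 
written on such paths in the first place.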

Regards,

Pieter Bos








From: openEHR-technical <[email protected]> on behalf of Thomas Beale 
<[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday 19 May 2016 at 14:38
To: "[email protected]" <[email protected]>
Subject: openEHR draft Expression spec


Pieter,

With respect to the 'rules' bit of ADL, and also GDL, there is a new draft 
'Expressions' spec in the BASE 
component<http://www.openehr.org/releases/BASE/latest/docs/index>. This is a 
working draft, and partly lifted from ADL/AOMs specs (those now just include 
this one), plus some extensions to show how rule extensions are done properly.

This spec proposes an improved syntax, but it's definitely not finished (e.g. I 
am thinking of getting rid of the $var style syntax), and it would be great to 
have some other collaborators on it who have a lot of experience with 
expressions / rules systems. So please have a look and feel free to comment - 
comments here probably make sense since others may be interested.

The draft of this spec will be released soon in a new release of the BASE 
component. All that means is that changes from then need to be documented by 
PRs and CRs in the normal fashion.

- thomas

On 19/05/2016 13:01, Pieter Bos wrote:

It certainly does validate specs. In fact, it already has caused some 
corrections to both the specs and the ANTLR-grammar.

And we already found a few more issues in the specs. I’ll soon file an issue 
report about the rules section, to specify how to handle operators on 
multiple-valued path expressions without a for_all :)

Pieter

From: openEHR-technical <[email protected]> on behalf of Sebastian Garde 
<[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday 18 May 2016 at 18:31
To: "[email protected]" <[email protected]>
Subject: AW: Archie version 0.1.0 released



_______________________________________________
openEHR-technical mailing list
[email protected]
http://lists.openehr.org/mailman/listinfo/openehr-technical_lists.openehr.org
