Validating an object against its archetype

Thomas Beale Thu, 18 Aug 2005 15:06:59 +0100

Rong Chen wrote:

>>
>>
>> The main thing is that the insert_at_path() (or whatever you call it) 
>> call must check the relevant archetype node that the insertion is 
>> allowable, and it should fail if it is not. Your code should always 
>> know in advance what node this is, due to the choosing of archetypes 
>> beforehand (e.g. by the GUI forms).
>
>
> This should work for single object node insertion since most of the 
> object nodes with their runtime name are already created.
>
> But I start to wonder if runtime path is enough for data creation from 
> scratch, which requires:
> 1. path is unique so it can be used as key to group input values;
> 2. link to the original archetype node;


You are right - in my explanation, I did not try to be complete. There 
is a step of major importance which should be carried out by the 
assembled archetype structure in memory (i.e. after slot-filling has 
occurred), namely to generate a default data structure. This should be 
done by defining a method create_default() or similar, which can be 
called recursively down the archetype hierarchy; the effect of this 
method called on every C_OBJECT is to create the appropriate data 
instance e.g. an OBSERVATION or ELEMENT object. The overall result is 
that in the vast majority of cases, there is a reasonable (often 
complete, apart from values) data object to work with. Then 
modifications can proceed by using paths.
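
A minimal sketch of the recursive create_default() idea, in Python. The 
CObject and DataNode classes are hypothetical simplifications invented 
for illustration (they are not the openEHR reference model or any 
kernel's API), and the node ids are made up:

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical stand-in for a data instance such as an ELEMENT."""
    rm_type: str
    archetype_node_id: str
    # attribute name -> list of child DataNode objects
    children: dict = field(default_factory=dict)


@dataclass
class CObject:
    """Hypothetical, much simplified archetype constraint node."""
    rm_type: str
    node_id: str
    # attribute name -> list of child CObject constraints
    attributes: dict = field(default_factory=dict)

    def create_default(self) -> DataNode:
        # Create the data instance for this node, then recurse so that
        # every child constraint produces its own data instance.
        node = DataNode(self.rm_type, self.node_id)
        for attr_name, child_constraints in self.attributes.items():
            node.children[attr_name] = [
                child.create_default() for child in child_constraints
            ]
        return node


# Illustrative archetype fragment; the codes are invented.
archetype = CObject("OBSERVATION", "at0000", {
    "data": [CObject("ELEMENT", "at0004"), CObject("ELEMENT", "at0005")],
})
default = archetype.create_default()
```

Each data node keeps the archetype_node_id of the constraint that 
produced it, so the link back to the archetype is not lost.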

However, even when following this design approach, there are occasions 
when this is not enough. Imagine the archetype in question is the 
openEHR BP measurement archetype; in this archetype, there is a subtree 
called 'protocol', which has existence = 0..1, i.e. optional. It would 
be reasonable for the create_default() method not to create this subtree 
at all in the data. But what if the user does want it (1 time out of 
50)? Then what the software has to do is to call create_default() on the 
protocol subtree of the BP archetype, and attach the result into the 
correct place in the main data structure created with the original 
create_default() call.
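
The optional-subtree case could be sketched like this in Python. The 
CObject class, the dict-based data representation and the node ids are 
all assumptions made for illustration; only the 'protocol' attribute 
name and its 0..1 existence come from the description above:

```python
class CObject:
    """Hypothetical archetype node; each attribute records its minimum existence."""

    def __init__(self, rm_type, node_id, attributes=None):
        self.rm_type = rm_type
        self.node_id = node_id
        # attribute name -> (minimum existence, child CObject)
        self.attributes = attributes or {}

    def create_default(self):
        data = {"rm_type": self.rm_type, "archetype_node_id": self.node_id}
        for name, (min_existence, child) in self.attributes.items():
            # Optional subtrees (existence 0..1) are skipped by default.
            if min_existence >= 1:
                data[name] = child.create_default()
        return data


# Invented fragment loosely modelled on a BP measurement archetype.
bp_archetype = CObject("OBSERVATION", "at0000", {
    "data": (1, CObject("HISTORY", "at0001")),
    "protocol": (0, CObject("ITEM_TREE", "at0011")),
})

bp_data = bp_archetype.create_default()
had_protocol_by_default = "protocol" in bp_data

# The user asks for the protocol after all: build just that subtree from
# its own archetype node and attach it at the matching attribute.
_, protocol_node = bp_archetype.attributes["protocol"]
bp_data["protocol"] = protocol_node.create_default()
```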

In general I would never expect data to be created completely from 
scratch - the only archetype which would justify this is the notional 
'any' archetype which allows absolutely anything.

During this data create/modify phase inside the kernel, there are both 
data instances and archetype instances. Each data instance has to be 
connected logically to its corresponding archetype node. This could be 
effected by pointers/references (which is what we did in our GeHR kernel 
of 3 years ago) or could just be tracked logically by using paths (every 
data node has embedded in it a value for the attribute 
archetype_node_id, inherited from the LOCATABLE class).

>
> 1) is satisfied by using current runtime path,
> 2) is not fully satisfied, since runtime path consists of mainly data 
> node names (LOCATABLE.name()) and node names could either be the text 
> value in the local language of archetype_node_id code of the node or 
> explicitly set by the user. If more than one data node should be 
> created by the same archetype node, the name should include both the 
> text value and a modifier to make it unique.

Not 'could'; it must: it is a condition of correctness that no two 
sibling LOCATABLEs can ever have the same name value.

> The algorithm used for generating unique name modifier can be 
> predefined so it is possible to find the original archetype node. But 
> it will fail if the name is explicitly set by a user.

Automatic generation (e.g. of "_1", "_2", or "(1)", "(2)" etc) will work 
fine in many cases; but user setting has to be allowed as well. 
Everything will be fine as long as a uniqueness precondition is observed 
whenever the relevant insert() or append() method is called, as shown in 
the following signature:
    insert(new_item: LOCATABLE) is
       pre: not this.has(new_item)

where the 'has()' method compares LOCATABLE objects on the basis of 
their names.
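
As a rough Python rendering of that precondition, assuming a 
hypothetical ItemList container and a cut-down Locatable class (neither 
is taken from an actual kernel), with a unique_name() helper added to 
show the "_1", "_2" style of automatic suffix:

```python
class Locatable:
    """Cut-down, hypothetical LOCATABLE: just a name and an archetype node id."""

    def __init__(self, name, archetype_node_id):
        self.name = name
        self.archetype_node_id = archetype_node_id


class ItemList:
    """Hypothetical container for sibling LOCATABLEs."""

    def __init__(self):
        self.items = []

    def has(self, item):
        # Siblings are compared on the basis of their names only.
        return any(existing.name == item.name for existing in self.items)

    def insert(self, new_item):
        # Precondition: not self.has(new_item)
        if self.has(new_item):
            raise ValueError(f"sibling name already in use: {new_item.name!r}")
        self.items.append(new_item)

    def unique_name(self, base):
        # Return base if it is free, otherwise base_1, base_2, ...
        names = {item.name for item in self.items}
        if base not in names:
            return base
        n = 1
        while f"{base}_{n}" in names:
            n += 1
        return f"{base}_{n}"


siblings = ItemList()
siblings.insert(Locatable(siblings.unique_name("reading"), "at0004"))
siblings.insert(Locatable(siblings.unique_name("reading"), "at0004"))
names = [item.name for item in siblings.items]
```

A user-supplied name goes through the same insert() check, so it is 
rejected only if it collides with an existing sibling.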

>
> Besides, it is preferred to have archetype node id as the direct link 
> instead of some text values in local languages, which mainly is meant 
> to be used by humans.
>
> Perhaps a combination of both archetype id and some kind of modifier 
> based on simple algorithm, eg. a counter could be used to form the 
> path for object creation. It is bit like archetype path, but it is 
> unique and easier to process.
>
> Examples:
>
> Suppose the "occurrences" of an archetype node[at0004] is "0..*", 
> meaning from zero to many nodes can be created by this archetype 
> node. The archetype path to this node is:
>
> /[at0001]/action[at0002]/representation[at0003]/items[at0004]/
>
> The paths used to bind input data to the archetype node, notice "-1" 
> and "-2" used as suffix of the archetype node id, which stands for the 
> first and second object node created from the same archetype node.
>
> /[at0001]/action[at0002]/representation[at0003]/items[at0004]-1/value/
> /[at0001]/action[at0002]/representation[at0003]/items[at0004]-2/value/

This is the right kind of idea, but not technically legal. The [] 
characters delimit the qualifier part of each path segment (i.e. the 
optional part required to differentiate the children of an attribute 
which is a List<T> or similar).  So any other characters which serve 
this purpose have to go inside it. In theory you would achieve this by 
doing [at0004-1]. However, in _archetype_ paths, the thing inside the [] 
has to be a legal 'at' code - 'at0004-1' is not. The underlying problem 
is that we are mixing lexical data (the "-1") with codes ('at0004'). In 
a runtime path on the other hand, the items inside the [] _are_ lexical; 
as Rong said above - they are the values of the codes, in the local 
language. You can legally add "-1" or whatever to these with no problem.
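
To make the distinction concrete, here is a small Python sketch that 
splits a path segment into its attribute and [] qualifier and checks 
whether the qualifier would do as an 'at' code. The regular expressions 
are simplifying assumptions (an 'at' code is taken to be "at" plus four 
digits, optionally followed by ".N" parts), not the official path 
grammar:

```python
import re

# Assumed shapes, for illustration only.
SEGMENT = re.compile(r"^(?P<attr>[A-Za-z_]\w*)?(?:\[(?P<qualifier>[^\[\]]*)\])?$")
AT_CODE = re.compile(r"^at\d{4}(?:\.\d+)*$")


def parse_segment(segment):
    """Return (attribute, qualifier); either may be None."""
    match = SEGMENT.match(segment)
    if match is None:
        raise ValueError(f"malformed path segment: {segment!r}")
    return match.group("attr"), match.group("qualifier")


def is_legal_archetype_segment(segment):
    """True if the segment parses and its qualifier, if any, is an 'at' code."""
    try:
        _, qualifier = parse_segment(segment)
    except ValueError:
        return False
    return qualifier is None or AT_CODE.match(qualifier) is not None


checks = {
    seg: is_legal_archetype_segment(seg)
    for seg in ("items[at0004]", "items[at0004]-1", "items[at0004-1]")
}
```

Under these assumptions only "items[at0004]" passes: the suffix outside 
the brackets breaks the segment syntax, and the one inside the brackets 
is not an 'at' code. A runtime segment such as "items[systolic-1]" 
still parses, because its qualifier is free text.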

The downside may seem to be that runtime paths are language-dependent, 
and somehow not universal. This is indeed the case, and it is the 
intention: runtime paths are made of values chosen at runtime for the 
'name' attribute in data nodes; by definition this is in some language 
and relates to the local context. The archetype_node_ids are always 
there to allow universal, language-independent archetype paths to be 
re-constructed.
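
A short Python sketch of how both kinds of path can be derived from the 
same data nodes. The Locatable class here is a hypothetical 
simplification, and the node names ("medication order", "dose-2", etc.) 
are invented; only the 'at' codes and attribute names follow Rong's 
example:

```python
class Locatable:
    """Hypothetical data node: knows its parent attribute, name and node id."""

    def __init__(self, attribute, name, archetype_node_id, parent=None):
        self.attribute = attribute            # attribute of the parent holding this node
        self.name = name                      # runtime name, in the local language
        self.archetype_node_id = archetype_node_id
        self.parent = parent


def _build_path(node, qualifier):
    # Walk up to the root, collecting one segment per node.
    parts = []
    while node is not None:
        parts.append(f"{node.attribute}[{qualifier(node)}]")
        node = node.parent
    return "/" + "/".join(reversed(parts))


def runtime_path(node):
    return _build_path(node, lambda n: n.name)


def archetype_path(node):
    return _build_path(node, lambda n: n.archetype_node_id)


root = Locatable("", "medication order", "at0001")
action = Locatable("action", "administer", "at0002", root)
rep = Locatable("representation", "details", "at0003", action)
item = Locatable("items", "dose-2", "at0004", rep)

rt = runtime_path(item)
at = archetype_path(item)
```

Two siblings created from at0004 would share the same archetype path 
but differ in their runtime paths, since their names must be unique.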

- thomas
-
If you have any questions about using this list,
please send a message to d.lloyd at openehr.org
