{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-16 Thread Andrew Patterson
 [Heath Frankel]
 I understand your point here but if we cannot have some kind of schema
 migration mechanism we will need a new schema per release, which is
 something that I don't think anyone wants.

Yes - I don't want that either. I bring it up because if we are allowing
minor schema changes as well as major ones, we need two versioning
mechanisms.

For the major releases the schema namespace is the indicator.

For minor releases, we have no information about what minor release
the instance was designed for, unless we mandate a mechanism
such as the one you suggested with rm_version in archetype_details.

 [Heath Frankel]
 So you're suggesting moving to a date-oriented namespace so that there is no
 tie to the release number?

Well, the tie would be through documentation - the 1.1.3 release would say
something like - we are using the 2009/01 schemas. It then puts them
on a separate release trajectory. Of course, this would only be useful
if we think there will actually be divergence between the XML schema
changes needed, and the major releases needed.

 [Heath Frankel]
 All archetyped locatables can have this and I would expect all top-level
 objects to be archetyped otherwise they have no domain semantics.

What about actual ARCHETYPE objects? Some tools or systems may
persist the xml serialization of the AOM rather than the ADL. Same
with templates. I'm saying that every XML instance that might ever be
in the wild using the openehr schemas should have some clear mechanism
to tie it to the actual schema files that it _strictly_ conforms to.

 [Heath Frankel]
 I can see the utility of this, but I am not sure that we should mandate it,
 seems a bit of a hack.

It does feel hackish. But it is the only attribute you can add to an xml
instance without needing to change the schema, and the xsi:schemaLocation
field is actually roughly designed for this purpose.

The other option is to add a mandatory fixed meta attribute for Composition,
Archetype etc that is explicitly in the XML ITS, that holds the full
schema release version number.

Andrew



{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-15 Thread Andrew Patterson
 The question is, what do we do when we do have a data breaking schema
 change, like potentially in r1.1?  I suggest that we just go with
 http://schemas.openehr.org/v1.1 and we can assume the old
 http://schemas.openehr.org/v1 meant r1.0.x.  It wasn't expected to have such
 a substantial schema change until r2 but I guess that is the reality of
 software.

I am also concerned about 'non-breaking' changes though - well, it depends
on your definition of a breaking change - but say we add an
optional element to a xml type? This doesn't break any existing data
because all instances will still comply with the new schema - however,
new instances will now potentially not comply with the old schema. Do
we consider these breaking changes? Are we expecting to create a
new namespace if we do this type of change?
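
To make the asymmetry concrete, here is a toy model of the situation. It is
only a sketch: real validation would go through an XSD validator, and the
ToySchema class and the element names (name, value, extra_note) are
hypothetical, not taken from the openEHR schemas.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToySchema:
    """A deliberately simplified stand-in for one XML complex type."""
    required: frozenset
    optional: frozenset = field(default_factory=frozenset)

    def accepts(self, instance_elements):
        """True if all required elements are present and nothing unknown appears."""
        present = set(instance_elements)
        allowed = self.required | self.optional
        return self.required <= present and present <= allowed


# Hypothetical type before and after a minor release adds an optional element.
OLD = ToySchema(required=frozenset({"name", "value"}))
NEW = ToySchema(required=frozenset({"name", "value"}),
                optional=frozenset({"extra_note"}))

old_instance = ["name", "value"]                # written against OLD
new_instance = ["name", "value", "extra_note"]  # written against NEW

print(NEW.accepts(old_instance))  # existing data against the new schema
print(OLD.accepts(new_instance))  # new data against the old schema
```

The first check passes and the second fails, which is exactly the case where
"non-breaking" depends on which direction you are looking from.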

 Jumping ship to another style such as http://schemas.openehr.org/2009/03
 would also be reasonable, it would just mean we have to correlate release
 dates with release numbers.

This has one advantage in that it is possible to release new major
versions of the spec _without_ updating the schema. So if v1.2 needs
no schema changes over v1.1, the 1.2 released schemas can be left as
http://schemas.openehr.org/2009/03 - and people won't be confused
thinking that their data isn't the right 'version'.
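
The correlation between releases and date-style namespaces could be as small
as a lookup table. This is a sketch under assumptions: the table contents and
the function names are hypothetical, and only the 2009/03 namespace appears in
this thread.

```python
# Hypothetical release-to-namespace table kept alongside the release notes.
SCHEMA_NAMESPACE_BY_RELEASE = {
    "1.1": "http://schemas.openehr.org/2009/03",
    "1.2": "http://schemas.openehr.org/2009/03",  # no schema change over 1.1
}


def namespace_for_release(release):
    """Return the schema namespace documented for a spec release."""
    return SCHEMA_NAMESPACE_BY_RELEASE[release]


def releases_sharing(namespace):
    """Return every release whose instances use the given namespace."""
    return sorted(release
                  for release, ns in SCHEMA_NAMESPACE_BY_RELEASE.items()
                  if ns == namespace)


print(releases_sharing("http://schemas.openehr.org/2009/03"))
```

Data written under either release carries the same namespace, so nothing in
the instance suggests it is out of date.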

 There is an optional rm_version in the archetype_details attached to any
 archetyped locatable.  We currently populate this only when we specify a
 template_id on the composition, which is all compositions in our
 applications.  We could suggest that this is required for all compositions
 and make this the handle to determine what schema to use for validation.

Yes, but we would also need a similar mechanism for all top-level XML
artifacts (archetype instances, extracts, PARTY? etc).

The other suggestion I have seen on the interwebs is to make the
xsi:schemaLocation attribute compulsory in instances:

<Composition xsi:schemaLocation="http://schemas.openehr.org/v1
   http://www.openehr.org/releases/1.0.1/its/XML-schema/Composition.xsd">

<Composition xsi:schemaLocation="http://schemas.openehr.org/v1.1
   http://www.openehr.org/releases/1.1.3/its/XML-schema/Composition.xsd">

The schema checker would not actually go and 'fetch' the XSD but would
be looking for known URLs to indicate the exact schema version it was released
against.

http://www.openehr.org/releases/1.0.1/its/XML-schema/Composition.xsd
means 1.0.1 etc

So the schema namespace indicates the major structural version of
the schema - and the xsi:schemaLocation gives the exact schema version
that the instance was created for (even though the minor schema
differences between 1.0.2 and 1.0.3 may not have any data changes).
I don't know whether storing this version is any use - I guess it depends
on the definition of 'breaking changes' I discussed above.
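
A checker along the lines described above might look like the following. It
is a minimal sketch, assuming the release URLs keep the layout quoted in this
message; the function name and the regular expression are my own, and nothing
is fetched over the network.

```python
import re
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Matches only the release URL layout quoted in this thread (an assumption).
RELEASE_URL = re.compile(
    r"^http://www\.openehr\.org/releases/(\d+(?:\.\d+)*)/its/XML-schema/")


def schema_versions(xml_text):
    """Map each namespace in the root's xsi:schemaLocation to a release number.

    The location is treated purely as an identifier; it is never dereferenced.
    """
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    if len(tokens) % 2:
        raise ValueError("xsi:schemaLocation must hold namespace/location pairs")
    versions = {}
    for namespace, location in zip(tokens[0::2], tokens[1::2]):
        match = RELEASE_URL.match(location)
        versions[namespace] = match.group(1) if match else None
    return versions


SAMPLE = """<Composition
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://schemas.openehr.org/v1
      http://www.openehr.org/releases/1.0.1/its/XML-schema/Composition.xsd"/>"""

print(schema_versions(SAMPLE))
```

An unrecognised location maps to None rather than raising, so the caller can
decide whether an unknown schema version is an error.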

Andrew



{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-15 Thread Thomas Beale

I can see some better schemes in the offing from the experts! I would 
not think we should change anything in the current schema approach, as 
this is a minor release. I propose to change only the version ids in the 
relevant schemas to reflect the small change in Base_types. We will 
leave it to a new major version (1.1 or later) to change tack on how to 
manage schema versions. I would suggest that XML experts here would need 
to develop a bullet-proof approach to this (as opposed to just 
suggestions), so that we can implement it in a major release.

- thomas beale

Heath Frankel wrote:
 Hi Andrew,
 See below.

 Heath

   
 The question is, what do we do when we do have a data breaking schema
 change, like potentially in r1.1?  I suggest that we just go with
 http://schemas.openehr.org/v1.1 and we can assume the old
 http://schemas.openehr.org/v1 meant r1.0.x.  It wasn't expected to have such
 a substantial schema change until r2 but I guess that is the reality of
 software.
   
 I am also concerned about 'non-breaking' changes though - well, it depends
 on your definition of a breaking change - but say we add an
 optional element to a xml type? This doesn't break any existing data
 because all instances will still comply with the new schema - however,
 new instances will now potentially not comply with the old schema. Do
 we consider these breaking changes? Are we expecting to create a
 new namespace if we do this type of change?
 

 [Heath Frankel] 
 I understand your point here but if we cannot have some kind of schema
 migration mechanism we will need a new schema per release, which is
 something that I don't think anyone wants.

 Using your example, adding an optional element will cause systems that use
 an older schema to invalidate an instance that populates that element, but at
 least an older system can continue to produce valid instances.  I am not
 sure how much schema validation will be used in a production system; the
 overhead to validate every instance may be too great, so schema validation
 will probably be just a testing and accreditation issue.

 We may need to have some parser rules a bit like HL7 V2 where you accept
 additional elements that you don't expect, within reason, so that we can
 support this forward-compatibility.  This will mean that we may not be able
 to use auto-generated XML serialisers, but there are other XML APIs that can
 be used to support these kinds of rules.
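
The HL7 V2-style rule above (accept elements you don't expect, within reason)
can be sketched with a generic XML API instead of a generated serialiser. This
is an illustration only: the element names and the KNOWN_CHILDREN set are
hypothetical, and a real reader would need a policy for what "within reason"
means.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of child elements this (older) reader understands.
KNOWN_CHILDREN = {"name", "value"}


def read_leniently(xml_text):
    """Return (known fields, names of ignored elements) for one flat element."""
    root = ET.fromstring(xml_text)
    fields, ignored = {}, []
    for child in root:
        tag = child.tag.rsplit("}", 1)[-1]  # drop any {namespace} prefix
        if tag in KNOWN_CHILDREN:
            fields[tag] = (child.text or "").strip()
        else:
            ignored.append(tag)  # tolerated, but reported to the caller
    return fields, ignored


fields, ignored = read_leniently(
    "<item><name>pulse</name><value>72</value>"
    "<extra_note>resting</extra_note></item>")
print(fields, ignored)
```

Returning the ignored names, rather than dropping them silently, lets the
caller log or reject instances that stray too far from what it knows.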

   
 Jumping ship to another style such as http://schemas.openehr.org/2009/03
 would also be reasonable, it would just mean we have to correlate release
 dates with release numbers.
   
 This has one advantage in that it is possible to release new major
 versions of the spec _without_ updating the schema. So if v1.2 needs
 no schema changes over v1.1, the 1.2 released schemas can be left as
 http://schemas.openehr.org/2009/03 - and people won't be confused
 thinking that their data isn't the right 'version'.
 

 [Heath Frankel] 
 So you're suggesting moving to a date-oriented namespace so that there is no
 tie to the release number?
   
 There is an optional rm_version in the archetype_details attached to any
 archetyped locatable.  We currently populate this only when we specify a
 template_id on the composition, which is all compositions in our
 applications.  We could suggest that this is required for all compositions
 and make this the handle to determine what schema to use for validation.
   
 Yes, but we would also need a similar mechanism for all top-level XML
 artifacts (archetype instances, extracts, PARTY? etc).
 
 [Heath Frankel] 
 All archetyped locatables can have this and I would expect all top-level
 objects to be archetyped otherwise they have no domain semantics.
  
   
 The other suggestion I have seen on the interwebs is to make the
 xsi:schemaLocation attribute compulsory in instances:

 <Composition xsi:schemaLocation="http://schemas.openehr.org/v1
    http://www.openehr.org/releases/1.0.1/its/XML-schema/Composition.xsd">

 <Composition xsi:schemaLocation="http://schemas.openehr.org/v1.1
    http://www.openehr.org/releases/1.1.3/its/XML-schema/Composition.xsd">

 The schema checker would not actually go and 'fetch' the XSD but would
 be looking for known URLs to indicate the exact schema version it was
 released against.

 http://www.openehr.org/releases/1.0.1/its/XML-schema/Composition.xsd
 means 1.0.1 etc

 So the schema namespace indicates the major structural version of
 the schema - and the xsi:schemaLocation gives the exact schema version
 that the instance was created for (even though the minor schema
 differences between 1.0.2 and 1.0.3 may not have any data changes).
 I don't know whether storing this version is any use - I guess it depends
 on the definition of 'breaking changes' I discussed above.
 

 [Heath Frankel] 
 I can see the utility of this, but I am not sure that we should mandate it,
 

{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-13 Thread Thomas Beale
An HTML attachment was scrubbed...
URL: 
http://lists.openehr.org/mailman/private/openehr-technical_lists.openehr.org/attachments/20081213/9bec2dda/attachment.html


{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-12 Thread Heath Frankel
Thomas,

The original namespace was designed in a way that would not require a change
until there was a radical RM (or schema design) change.

 

I would suggest that the principles are similar to archetype versions and
revisions, the namespace will require a change when the schema change breaks
existing data, otherwise it is just a version change.

 

The version attribute in the schema does not affect the data at all; it is
version control information for the users of the schema.  I don't care
what goes there, but at the time the openEHR release seemed logical.  If a
new release changes the schema (in a compatible way), the release number of
that specific schema can change; otherwise I don't see any need to upgrade
the version ID just because of a new release.  As I said, I don't care
that much, because it has no effect on the data as long as I know what schema
I need to deploy with my version-specific openEHR components.

 

The question is, what do we do when we do have a data breaking schema
change, like potentially in r1.1?  I suggest that we just go with
http://schemas.openehr.org/v1.1 and we can assume the old
http://schemas.openehr.org/v1 meant r1.0.x.  It wasn't expected to have such
a substantial schema change until r2 but I guess that is the reality of
software.

 

Jumping ship to another style such as http://schemas.openehr.org/2009/03
would also be reasonable, it would just mean we have to correlate release
dates with release numbers.

 

BTW Andrew,

There is an optional rm_version in the archetype_details attached to any
archetyped locatable.  We currently populate this only when we specify a
template_id on the composition, which is all compositions in our
applications.  We could suggest that this is required for all compositions
and make this the handle to determine what schema to use for validation.
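
As a rough sketch of using rm_version as that handle: the element layout
below (archetype_details containing an rm_version element in the
http://schemas.openehr.org/v1 namespace) is my assumption about the
serialisation, and the schema directory table and function name are
hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"o": "http://schemas.openehr.org/v1"}

# Hypothetical table: which deployed schema set validates which rm_version.
SCHEMA_SET_BY_RM_VERSION = {
    "1.0.1": "schemas/1.0.1",
    "1.0.2": "schemas/1.0.2",
}


def schema_set_for(xml_text):
    """Return the schema set for an instance, or None if it cannot be decided."""
    root = ET.fromstring(xml_text)
    node = root.find("o:archetype_details/o:rm_version", NS)
    if node is None or not (node.text or "").strip():
        return None  # rm_version is optional, so it may simply be absent
    return SCHEMA_SET_BY_RM_VERSION.get(node.text.strip())


WITH_VERSION = """<composition xmlns="http://schemas.openehr.org/v1">
  <archetype_details><rm_version>1.0.2</rm_version></archetype_details>
</composition>"""
WITHOUT_VERSION = '<composition xmlns="http://schemas.openehr.org/v1"/>'

print(schema_set_for(WITH_VERSION))
print(schema_set_for(WITHOUT_VERSION))
```

The second call returns None, which is the gap that making rm_version
required would close.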

 

Heath

 

From: openehr-technical-boun...@openehr.org
[mailto:openehr-technical-bounces at openehr.org] On Behalf Of Thomas Beale
Sent: Wednesday, 10 December 2008 10:23 PM
To: For openEHR technical discussions
Subject: {Disarmed} Re: Q on openEHR XML-schema versioning

 

Andrew Patterson wrote: 

ok - this approach more or less replicates the release id approach
already in use, but converts it to a URL.


 
Except, this is a change that occurs in all xml _instances_, not just
the schema files. So every reference model document in every
system in existence now has to handle two different schemas
and convert between them. We have to decide whether this is
what we want.. do the xml schemas aggressively track the
exact spec versions, or do we only increment xml schema
versions when necessary (and therefore should the xml
schemas have a separate version)
  


I don't think this is the case - each document should just indicate which
schema it is derived from. There will always be new and improved XML-schemas
- that is just the nature of a formalism that is inherently inefficient -
people will keep coming up with ways to improve it. Any document created
from a previous version of the schema will point to the earlier version.
Since all schemas in openEHR are designed to convert into the same reference
model, the data remain interoperable (unlike purely schema-based approaches
to health data). 

The main point it seems to me is what the schema should carry as its
namespace... is it (as today):



<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns="http://schemas.openehr.org/v1"
   targetNamespace="http://schemas.openehr.org/v1"
   elementFormDefault="qualified" version="v1.0.2"
   id="BaseTypes.xsd">

or more like:



<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns="http://schemas.openehr.org/v1"
   targetNamespace="http://schemas.openehr.org/v1.0.2"
   elementFormDefault="qualified"
   id="BaseTypes.xsd">

I don't know what weight the 'version' attribute carries in the xs:schema
tag - I don't understand why there appear to be two ways of indicating the
version in fact.
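
For reference, the two indicators can be read side by side. The
targetNamespace is what instance documents must match, whereas the version
attribute on xs:schema is informational and plays no part in validation. The
snippet below is only a sketch; the function name is my own and the sample
schema header is a cut-down, empty schema.

```python
import xml.etree.ElementTree as ET


def schema_identity(xsd_text):
    """Return (targetNamespace, version) from an xs:schema root element."""
    root = ET.fromstring(xsd_text)
    return root.get("targetNamespace"), root.get("version")


XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://schemas.openehr.org/v1"
    targetNamespace="http://schemas.openehr.org/v1"
    elementFormDefault="qualified" version="v1.0.2" id="BaseTypes.xsd"/>"""

namespace, version = schema_identity(XSD)
print(namespace, version)
```

Changing version alone leaves existing instances untouched; changing
targetNamespace means every instance has to declare the new namespace.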





 
The schema identifier is a URI - there is no requirement that it is
accessible at the same identifier, and in fact it seems like there
is a trend towards using other URN syntaxes rather than
URLs.
  


so sticking with the current style of URI is no problem.


- thomas beale


{Disarmed} Re: {Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-11 Thread Thomas Beale
An HTML attachment was scrubbed...
URL: 
http://lists.openehr.org/mailman/private/openehr-technical_lists.openehr.org/attachments/20081211/397539dc/attachment.html


{Disarmed} Re: Q on openEHR XML-schema versioning

2008-12-10 Thread Thomas Beale
An HTML attachment was scrubbed...
URL: 
http://lists.openehr.org/mailman/private/openehr-technical_lists.openehr.org/attachments/20081210/7e1ab55e/attachment.html