Re: [cellml-discussion] CellML 1.1 to draft CellML 1.2 normative specification mapping

Andrew Miller Wed, 20 Feb 2008 17:30:06 -0800

Alan Garny wrote:
> Hi Andrew,
>
>   
>> I have now defined a mapping between CellML 1.1 and one draft of CellML
>> 1.2. I have put this up at:
>> http://www.cellml.org/Members/miller/mapping-1-1-to-draft-1.2/mapping
>>
>> It currently has all the mappings from 1.1 to a particular 1.2 draft,
>> although it may be missing some of the reverse mappings.
>>     
>
> I haven't had time to go through everything in detail (i.e. check that
> everything that is in the original CellML 1.1 Specification is also in your
> draft), but as far as I can tell (i.e. as far as I can remember the original
> CellML specification) it all seems fine to me and it will no doubt be very
> useful to anyone interested in CellML (it would certainly have saved me a
> lot of time when working on COR's CellML API!). Anyway, here are some
> comments which I hope will prove useful to you in some way or another (sorry
> in advance for the number of them!):
>
> - Section 1: you may want to provide a link to RFC 2119
> (http://www.ietf.org/rfc/rfc2119.txt I believe).
>   
Good point. I have now drafted a version which adds RFC 2119 as a 
normative reference, and I will push it to my public git shortly.


> - Section 1: shouldn't we have a definition for a CellML 1.0 Namespace? I
> guess you may not have given one because your document is currently about
> CellML 1.1 and will then be targeting CellML 1.2?
>   
No, the specification mixes namespaces in accordance with the rules that:
  1) Any element which hasn't changed semantics from CellML 1.1 keeps 
the CellML 1.1 namespace in CellML 1.2.
  2) Any element which changes semantics goes in the CellML 1.2 namespace.

CellML 1.1 already changed the namespace for everything from the 
namespace used in CellML 1.0, so it makes no sense to mix CellML 1.0 
namespaces with CellML 1.1 or 1.2 namespaces. Because we never refer to 
CellML 1.0 namespaces, there is no point in providing a definition for them.

> - Section 1: would it be worth to provide a link where the reader could look
> up the Unicode characters given in some definitions (e.g. the Basic Latin
> alphabet; possible link: http://www.unicode.org/ and in particular
> http://www.unicode.org/charts/)?
>   
I don't think this is necessary - providing such a link would not serve 
the goals of a normative specification, although it would of course be 
a good thing for an informative specification.

> - Section 2.2.1.d: did you really mean "element information element" or
> "element information item"?
>   
I have drafted a revision to fix this - thanks for finding that.
> - Section 2.3: should its title be "Element information items" instead of
> "Character information items"?
>   
The section says...
"
      An element information item in the CellML namespace MUST NOT 
contain any
      character information items, except for character information 
items which
      consist entirely of whitespace characters. 
",
which is the constraint on the occurrence of character information items 
in CellML documents, so the title in my draft seems to make sense to me 
at least.
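
For illustration only, the constraint quoted above could be checked with
a small Python sketch like the following. It assumes the standard
library's xml.etree.ElementTree, checks only the CellML 1.1 namespace
(a draft 1.2 namespace URI would have to be added to the set), and the
function names are hypothetical rather than taken from any existing
CellML tool.

```python
import xml.etree.ElementTree as ET

# CellML 1.1 namespace URI; any draft 1.2 namespace would be added here.
CELLML_NAMESPACES = {"http://www.cellml.org/cellml/1.1#"}
XML_WHITESPACE = " \t\r\n"


def in_cellml_namespace(elem):
    """True if the element's tag is in one of the CellML namespaces."""
    tag = elem.tag
    return tag.startswith("{") and tag[1:].split("}", 1)[0] in CELLML_NAMESPACES


def non_whitespace_text(root):
    """Return (tag, text) pairs for CellML elements with non-whitespace characters."""
    problems = []
    for elem in root.iter():
        if not in_cellml_namespace(elem):
            continue
        # The character children of elem are elem.text plus each child's tail.
        chunks = [elem.text] + [child.tail for child in elem]
        for chunk in chunks:
            if chunk and chunk.strip(XML_WHITESPACE):
                problems.append((elem.tag, chunk.strip(XML_WHITESPACE)))
    return problems
```

An empty result means every CellML element in the tree contains only
whitespace character content.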

> - Section 2.4.3: once again (see comment on Section 1 above), there is no
> reference to CellML 1.0.
>   
This is because elements in the CellML 1.0 namespace do not comply with 
CellML 1.2. Software which is attempting to process a document in 
accordance with the CellML 1.2 specification should give an error if 
it encounters an element in the CellML 1.0 namespace. If such software 
wants to implement CellML 1.0 as well, it of course can, but compliance 
with the CellML 1.0 specification is not related to compliance with 
CellML 1.2 because, for a given CellML Infoset, the two specifications 
are mutually incompatible.
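
As a minimal sketch of the behaviour described above (not a reference
implementation), a processor could scan a parsed document and refuse any
element in the CellML 1.0 namespace. The error class and function name
below are hypothetical, and the sketch assumes Python's standard
xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

CELLML_1_0_NS = "http://www.cellml.org/cellml/1.0#"


class UnsupportedNamespaceError(ValueError):
    """Hypothetical error type for elements a CellML 1.2 processor rejects."""


def reject_cellml_1_0(root):
    """Raise if any element in the tree is in the CellML 1.0 namespace."""
    prefix = "{" + CELLML_1_0_NS + "}"
    for elem in root.iter():
        if elem.tag.startswith(prefix):
            raise UnsupportedNamespaceError(
                "element %r is in the CellML 1.0 namespace" % elem.tag
            )
```

A tool that also supports CellML 1.0 would dispatch to a separate 1.0
code path instead of raising.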

> - Section 2.4.4.a: I don't know whether this is the rendered version of your
> document that is responsible for it or not, but it might be worth
> emphasizing keywords such as local names (here, "RDF")?
>   
DocBook has an element called literal, which I think can be used for 
this purpose. I have changed the RDF and math cases in 
general-matters.xml, but not the remainder yet, because many of them are 
changed in various other drafts, and it might be easier to wait for some 
of them to be integrated in as the set of literals will change as a 
result of that process.

> - Section 2.5.4: I don't see the point of having "when the parent
> information item is not modified" and would therefore suggest to delete that
> bit. Indeed, it's a "SHOULD", so the CellML Processing Software may, for
> whatever reason, decide not to implement that 'rule'.
>   
The intention is that extension information on an element be preserved, 
as a matter of best practice, when a particular element in a document is 
not modified. If, on the other hand, that element is modified, then it 
would no longer make sense to try to preserve unrecognised extension 
information, because it relates to what an element used to look like. It 
is a SHOULD rule, but these define best practice, and so it is important 
we are precise about what best practice is.

> - Section 2.6.1: you may want to be consistent in the way you refer to
> attributes in general. Compare this section with Section 4.2.1 for example.
> - Section 3.3: are we missing 3.3.b and 3.3.c (in
> http://www.cellml.org/Members/miller/mapping-1-1-to-draft-1.2/mapping/toplevel.xhtml
> at least)?
>   
You are right that I have used "attribute" as a short-hand for 
"attribute information item" in a number of places. Because we are 
talking about XML Infosets, the technically correct term is "attribute 
information item", but it would get a little unwieldy in places - would 
you be happy with defining attribute to mean attribute information item 
in the terminology section?

> - Sections 3.3.b and 3.4.b: shouldn't U+002D be introduced in Section 1?
>   
I don't think it needs to be - it is a literal Unicode character, and 
CellML makes normative reference to XML, which makes it clear that these 
are Unicode characters. Maybe an additional normative reference from 
CellML to Unicode is needed (the problem with this is that Unicode is an 
evolving standard, and we might not want to refer to a particular 
version, although XML does).

> - Section 3.4.c: shouldn't U+002E be introduced in Section 1?
> - Section 3.5.e: shouldn't 'e' and 'E' be referred to as U+0065 and U+0045,
> respectively? Again, shouldn't those be introduced in Section 1? I am not
> convinced and, as a result, I am not convinced about U+002D and U+002E
> either anymore.
>   
We do need to be careful that everything is unambiguous. There are lots 
of characters which people might potentially try to use for full stop 
and hyphen-minus (I have heard of models purporting to be CellML models 
with all sorts of Unicode characters in place of those characters, 
usually as a result of people copying physical constants from online 
sources or word processors), although saying Basic Latin may be enough 
to make it unambiguous. I think that Basic Latin 'e' and 'E' are 
definitely unambiguous, although perhaps there is still a case for 
consistency of notation, for example by saying 'e' (U+0065) or 'E' 
(U+0045).
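
To make the ambiguity concrete, here is a small Python sketch (an
illustration, not part of the draft) that lists any characters outside
the Basic Latin block in a numeric string, such as a MINUS SIGN (U+2212)
pasted in place of HYPHEN-MINUS (U+002D). The function name is
hypothetical.

```python
import unicodedata


def non_basic_latin(text):
    """List (index, code point, Unicode name) for characters above U+007F."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]


print(non_basic_latin("-1.5e3"))       # Basic Latin only, so nothing is reported
print(non_basic_latin("\u22121.5e3"))  # reports the MINUS SIGN at index 0
```
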
> - Section 3.2-3.5: shouldn't we have a rule similar to 3.1.c?
>   
We already have "SHALL consist entirely of European numeric characters".
> - Section 4.2.2: not critical/essential, but I would personally have the
> order 4.2.2.d first, followed by e, a, c and b, i.e. the order in which a
> model might be written if it was to be written from scratch.
>   
I think 4.2.2 is essential - without it section 2.4.1 would mean that 
model element information items could only contain whitespace, RDF and 
extension element information item children.

In terms of the ordering, I did think about ordering schemes for the 
elements, and the reason I went for an alphabetical ordering was to 
choose an arbitrary ordering of something which doesn't matter.

You suggest the ordering import, units, component, group, connection, 
although I know of people who prefer very different orderings. Perhaps 
keeping things like this alphabetical might be the safest approach from 
the point of view of producing an uncontroversial draft.

> - Section 5: for consistency, you might want to consider having "... local
> name component..." rather than "... local name equal to component..." (see
> 2.4.4.a). If you were to go ahead with it, then it should also be done in
> the rest of the document (i.e. Sections 6-17).
>   
I have now changed the 'equal to component' to 
<literal>component</literal> in my draft version.
> - Section 5.1.2: not critical/essential, but I would again personally have
> the order c, a and b.
>   
Or we could alphabetise this one properly - I guess that the MathML 
entry should be sorted as 'm', then units, then variable. I am now 
assuming that you mean the order is not critical, and not the section 
itself?

BTW, the text is supposed to mean that the order in the CellML Infoset 
is not important.
I don't think "... MAY contain zero or more specific information item 
children, each of which MUST be of one of the following types ..." could 
be interpreted as requiring that all of one type of children must appear 
before the first of the next type of child, but perhaps I am missing 
something.

> - Section 6.1.2.a: as for Section 2.4.4.a (see above), it might be good to
> emphasize the different values (here "in", "out" and "none") that a
> particular (here "public_interface") may have. Again, there are other places
> where this also applies (i.e. 6.1.2.b, 12.1.1, 16.1.2, 18.3.1.c, 18.3.4,
> 18.4.10, 18.6.2, 18.6.3.c.i-iv and maybe others that I might have missed!).
>   
Yes, I agree with going through and marking all literals as such, 
although I am not going to do this right now, because, especially in the 
case of 'in' and 'out', this is something which is changed completely in 
other draft versions out there.
> - Section 10.1.1: if only encapsulation grouping is to be supported (dixit
> your mapping document), we might then allow one and only one
> relationship_ref element (rather than one or more of them).
>   
I actually intend to put forward a draft in which relationship_ref is 
deleted altogether, and group is replaced with encapsulation.
> - Sections 11 & 12: shouldn't those sections be swapped, so that
> relationship_ref elements are discussed before component_ref elements (i.e.
> the order in which they are introduced in Section 10.1)?
>   
The rationale for the current ordering was to make it alphabetical, but 
you are right, whatever we do it would look better if they were 
consistent with 10.1.
> - Section 12.1.1: you make a reference to the fact that the relationship
> attribute "MUST either take the value encapsulation or containment".
> However, in your mapping document, you mention that "containment [is] not in
> this draft version [and that] this needs further discussion".
>   
Good catch, I have updated the mapping document to clarify.
> - Section 13.1.2: the formatting is not consistent with say 10.1. One might
> expect a list, as well as a reference to Sections 14 and 15.
>   
Fixed in my draft version.
> - Section 16.1.1: lack of consistency. You have "... the value of the name
> attribute MUST NOT be identical to the name attribute of any other units
> element child of that model element..." while in Section 6.1.1.a you have
> "... The value of the name attribute MUST NOT be identical to the name
> attribute on any sibling variable element." In other words, in one case you
> refer to the child of a parent element while in another you refer to sibling
> elements.
>   
I think that sibling elements are semantically the same thing as the 
other child elements of the parent element, so can be used 
interchangeably depending on which one is the least unwieldy in the context.

In 6.1.1.a, variable element information items can only occur as a child 
of a component element information item, and so there is no need to 
mention the component element information item. It is therefore in the 
interest of brevity to not mention the component and use the word sibling.

In 16.1.1, we have to reference the parent element already, because the 
exact scope in which unit names must be identical depends on the parent. 
Because we already have made reference to the parent element, it seems 
to make sense to discuss the other children of that parent (although 
perhaps there is still a wording that would be just as readable based on 
the concept of sibling elements).

> - Section 16.1.3: for consistency, you may want to rephrase "A units element
> MAY contain one or more unit element children" to "A units element MAY
> contain one or more unit elements" (see Section 10.1.1 for example)?
>   
It looks like 10.1.1 is the only case which says elements instead of 
element children. I think it is slightly clearer to say contain ... 
children.

> - Section 18.1.5: typo: "It is noted..." and not "Its is noted..."
>   
Thanks, fixed that.
> - Section 18.2.2: typo: should it read "In all other cases, [a variable
> reference] SHALL consist of..."?
>   
I have now fixed this in my draft too.
> - Section 18.3.1.c: to use quotes might be the way to emphasize things (see
> Sections 2.4.4.a and 6.1.2.a).
>   
I have now changed my draft to use <literal> here.
> - Section 18.3.1.c: do we really need this condition considering that you
> mention in your mapping document that "name [is] not defined for now..."?
>   
"name" is one of the loose ends which has deliberately not been tidied 
up - I plan to resolve this as part of my proposal to replace group with 
an encapsulation-only element.
> - Section 18.3.4: as for Section 18.3.1.c, maybe the reference to the name
> attribute should be dropped?
>   
Yes, when this is tidied up properly.

> - Section 18.3.4-9: should we refer to encapsulation considering that,
> again, you mention in your mapping document that "... there is a proposal to
> only support encapsulation grouping"?
>   
In the version of the specification describing the proposal, we 
certainly should tidy this up.

> - Section 18.4.5: I am not quite clear what you mean by "without regard to
> whether the variable_1 attribute of one map_variables element references the
> variable referenced by the variable_1 or variable_2 attribute of the other".
> Could you please clarify?
>   
The text is trying to say that the rule about not duplicating 
connections applies even if the component_1 / variable_1 and component_2 
/ variable_2 pairs are reversed in the duplicating connection. I 
personally don't find this unclear, but that may be because I wrote it, 
so perhaps you would be better placed to suggest an alternative formal 
definition of the rule?
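
One way to picture the rule as described here (a sketch based only on
this explanation, since the text of 18.4.5 is not reproduced in this
thread) is to treat each mapping as an unordered pair of (component,
variable) endpoints, so that a reversed mapping compares equal to the
original. The function names are hypothetical.

```python
def connection_key(component_1, variable_1, component_2, variable_2):
    """Order-independent key for one variable mapping."""
    return frozenset([(component_1, variable_1), (component_2, variable_2)])


def duplicate_mappings(mappings):
    """Return mappings that repeat an earlier one, in either orientation."""
    seen = set()
    duplicates = []
    for mapping in mappings:
        key = connection_key(*mapping)
        if key in seen:
            duplicates.append(mapping)
        seen.add(key)
    return duplicates


mappings = [
    ("A", "x", "B", "y"),
    ("B", "y", "A", "x"),  # same connection with the two ends swapped
    ("A", "x", "C", "z"),
]
print(duplicate_mappings(mappings))
```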

> - Section 18.4.8: is there really a need for this section (it's analogous to
> Section 18.4.7)? I guess it might be there just for completeness, which
> cannot harm.
>   
Although it seems repetitive, it is essential to the specification 
because A refers to the variable referenced from component_1 / 
variable_1 and B refers to the variable referenced from component_2 / 
variable_2. Because of the different attribute names, the two rules are 
not related by symmetry, and both are essential to specifying CellML fully.

> - Section 18.4.14: typo: "... there MUST be exactly..." and not "... there
> MUST be at exactly..."
>   
Thanks, fixed this.
> - Section 18.5.2: I find that rule confusing and (mis?)understand it as
> being that all of the units scoping rules must apply to a units reference,
> which can obviously not always be the case (Section 18.5.4.a won't apply
> most of the time for instance), not to mention that this contradicts Section
> 18.5.3 which implies that 1 to 4 rules may apply...
>   
I agree, you are correct that the wording allows that interpretation, 
which is not the intended interpretation.

The intended interpretation was (not (there-exists units-reference u: 
(not (there-exists scoping-rule r: (r applies to u))))) rather than (not 
(there-exists units-reference u: (there-exists scoping-rule r: (not (r 
applies to u))))).

It is actually quite hard to express the intended interpretation 
unambiguously in English, but it is logically equivalent to:
(forall units-reference u: (not (not (there-exists scoping-rule r: (r 
applies to u))))) <=>  (forall units-reference u: (there-exists 
scoping-rule r: (r applies to u)))  which could be written as:
"Every units reference in a CellML Infoset MUST have at least one 
scoping rule which is applicable to it". This seems unwieldy too, so we 
could instead go for the double negative form above:
"Every units reference in a CellML Infoset MUST NOT be defined so that 
there are no scoping rules applicable to it."

Alternatively, perhaps we want to formulate it from (not (there-exists 
units-reference u: (forall scoping-rule r: (not (r applies to u))))), 
which would be:
 "A CellML Infoset MUST NOT contain a units reference to which all 
scoping rules are inapplicable."

I think that this last form might be the clearest, so I have put it in 
my draft.
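
The difference between the two readings can be sketched in Python, with
each scoping rule modelled as a predicate on a units reference. The
rules and references below are made-up placeholders for illustration,
not the actual rules of 18.5.4.

```python
def intended_reading(units_references, scoping_rules):
    """forall u: exists r: r applies to u."""
    return all(any(rule(u) for rule in scoping_rules) for u in units_references)


def unintended_reading(units_references, scoping_rules):
    """forall u: forall r: r applies to u (what the old wording allowed)."""
    return all(all(rule(u) for rule in scoping_rules) for u in units_references)


# Hypothetical stand-ins for two scoping rules.
rules = [
    lambda u: u.startswith("component:"),
    lambda u: u.startswith("model:"),
]
refs = ["component:mV", "model:ms"]

print(intended_reading(refs, rules))    # each reference matches some rule
print(unintended_reading(refs, rules))  # no reference matches every rule
```

Under the intended reading the example is valid; under the unintended
reading it would be rejected, which is the problem with the old wording.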

> - Section 18.5.4.b: I find the phrasing cumbersome. Why not stick to
> something similar to Section 18.5.4.a, i.e. "where a units reference appears
> in an information item which is descended from the model element, and there
> is a units element child of that model element with a name attribute
> identical to the units reference, then the units reference SHALL refer to
> that units element"?
>   
Thanks, that sounds much clearer. I have updated my draft with this.

> - Table 1: for Celsius, should the offset be -273.15? I guess it all depends
> on how one interpret the offset...
>   
The positive sign on the offset is consistent with the definitions 
later on under interpretation of units, and is consistent with the 
definition of the multiplier (e.g. the multiplier on gram is 10^-3, 
because given a number in grams, you multiply it by 10^-3 to get the 
number in SI base units, kilograms, and likewise, given a number in 
degrees Celsius, you add 273.15 to get the number in SI base units, Kelvin).
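
As a rough sketch of this convention (the function name and the order
of scaling and shifting are assumptions made for illustration; the
draft's own formulae are the authority), the multiplier and offset both
take a value in the newly defined units to the units it is defined
against:

```python
def to_referenced_units(value, multiplier=1.0, offset=0.0):
    """Convert a value in the newly defined units to the referenced units.

    Assumes the scale is applied before the offset; only one of the two
    is non-trivial in each call below, so the order does not matter here.
    """
    return value * multiplier + offset


# gram is defined against kilogram with multiplier 10^-3.
kilograms = to_referenced_units(500.0, multiplier=1e-3)

# Celsius is defined against kelvin with offset +273.15.
kelvin = to_referenced_units(25.0, offset=273.15)

print(kilograms, kelvin)
```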


<units name="fahrenheit">
  <unit multiplier="1.8" units="celsius" offset="32.0" />
</units>

CellML 1.0 and 1.1 are really quite messy and contradictory with regards 
to units, and the Fahrenheit / Celsius example uses multiplier in a way 
which contradicts the formulae and some of the other examples. It gives 
examples like:
<units name="gram">
  <unit multiplier="0.001" units="kilogram" />
</units>
  - to get a number in kilograms, multiply the number of grams by 0.001

<units name="litre">
  <unit multiplier="1000" prefix="centi" units="metre" exponent="3" />
</units>
  - to get a number in cubic centimetres, multiply the number of litres 
by 1000.

<units name="inch">
  <unit multiplier="2.54" prefix="centi" units="metre" />
</units>
  - to get a number in centimetres, you multiply the number of inches by 
2.54.

but...
<units name="fahrenheit">
  <unit multiplier="1.8" units="celsius" offset="32.0" />
</units>
  - the normal rule is that, to get a number in degrees Fahrenheit, you 
multiply the number in degrees Celsius by 1.8, and then add 32.0, which 
means that the multiplier here converts from the referenced units to the 
new units, the reverse of its meaning in the examples above, and it is 
hard to use this as a good example. The formulae (despite other errors), 
however, do add the offset to get the simple units, and since this seems 
the most consistent it is what I have implemented.
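
The mismatch can be seen numerically with a short Python sketch. This is
an illustration only: the helper name is hypothetical, and applying the
multiplier before the offset is an assumption rather than something
stated in this message.

```python
def apply_unit(value, multiplier=1.0, offset=0.0):
    """Scale, then shift. The order of the two steps is an assumption."""
    return value * multiplier + offset


# Attribute values from the CellML 1.1 Fahrenheit example quoted above.
FAHRENHEIT = {"multiplier": 1.8, "offset": 32.0}

# Reading A (gram, litre, inch): feed in a value in the new units and
# expect the value in the referenced units (here celsius) to come out.
reading_a = apply_unit(212.0, **FAHRENHEIT)

# Reading B (what the Fahrenheit example needs): feed in a value in the
# referenced units (celsius) and expect the new units (fahrenheit).
reading_b = apply_unit(100.0, **FAHRENHEIT)

print(reading_a)  # far from 100, so reading A does not fit this example
print(reading_b)  # approximately 212, the boiling point in Fahrenheit
```

Water boils at 212 degrees Fahrenheit, or 100 degrees Celsius, so only
reading B gives a sensible answer for these attribute values, while the
gram, litre and inch examples all follow reading A.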

> - Section 18.6.3.a: do you really mean a "prefix 1" and not a "multiplier
> 1"?
>   
Thanks, I have fixed this in my version of the draft.
> - Sections 18.6.3.c.i & 18.6.3.c.iv: inconsistency when it comes to
> referring to 0/zero. In the former case, "zero" is used while "0.0" is in
> the latter.
>   
Fixed.
> - Sections 18.6.3.c.ii-iv: do we really need the decimal for "1.0", "10.0"
> and "0.0"? If so, then we may as well be consistent and a decimal in Section
> 18.6.3.a.
>   
I think that the 1.0 form might be good style for a real number, 
although I don't really strongly oppose using "1" or "10" either. For 
now, I have made them all use the 1.0 type of form.

> - Sections 18.6.3.c.v-vii: I haven't checked this in detail, but this looks
> 'too simple' to me. Has this been checked against Jonathan Cooper's paper
> (DOI: 10.1002/spe.828)?
>   
The goals of this are quite different from Jonathan Cooper's paper, so I 
am not sure how one would go about checking against that paper.

The draft does not attempt to describe how to check mathematics against 
CellML specifications - this is something which I consider to be outside 
the realm of what a normative specification should define. It merely 
ensures that there is enough information to perform units conversions on 
interfaces, which is mandatory for all implementations, and I believe 
that the interpretation of units section contains enough information to 
carry this out.

> - Section 18.8.1: is it really necessary?
>   
18.8 needs a lot of work - I am hoping that a better way to write it 
will become apparent in the process of drafting extensions to units. 
18.8.1 really just sets the scope of what needs to be written, so it 
probably shouldn't stay in the final version.
> - Sections 18.8.4 & 18.8.5: analogous rules, so you may want to make the end
> of the rules consistent.
>   
Yes, although I think it is more important to first decide what the end 
is - we need a better definition of 'the conditions that the initial 
value holds', at which point we can ensure that both ends are identical.

> - Sections 18.9.2.a-c: maybe I should think about this a bit more, but don't
> those sections mean that all component elements are "pertinent component
> elements"?
>   
No, there can be components which are in an imported CellML Infoset 
(i.e. not in the top-level CellML Infoset), are not imported into the 
top-level model, and are not encapsulated underneath anything imported 
into the top-level model. These non-pertinent components, which are in 
imported CellML Infosets but which were not themselves imported, are not 
part of the model, and by the following section, the mathematics in 
these components is not included in the mathematical model.
> - Miscellaneous: would it be worth referring to various parts of Section 18
> wherever suitable in the document (Sections 18.5-8 are referenced, but not
> the others)?
>   
Perhaps, although the interpretation section (18 in the draft you refer 
to) differs quite markedly in structure from anything in CellML 1.1 so 
it is not always clear what should reference what.
> - Miscellaneous: you use an hyphen in "top-level" while there are other
> cases where an hyphen would also be useful (e.g. "built-in").
>   
I have now changed all occurrences of "built in" in my draft to "built-in".

Best regards,
Andrew

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion
