My vote goes to your current version.

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:cellml-discussion-[EMAIL PROTECTED] On Behalf Of Andrew Miller
> Sent: Thursday, 22 November 2007 9:24 a.m.
> To: For those interested in contributing to the development of CellML.
> Subject: [cellml-discussion] Question on community's preferred way to
> resolve CellML 1.1 identifier contradiction
>
> Hi all,
>
> I have been working on writing up a purely normative, unambiguous draft
> of the CellML specification to facilitate discussions of how to improve
> CellML in the future. As part of this, I have been rewriting most of the
> text of the specification to follow good practices for normative
> specifications.
>
> One thing I have noticed during this process is that CellML's current
> text defining the format for CellML identifiers contradicts itself:
>
> "
> > A valid CellML identifier must consist of only letters, digits and
> > underscores, must contain at least one letter, and must not begin with
> > a digit. This can be written using Extended Backus-Naur Form (EBNF)
> > notation as follows:
> >
> > letter ::= 'a'...'z','A'...'Z'
> > digit ::= '0'...'9'
> > identifier ::= ('_')* ( letter ) ( letter | '_' | digit )*
> "
>
> The EBNF specification does not permit an identifier like _1foo because
> it does not contain a letter before an underscore, while the text of the
> specification does, because it contains only letters, digits, and
> underscores, contains at least one letter, and does not begin with a
> digit.
>
> One rule or the other will need to be decided for the next CellML
> specification.
>
> I have, for now, taken the rule in the text as being normative and have
> written it up. Note that I have not included an EBNF representation -
> this will belong in explanatory notes which annotate the normative
> specification.
>
> "
> Basic Latin alphabetic character
>     A Unicode character in the range U+0041 to U+005A or in the range
>     U+0061 to U+007A.
>
> European numeric character
>     A Unicode character in the range U+0030 to U+0039.
>
> Basic Latin alphanumeric character
>     A Unicode character which is either a Basic Latin alphabetic
>     character or a European numeric character.
>
> Basic Latin underscore
>     The Unicode character U+005F.
>
> The following data representation formats are defined for use in
> this specification:
>
> 1. CellML identifier:
>    1. SHALL be a sequence of Unicode characters.
>    2. SHALL NOT contain any characters except basic Latin
>       alphanumeric characters and basic Latin underscores.
>    3. SHALL contain one or more basic Latin alphabetic characters.
>    4. SHALL NOT begin with a European numeric character.
> "
>
> Please let me know if you have an opinion on whether we should instead
> base this off the validity rules specified in the EBNF form from CellML
> 1.1.
>
> Best regards,
> Andrew
>
> _______________________________________________
> cellml-discussion mailing list
> [email protected]
> http://www.cellml.org/mailman/listinfo/cellml-discussion
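For anyone who wants to see the difference between the two rules concretely, here is a rough Python sketch of both. It is only an illustration under my own assumptions: the function names are made up, the regular expression is my transcription of the quoted EBNF, and the second function follows the four SHALL / SHALL NOT clauses of the draft text as I read them. Neither is taken from any CellML tool.

```python
import re
import string

# Character classes from the draft definitions (Basic Latin only).
LETTERS = set(string.ascii_letters)   # U+0041-U+005A, U+0061-U+007A
DIGITS = set(string.digits)           # U+0030-U+0039
ALLOWED = LETTERS | DIGITS | {"_"}    # plus U+005F

# Transcription of: identifier ::= ('_')* ( letter ) ( letter | '_' | digit )*
EBNF_PATTERN = re.compile(r"_*[A-Za-z][A-Za-z0-9_]*")


def valid_by_ebnf(candidate: str) -> bool:
    """True if the whole string matches the CellML 1.1 EBNF production."""
    return EBNF_PATTERN.fullmatch(candidate) is not None


def valid_by_text(candidate: str) -> bool:
    """True if the string satisfies the prose rule (and the draft clauses)."""
    only_allowed = all(ch in ALLOWED for ch in candidate)
    has_letter = any(ch in LETTERS for ch in candidate)
    begins_with_digit = candidate[:1] in DIGITS
    return only_allowed and has_letter and not begins_with_digit


if __name__ == "__main__":
    for name in ["foo", "_foo", "foo_1", "_1foo", "1foo", "___", "_1"]:
        print(f"{name!r}: ebnf={valid_by_ebnf(name)} text={valid_by_text(name)}")
```

With these definitions the two checks disagree only on strings such as _1foo, where a digit follows the leading underscores before any letter appears: the EBNF requires the first non-underscore character to be a letter, while the text rule only forbids a digit in the very first position. Everything the EBNF accepts is also accepted by the text rule, so adopting the text rule widens the set of valid identifiers rather than changing it in both directions.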
