Hi, I am looking at refactoring the CCGS into several components, as discussed in an earlier e-mail. As part of this, I am looking at how I can separate out the language-specific parts of code generation. At this stage, I am focusing on how expressions get generated, rather than entire assignments. This will then be combined with code to generate the procedural steps required to evaluate a model. Writing a program to generate code for a new language will then be as simple as iterating through the procedural steps and writing out assignments of expressions into variables, in addition to supplying all the language-specific glue to the integrator.

I have defined a file format specification, called MAL (MathML-language mapping), designed to contain all the information needed to generate expressions for a specific programming language. I would welcome any feedback anyone may have on the specification. I would be particularly interested in hearing if you can think of some extension to the language which is needed to support generation for a certain language. The specification follows...

The MAL format is intended as a succinct but complete description of how to translate expressions from MathML into the syntax of another programming language. It is intended to be both simpler and more powerful (within the problem domain it is trying to address) than more generic approaches such as XSLT.

Format:

The format consists of a series of tags. Each tag has a series of alphanumeric characters (the tag name), followed by a colon and a space (": "), followed by a series of characters (the tag value). The tag is terminated by a carriage return or line-feed character, and the next tag starts at the first character which isn't a carriage return or line feed.

Where line-length formatting transforms are required (such as for FORTRAN 77), a post-processing stage must be used to achieve this. The reason for this design decision is that expressions alone do not determine line length.

The following tags are defined:

Name: opengroup
Value: A string which can be placed before another string to force that string to have the highest precedence.
Examples:
  opengroup: (
  Sets the open group string to be (, which is the open group character in languages like C.

Name: closegroup
Value: A string which can be appended after another string to force that string to have the highest precedence.
Examples:
  closegroup: )
  Sets the close group string to be ), which is the close group character in languages like C.

Name: The name of any MathML operator.
Value: A string describing the format.
This string shall start with a description of operator precedence in the target language, and then describe a pattern for generating the target language expression.

A precedence description is specified between #prec[ and ]. The following precedence descriptions can be used:

#prec[n(m)] where n and m are integers between 0 and 1000. Sets the outer precedence to n (this is a precedence score for the resulting expression), and the inner precedence to m (this is a precedence score below which operands must be if they are not to require opengroup / closegroup strings around them).

#prec[n] where n is an integer is a shorthand for #prec[n(n)].

#prec[H] is a shorthand for #prec[1000(0)].

In an operator description, character sequences which are not matched below are written directly to the output.

#expri references the recursive expansion (according to the rules in the MAL file) of the ith operand, where i is a positive integer. The highest i value present also acts as the number of operands which must be present in the MathML to avoid an error.

#exprs[text] expands to the concatenation of each consecutive operand after expansion according to the rules. The string text intervenes between operands, but is not added before the first operand or after the last.

#logbase expands to the expansion of the logbase element contents. This is only valid for log. If no logbase element is found, the string 10 will be inserted.

#degree expands to the expansion of the degree element contents. It is only valid for root. If no degree element is found, the string 2 will be inserted.

#bvarIndex expands to the text of the bvarIndex annotation (as retrieved by the AnnotationSet supplied to MaLaES) on the source of the bound variable referenced.

#uniquen (where n is an integer) expands to a globally unique integer. If #uniquen (for the same n) is used more than once in the same line, it refers to the same number.
However, a different number is generated each time a rule is processed.

#lookupDiffVariable (only valid on diff) finds the ci associated with the diff (differentiation of something other than a variable is not supported by this form, and will result in an error), and then finds the source variable associated with that ci. It then asks the supplied AnnotationSet for the degreeiname, where i is the degree of the diff.

#supplement causes all subsequent output to be put into the supplementary stream, instead of the main output stream.

Name: unary_minus
Value: unary_minus works just like the MathML operator elements described above. However, the MathML operator minus is only processed according to the minus rule if it has two children. If it has one child, it is processed according to the unary_minus rule. If it has any other number of children, an error is raised.

I have also created a complete example, describing how to generate C expressions:

opengroup: (
closegroup: )
abs: #prec[H]fabs(#expr1)
and: #prec[20]#exprs[&&]
arccos: #prec[H]acos(#expr1)
arccosh: #prec[H]acosh(#expr1)
arccot: #prec[1000(900)]atan(1.0/#expr1)
arccoth: #prec[1000(900)]atanh(1.0/#expr1)
arccsc: #prec[1000(900)]asin(1/#expr1)
arccsch: #prec[1000(900)]asinh(1/#expr1)
arcsec: #prec[1000(900)]acos(1/#expr1)
arcsech: #prec[1000(900)]acosh(1/#expr1)
arcsin: #prec[H]asin(#expr1)
arcsinh: #prec[H]asinh(#expr1)
arctan: #prec[H]atan(#expr1)
arctanh: #prec[H]atanh(#expr1)
ceiling: #prec[H]ceil(#expr1)
cos: #prec[H]cos(#expr1)
cosh: #prec[H]cosh(#expr1)
cot: #prec[900(0)]1.0/tan(#expr1)
coth: #prec[900(0)]1.0/tanh(#expr1)
csc: #prec[900(0)]1.0/sin(#expr1)
csch: #prec[900(0)]1.0/sinh(#expr1)
diff: #lookupDiffVariable
divide: #prec[900]#expr1/#expr2
eq: #prec[30]#exprs[==]
exp: #prec[H]exp(#expr1)
factorial: #prec[H]factorial(#expr1)
factorof: #prec[30(900)]#expr1 % #expr2 == 0
floor: #prec[H]floor(#expr1)
gcd: #prec[H]gcd_multi(#count, #exprs[, ])
geq: #prec[30]#exprs[>=]
gt: #prec[30]#exprs[>]
implies: #prec[10(950)] !#expr1 || #expr2
int: #prec[H]defint(func#unique1, BOUND, CONSTANTS, RATES, VARIABLES, #bvarIndex)#supplement double func#unique1(double* BOUND, double* CONSTANTS, double* RATES, double* VARIABLES) { return #expr1; }
lcm: #prec[H]lcm_multi(#count, #exprs[, ])
leq: #prec[30]#exprs[<=]
ln: #prec[H]log(#expr1)
log: #prec[H]arbitrary_log(#expr1, #logbase)
lt: #prec[30]#exprs[<]
max: #prec[H]multi_max(#count, #exprs[, ])
min: #prec[H]multi_min(#count, #exprs[, ])
minus: #prec[500]#expr1 - #expr2
neq: #prec[30]#expr1 != #expr2
not: #prec[950]!#expr1
or: #prec[10]#exprs[||]
plus: #prec[500]#exprs[+]
power: #prec[H]pow(#expr1, #expr2)
quotient: #prec[900(0)] (int)(#expr1) / (int)(#expr2)
rem: #prec[900(0)] (int)(#expr1) % (int)(#expr2)
root: #prec[1000(900)] pow(#expr1, 1.0 / #degree)
sec: #prec[900(0)]1.0 / cos(#expr1)
sech: #prec[900(0)]1.0 / cosh(#expr1)
sin: #prec[H] sin(#expr1)
sinh: #prec[H] sinh(#expr1)
tan: #prec[H] tan(#expr1)
tanh: #prec[H] tanh(#expr1)
times: #prec[900] #exprs[*]
unary_minus: #prec[950]-#expr1
xor: #prec[25(30)] (#expr1 != 0) ^ (#expr2 != 0)

Best regards,
Andrew

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion
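
For anyone who wants to experiment with the tag format and the #prec shorthands described in the specification, here is a rough Python sketch of a reader. It is not part of the specification. The names parse_mal and split_precedence are purely illustrative, accepting "_" in tag names is an assumption made so that unary_minus parses, and returning None for a value with no #prec prefix (as in the diff rule) is also an assumption, since the specification does not say what precedence applies there.

```python
import re

# Tag name, then ": ", then the tag value (rest of the line).
# Assumption: "_" is allowed in names so that unary_minus is accepted.
TAG_RE = re.compile(r"^([A-Za-z0-9_]+): (.*)$")

# Matches #prec[H], #prec[n] or #prec[n(m)] at the start of a tag value.
PREC_RE = re.compile(r"^#prec\[(?:(H)|(\d+)(?:\((\d+)\))?)\]")


def parse_mal(text):
    """Split MAL text into a {tag name: tag value} dictionary."""
    tags = {}
    # A tag ends at a CR or LF; runs of CR/LF between tags are skipped.
    for line in re.split(r"[\r\n]+", text):
        if not line:
            continue
        match = TAG_RE.match(line)
        if match is None:
            raise ValueError("not a MAL tag: %r" % line)
        tags[match.group(1)] = match.group(2)
    return tags


def split_precedence(value):
    """Return (outer, inner, pattern) for an operator tag value."""
    match = PREC_RE.match(value)
    if match is None:
        # Assumption: no #prec prefix means "precedence unspecified".
        return None, None, value
    if match.group(1) is not None:
        # #prec[H] is shorthand for #prec[1000(0)].
        outer, inner = 1000, 0
    else:
        outer = int(match.group(2))
        # #prec[n] is shorthand for #prec[n(n)].
        inner = int(match.group(3)) if match.group(3) is not None else outer
    return outer, inner, value[match.end():]
```

Feeding the C example through parse_mal and then calling split_precedence on each operator value gives the outer and inner scores plus the remaining pattern, which is the input a pattern expander would need. Expanding #expri, #exprs[text] and the other directives is deliberately left out of this sketch.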