Re: [cellml-discussion] Proposal: BCP for including external code in CellML models

Andrew Miller Sat, 17 Mar 2007 17:25:55 -0800

David Nickerson wrote:
>> ECMAScript is not practical for use in modelling, because it is an 
>> interpreted, non-typed language, which necessarily means that it cannot 
>> be compiled and will be slower than compiled code.
>>     
>
> But CellML is a language for the description and exchange of 
> mathematical models. It is not meant to be a one-off wonder describing 
> the most efficient and best performing method for executing numerical 
> computations.
>
> To turn a CellML model description into something useful for computation 
> that description has to be interpreted and compiled into some other 
> format suitable for the environment using it...
>
> Surely in the same manner, a standard description of procedural code 
> could then be interpreted by any number of applications in whatever 
> manner they feel best suits their environment?
>   
No, because CellML is restricted to expressions, which makes it much 
easier to work with; this restriction is what makes it declarative. You 
can perform a variety of manipulations on declarative expressions, but 
procedural code can basically only be run in the way it was written to 
run. For example, even working out whether procedural code will ever 
terminate (the Halting Problem) has been proved to be undecidable, that 
is, not Turing-computable, in the general case, and this is likely to be 
the case for other types of manipulation too.
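
A minimal Python sketch of the kind of manipulation meant here. The 
tuple encoding of expressions and the function names are assumptions 
made for illustration; this is not CellML or MathML. The same 
expression tree can be evaluated, inspected for the variables it uses, 
or differentiated symbolically, whereas an opaque procedure could only 
be called.

```python
# Expressions are nested tuples (a hypothetical encoding, not MathML):
#   ("const", value), ("var", name), ("add", left, right), ("mul", left, right)

def evaluate(expr, env):
    """Compute the numeric value of expr given variable values in env."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    left, right = evaluate(expr[1], env), evaluate(expr[2], env)
    if kind == "add":
        return left + right
    if kind == "mul":
        return left * right
    raise ValueError(f"unknown node kind: {kind}")

def free_variables(expr):
    """List the variables an expression depends on, without running it."""
    kind = expr[0]
    if kind == "const":
        return set()
    if kind == "var":
        return {expr[1]}
    return free_variables(expr[1]) | free_variables(expr[2])

def differentiate(expr, name):
    """Return a new expression tree: the derivative of expr w.r.t. name."""
    kind = expr[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "var":
        return ("const", 1.0 if expr[1] == name else 0.0)
    d_left = differentiate(expr[1], name)
    d_right = differentiate(expr[2], name)
    if kind == "add":
        return ("add", d_left, d_right)
    if kind == "mul":
        # Product rule: (uv)' = u'v + uv'
        return ("add", ("mul", d_left, expr[2]), ("mul", expr[1], d_right))
    raise ValueError(f"unknown node kind: {kind}")

# x*x + 3*y
expr = ("add",
        ("mul", ("var", "x"), ("var", "x")),
        ("mul", ("const", 3.0), ("var", "y")))
```

None of these three operations needs to know in advance which of the 
others will be applied; that is the flexibility that is lost once the 
model is only available as code to be run.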


Code can often be optimised and compiled, but the features of ECMAScript 
preclude many of the optimisations that a C compiler, for example, can make.

For example, objects can have arbitrary properties, and there is no way 
to tell at compile-time what set of properties an object will have, or 
whether a property is a simple property or a getter. While a C compiler 
might take a value from an offset into a structure, ECMAScript code 
would end up searching a dictionary of properties on an object. 
Therefore, ECMAScript is not a good language if you want to be able to 
interpret it in different ways (and for any Turing-complete language, 
the ways in which you can interpret it are severely limited).
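
As an analogy only: Python shares the two features described above 
(properties attached at run time, and getters that look like plain 
fields), so the point can be sketched in Python rather than ECMAScript. 
The class and function names are invented for illustration.

```python
class Plain:
    def __init__(self):
        self.value = 1.0      # stored in the instance's attribute dictionary

class Computed:
    @property
    def value(self):          # a getter: code runs on every access
        return 2.0

def read_value(obj):
    # Nothing at this call site says whether obj.value is a stored field
    # or a getter call, so it cannot be reduced to a fixed memory offset.
    return obj.value

p = Plain()
p.extra = "added at run time"   # arbitrary properties can be attached later
```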

Remember also that CellML models can be used to solve a range of different 
problem types (fitting, ODE time course, and so on), but one procedural 
code implementation might not be useful for all of them.

My BCP document is intended as a way to maintain as much of the model as 
possible in CellML, but simply leave the rest of the model unspecified. 
Given the amount of history and development of procedural languages, I 
don't think we can hope to 'standardise' anything more in a widely 
acceptable way when it comes to procedural languages.
>   
>> External code needs to be extensible, and hence outside the scope of the 
>> CellML specifications, for several reasons:
>> 1) Performance. Code may need to be written in a way which is specific 
>> to a particular platform in order to be able to perform well.
>>     
>
> same response as above.
>   
Sometimes human intervention is going to be required to save a model 
from unfeasibly poor performance. If we take an ideological 
approach and try to block this from happening, it will just result in 
CellML not being used at all. Instead, it is better to encourage people 
to use CellML features whenever possible, but allow external code when 
it is not possible.
>   
>> 2) Access to existing libraries. There are often extensive libraries and 
>> other software packages into which a model needs to be integrated. This 
>> could be in practically any language, and so it would be necessary to 
>> access the data structures of these libraries to have the model work. I 
>> believe that this is the case for much of the CMISS-CellML work (I don't 
>> really think that a proposal to re-write CMISS in ECMAScript would be 
>> very popular!).
>>     
>
> In every case of people using CMISS that I know of, the use of CellML is 
> to define model specific mathematical equations for integration into a 
> larger model.
In other words, the model consists of parts which can be expressed as 
mathematical equations, and parts that cannot be expressed in 
mathematical equations (in CMISS). You are proposing that the parts 
which cannot be expressed in mathematical equations be written in 
ECMAScript.
>  I'm not suggesting re-writing CMISS in ECMAScript - rather 
> you seem to be suggesting including CMISS in a CellML model?!?
>   
The question of which model is included in which is more an artificial 
distinction than anything more meaningful. However, there needs to be a 
mechanism for data flow from CMISS into the CellML models (otherwise, 
CMISS can only set initial conditions, it can't have any time dependent 
influence on the model).
> This would hold for most such cases of using existing libraries that I 
> can think of, with the exception of someone wanting to solve a 
> particular equation or set of equations in a model using a very specific 
> numerical method that their CellML simulation tool does not support.
>   
There are many other computations that are better done by procedural 
code than by systems of ODEs. Machine learning algorithm lookups are one 
example of this, and there are extensive libraries of these sorts of 
things available.
> Even if you take a step back and look at the larger picture of using 
> things like FieldML, CellML, MathModelML (or something), etc... to 
> describe something like an electrical propagation model in the heart, 
> the tool (eg, CMISS) pulls it all together and plugs fields and 
> variables together based on the model annotations. Otherwise you'll end 
> up with cell models that say things like "give me the current load at 
> this point in space by solving the bidomain model over this geometric 
> domain" - making the cell model description useless for any other 
> application. What you want instead is simply a variable in the model 
> which is the current load that has an interface of in. Your cell model 
> integrator doesn't care where this value comes from, it just knows that 
> when the tool calls for the cell model to be integrated that it will 
> provide some appropriate value.
>   
First, I note that if you are talking about using component-level 
interfaces for this, that is not a feasible approach. Below I include an 
e-mail I sent to Shane and Poul about this earlier:

"
Shane has proposed that as an alternative to using content MathML to 
reference external code, we could use components. However, this appears 
to be inconsistent with the way CellML works at the moment, so I don't 
think that it could form the basis for defining external functions.

The problem with the approach of defining external components is that 
the directionality of variable interfaces in CellML is too weak to 
define the actual directionality and order in which mathematics is 
evaluated.

This is a good thing, for two reasons:
1) CellML is inherently declarative, not procedural. This means that if 
you give an equation defining x in terms of a, b, and c, but due to the 
other components in the model, x, a, and b are known, and it becomes 
necessary to obtain c, it is perfectly valid for the CellML software to 
perform a Newton-Raphson solve (or algebraic manipulations, if it has 
the capability) to obtain c. However, if the directionality on 
components was strong, CellML processing software would be constrained 
to compute components in a certain way, which would in turn limit the 
flexibility of each component.

2) It is possible to have more than one mathematical equation in a 
single component, and in some cases these might be completely 
independent. For example, you might have, in one component:

 w = x + a
 y = z + b

and in another:

 z = w + c

With x the bound variable of integration, and a, b, and c being constant.

This might make sense, because components are generally used to 
represent entities in biology, rather than the actual directionality of 
mathematical equations. However, it means that you evaluate part of the 
first component, then part of the second, and then go back to part of 
the first component. This is something you couldn't do if each component 
was an external block.
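
The interleaving in point 2 can be sketched in Python as follows. The 
data layout and the names `evaluation_order` and `evaluate` are 
assumptions for illustration, not part of any CellML tool. The 
equations are ordered by their data dependencies, not by the component 
they belong to, and the resulting order visits the first component, 
then the second, then the first again.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (component, output variable, input variables, right-hand side)
equations = [
    ("first",  "w", ("x", "a"), lambda x, a: x + a),
    ("first",  "y", ("z", "b"), lambda z, b: z + b),
    ("second", "z", ("w", "c"), lambda w, c: w + c),
]

def evaluation_order(equations):
    """Order equations so each runs after the equations it depends on."""
    by_output = {eq[1]: eq for eq in equations}
    # Only dependencies on other computed variables constrain the order;
    # x, a, b and c are supplied from outside.
    graph = {out: {v for v in ins if v in by_output}
             for _, out, ins, _ in equations}
    return [by_output[name] for name in TopologicalSorter(graph).static_order()]

def evaluate(equations, known):
    """Evaluate every equation once, in dependency order."""
    values = dict(known)
    for _, out, ins, fn in evaluation_order(equations):
        values[out] = fn(*(values[v] for v in ins))
    return values

order = evaluation_order(equations)
result = evaluate(equations, {"x": 1.0, "a": 2.0, "b": 3.0, "c": 4.0})
```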

Given that we don't have a one equation per component system, it is also 
possible that you want to combine mathematics in MathML with the 
external code (perhaps to re-parameterise the function, or something 
like that).

Because of this, I am still convinced that defining external operators 
using MathML is a better approach than trying to overload the component 
system in CellML for a use other than what it was originally intended.
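
Point 1 can also be sketched in Python. The equation below is a 
hypothetical stand-in (it is not taken from any real model), and the 
derivative is approximated by finite differences, which is one choice 
among many. The same defining equation for x is solved for c once x, 
a, and b are known.

```python
def defining_equation(a, b, c):
    # Hypothetical model equation: x = a + b*c + c**3
    return a + b * c + c ** 3

def solve_for_c(x, a, b, guess=1.0, tol=1e-10, max_iter=50, h=1e-6):
    """Newton-Raphson on residual(c) = defining_equation(a, b, c) - x."""
    c = guess
    for _ in range(max_iter):
        residual = defining_equation(a, b, c) - x
        if abs(residual) < tol:
            return c
        # Central finite-difference approximation of d(residual)/dc
        slope = (defining_equation(a, b, c + h)
                 - defining_equation(a, b, c - h)) / (2 * h)
        if slope == 0.0:
            raise RuntimeError("zero slope; Newton-Raphson cannot proceed")
        c -= residual / slope
    raise RuntimeError("Newton-Raphson did not converge")

# Forward use: compute x from a, b, c. Inverse use: recover c from x, a, b.
x_known = defining_equation(1.0, 2.0, 1.5)
c_found = solve_for_c(x_known, 1.0, 2.0)
```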
"

Secondly, the "cell model" integrator does need to care where the values 
come from, because it is responsible for moving from one time point to 
the next, and to do this, it needs to know what values from the current 
time point are needed to compute which other values at the current time 
point. This is why I have defined an interface which, in a way that is 
natural for MathML, describes the inputs and outputs of the external code, 
which is essentially equivalent to what you are talking about above, 
except the inputs to the external code must be provided as well.
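
A minimal Python sketch of why the declared inputs matter to the 
integrator. Everything here is hypothetical: the `ExternalFunction` 
record, the forward-Euler stepper, the stimulus, and the rate equation 
are invented for illustration and are not the interface from the 
proposal. At each time point the stepper must read the declared inputs, 
call the external code, and only then advance the state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class ExternalFunction:
    inputs: Tuple[str, ...]      # model variables handed to the external code
    output: str                  # model variable the external code defines
    call: Callable[..., float]   # opaque implementation supplied by the tool

def euler_step(state: Dict[str, float], t: float, dt: float,
               external: ExternalFunction,
               rate: Callable[[Dict[str, float]], float]) -> Dict[str, float]:
    # Gather the values available at the current time point...
    values = dict(state, time=t)
    # ...evaluate the external code on exactly its declared inputs...
    values[external.output] = external.call(
        *(values[name] for name in external.inputs))
    # ...and use its output to move to the next time point.
    return {"V": state["V"] + dt * rate(values)}

# Hypothetical stimulus provided by an enclosing tool
stimulus = ExternalFunction(
    inputs=("time", "V"),
    output="I_ext",
    call=lambda time, V: 1.0 if time < 0.5 else 0.0,
)

def rate(values: Dict[str, float]) -> float:
    # Invented rate equation: dV/dt = I_ext - V
    return values["I_ext"] - values["V"]

state, t, dt = {"V": 0.0}, 0.0, 0.25
for _ in range(4):
    state = euler_step(state, t, dt, stimulus, rate)
    t += dt
```

Without the `inputs` declaration the stepper would not know which 
current values have to be ready before the external call can be made.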

>   
>> 3) Access to specialised hardware. A model could potentially even 
>> require that a function is evaluated by some sort of online experimental 
>> procedure (perhaps automated probing of a hardware model) for a given 
>> set of inputs.
>>     
>
> Again, this seems more like a case where you define a mathematical model 
> which given some input(s) produces some output(s). The controlling 
> software would take the mathematical model definition in CellML and 
> connect the appropriate inputs and outputs.
This is exactly why we need a way to describe inputs and outputs, which 
is what I describe in the proposal.
>  I would really need a 
> concrete example of why you would want to describe a mathematical model 
> in CellML which requires input from specialised hardware. Surely you 
> just define a variable that has an interface of in and annotate it such 
> that the controlling software can find it and plug in the appropriate 
> value(s)?
>   
>   
>> 4) Multiple standards, with different communities who favour them. It 
>> would not be practical to get everyone involved with CellML to agree on 
>> a certain procedural programming language (even deciding on Fortran vs 
>> C++ etc... has been a challenge at this institute, and will probably be 
>> impossible for the wider CellML community).
>>     
>
> As above, you are not performing computations using CellML directly - 
> you always turn the model description into something suitable for the 
> computational environment in which the model is being used. That's the 
> beauty of CellML - you can turn it into Fortran or C++, depending on 
> your personal preference!
>   
For CellML, it is irrelevant what language it is translated through, 
because it can't call external code anyway. But if we call external 
code, that external code can further call other external code. Also, 
CellML filled a new niche, while you seem to propose that we tell 
everyone which language to use, which is a contentious issue. Also note 
that you cannot turn ECMAScript into efficient C++ in general.
> CellML is all about being able to exchange a standard description of a 
> mathematical model between potentially very different software 
> environments. The whole idea is specifically not specifying the best way 
> to compute outputs from the model - which seems to be what you are 
> driving at....and the best way to compute outputs from a model is always 
> going to be dependent on the target computational environment.
>   
Which is why we keep the things that CellML can do well in CellML, while 
continuing to leave unspecified the things that CellML can't do well. That 
is why my proposal only provides details of the interface to the 
external code, and doesn't try to specify the external code itself.
>   
>> As an example, consider my PhD project, where I plan to put machine 
>> learning components into CellML models:
>> 1) Performance is likely to be important. If it is too slow, it might 
>> not be feasible to do at all.
>> 2) I plan to use existing libraries, in a range of different languages.
>> 3) I also have another (perhaps not as common) gain from specifying the 
>> external functions without describing their details: I need to run 
>> different code in 'training' and 'simulation' modes, and if I just wrote 
>> generic ECMAScript for the simulation case, there would be no simple way 
>> to deduce the training case. Because of this, it is probably good to 
>> keep the non-algebraic parts of the model completely separate, and leave 
>> it up to whoever implements the specific CellML processor.
>>     
>
> I'd probably need to see more detailed plans on exactly what you are 
> planning on doing before commenting on this. But from what I have seen, 
> whenever anyone has wanted to include procedural code directly in a 
> CellML model it has always turned out that they are approaching the 
> problem from the wrong direction.
>
> Just to re-iterate, CellML is all about exchanging *descriptions* of 
> mathematical models - not implementations of computational code.
>   
Which argues for specifying how to interface external procedural code, 
as in my original proposal, rather than specifying how to exchange the 
procedural code, as you have suggested.
>   
>> That said, I think we could have multiple levels of degeneracy away from 
>> standardised code, where you only go down to the next item if the 
>> current one is impossible:
>> 1) Pure CellML.
>>     
>
> definitely.
>
>   
>> 2) CellML with standardised Turing-complete code support.
>>     
>
> I can see why we should provide a mechanism for this, but have yet to 
> see an example where it would be useful (other than to get around a 
> particular tool's deficiencies).
>
>   
>> 3) CellML with external (non-standardised) code.
>>     
>
> I still haven't seen a reason why this would ever be required?
>
>
> David.
>
>   

Best regards,
Andrew

_______________________________________________
cellml-discussion mailing list
cellml-discussion@cellml.org
http://www.cellml.org/mailman/listinfo/cellml-discussion
