-------- Original Message --------
Subject:        Re: Questions about the CCGS, and 3 possible bugs
Date:   Wed, 25 Oct 2006 10:50:55 +1300
From:   Andrew Miller <[EMAIL PROTECTED]>
To:     Jonathan Cooper <[EMAIL PROTECTED]>
References:     <[EMAIL PROTECTED]>



Jonathan Cooper wrote:
> Hi Andrew,
>
> You may be aware that a group of people are writing a review article 
> on CellML and associated tools, for a special issue of Progress in 
> Biophysics and Molecular Biology.  I've been put in charge of the 
> section on code generators, so have been looking at the CCGS to see 
> what I can write about it.
>
> Firstly, I've found a few things that may be bugs.  The first is that
> whatever model I generate code for, the size of the RATES array is
> declared to be 1, despite there clearly being more than 1 rate
> variable (e.g. for the Beard 2005 model there are 19, but the header
> still has RATES[1]).
This turned out to be a bug in the Plone product (which provides a 
web interface to CCGS). I have made a fix for this; it just needs to be 
deployed onto the website (I will speak to our webmaster about getting 
this done).
>
> The second is that a couple of models give a UnicodeDecodeError with
> "'ascii' codec can't decode byte 0xc3 in position 8548: ordinal not in
> range(128)".  The Hodgkin-Huxley model in the repository is one such.
> Skimming through the CellML source, I can't spot any obvious
> occurrences of weird characters, so do you have any idea what might
> give rise to this?  It could be useful to work around such errors, in
> any case.
I tried this with the command-line version of CCGS, and it works okay. I 
don't think this is a CCGS bug, but rather an issue with the Plone 
product again (it is trying to convert from ASCII to UTF-16, when we 
should be doing UTF-8 to UTF-16, because the CCGS works with UTF-16 
strings). I will look into what can be done about this.
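
A byte 0xc3 is what you would expect at the start of a two-byte UTF-8 
sequence (for example U+00E9 is encoded as 0xC3 0xA9), which an ASCII 
decoder has to reject. As a rough sketch of the difference, and not the 
code used by CCGS or the Plone product, the following C covers only the 
Basic Multilingual Plane and does not reject overlong encodings; the 
function names are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Index of the first byte an ASCII-only decoder would reject,
   or -1 if the input is pure ASCII. */
long first_non_ascii(const unsigned char *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (in[i] >= 0x80)
            return (long)i;
    return -1;
}

/* Decode UTF-8 into UTF-16 code units. Illustrative only: handles
   1- to 3-byte sequences (BMP), so no surrogate pairs are produced.
   Returns the number of code units written, or -1 on malformed input,
   a 4-byte sequence, or insufficient output space. */
long utf8_to_utf16_bmp(const unsigned char *in, size_t n,
                       uint16_t *out, size_t cap)
{
    size_t i = 0, o = 0;
    while (i < n) {
        unsigned char b = in[i];
        uint32_t cp;
        size_t extra;                       /* continuation bytes expected */
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else return -1;
        if (extra > n - i - 1) return -1;   /* sequence is truncated */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = in[i + k];
            if ((c & 0xC0) != 0x80) return -1;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (o >= cap) return -1;
        out[o++] = (uint16_t)cp;
        i += extra + 1;
    }
    return (long)o;
}
```

Given the five bytes 'c', 'a', 'f', 0xC3, 0xA9, first_non_ascii reports 
index 3, while utf8_to_utf16_bmp produces four code units ending in 0x00E9.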

>
> The third is the handling of division.  As I understand it, all
> numbers in CellML models are implicitly double valued.  However, for
> the XML fragment
>   <apply><eq/>
>     <ci>V</ci>
>     <apply><divide/>
>       <cn cellml:units="volt">1</cn>
>       <cn cellml:units="dimensionless">2</cn>
>     </apply>
>   </apply>
> the CCGS generates
>   VARIABLES[0] = (1/2);
> which will perform an integer division (in C).
I have now fixed the CCGS to always put decimal points on values taken 
from cn elements, and likewise for unit conversion factors / offsets.
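
To make the difference concrete, here is a minimal C sketch (the function 
names are invented for illustration; only the expression (1/2) comes from 
the generated code quoted above):

```c
/* Both operands are int literals, so the quotient is truncated to 0
   before it is converted to double. */
double half_integer_literals(void)
{
    return (1/2);
}

/* With decimal points both operands are double literals, so the
   division is carried out in floating point. */
double half_double_literals(void)
{
    return (1.0/2.0);
}
```

The first function returns 0.0 and the second 0.5.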
>
>
> Now for some more general comments.
>
> Is the source code for the CCGS available anywhere?  (Is it open
> source?)
CCGS is tri-licensed under the GNU General Public License, Mozilla 
Public License, and GNU Lesser General Public License. It is essentially 
an optional extension module to the CellML API, and so is shipped with it.

The SVN repository is still not yet public (our IT group are working on 
it, and apparently making progress). However, snapshots of PCEnv and all 
dependencies are now being regularly put on FTP (PCEnv uses it, so its 
source code gets automatically released whenever a Linux binary is 
made). You can download the CellML API snapshots from 
ftp://ftp.bioeng.auckland.ac.nz/pub/physiome/cellml_api/snapshots/source 
(the snapshot process is automated but currently has to be started 
manually, so you can e-mail me and I will run the script).

When you get the CellML API, look at interfaces/CCGS.idl and if you 
want, the files in CCGS/sources.
> What language is it written in?  (The UnicodeDecodeError is
> Python, but I seem to recall the CCGS is in C.)
The generated code is C, but the generator is in C++, like the rest of 
the CellML API. The Python error you are seeing is from the CellML 
repository Plone product, which is in Python, but accesses the CellML 
repository across CORBA.

> Is there any documentation on the algorithms it uses?
Aside from the source code, and any comments in the code, no.

> My thinking for the structure of the code generation section is to
> start with some motivation - interpreting XML is slow, and people want
> to plug models into their existing simulation software. 
You should probably also mention that CellML is a declarative language, 
not a procedural one (I think this is often misunderstood by people new 
to CellML), and so it states what the relationships between the 
variables are, rather than a direct process for computing variables. 
Whether or not you are interpreting XML, generating 'source' code of 
some form, or directly generating machine code, at some point you need 
to translate from a declarative view into a procedural view, and this is 
the key role that these frameworks play.
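
As a toy illustration of that translation step (this is not the algorithm 
CCGS uses, and all names here are hypothetical), the sketch below takes 
equations in any order, each defining one variable in terms of others, and 
finds an order in which they can be evaluated one after another:

```c
#include <stddef.h>

#define MAX_DEPS 4

/* One equation "defines = f(deps...)"; variables are small integer ids. */
struct equation {
    int defines;
    int deps[MAX_DEPS];
    size_t ndeps;
};

/* Fill order[] with equation indices such that every equation comes
   after the equations defining its inputs. known[v] must be nonzero for
   variables available from the start and zero for all others; it is
   updated as variables become computable. Returns the number of
   equations placed; a result smaller than n means the remaining
   equations depend on each other or on undefined variables. */
size_t order_equations(const struct equation *eqs, size_t n,
                       int *known, size_t *order)
{
    size_t placed = 0;
    int progress = 1;
    while (progress) {
        progress = 0;
        for (size_t i = 0; i < n; i++) {
            if (known[eqs[i].defines])
                continue;                   /* already scheduled */
            int ready = 1;
            for (size_t d = 0; d < eqs[i].ndeps; d++)
                if (!known[eqs[i].deps[d]])
                    ready = 0;
            if (ready) {
                known[eqs[i].defines] = 1;
                order[placed++] = i;
                progress = 1;
            }
        }
    }
    return placed;
}
```

For the declarative statements a = b + c, b = 2*c, c = 3, given in that 
order, the sketch schedules c first, then b, then a.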
> Then I'll
> discuss common features of the various code generators - they all view
> a model as an ODE system, for example (in fact, I think they all only
> treat initial value problems).  This part will probably need to
> include some comments on how variables are classified (as
> constant/computed/rate/etc.).
The framework generalises to support initial value problems, but you 
could still use it to compute expressions in terms of initial conditions.
>
> Then I'll consider features of each tool, asking questions such as
>  Which languages (and simulation software) do the tools target?
Currently CCGS targets C (hence the name C Code Generation Service), 
although there are future plans to split the common parts out and write 
generation services for other languages.
>  What assumptions do they make about the input model?
There are a few assumptions the current code makes (although these may 
be relaxed in the future):
1) Equations involving differentials must have the differential by 
itself on one side of the equals sign.
2) Every variable is assumed to be real valued (no complex numbers, 
vectors, matrices, etc...).
3) Set logic and propositional calculus are not supported, aside from the 
logical operators (e.g. you can't define a summation or integral over a 
set). However, you can define complex logical expressions using and, or, 
and the other binary/unary logical operators as the condition of a 
piecewise equation.
>  Is there any flexibility in output format?
CCGS is an API, not a program targeting end-users, and the API provides 
general information about the code, as well as 'fragments' of code 
(which compute certain parts of the model). The fragments contain a 
block of equations, and have variables which may be renamed through the 
use of C preprocessor macros. However, specific programs built on top of 
the API, such as the Physiome Model Repository, do not currently provide 
much flexibility in the output.
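
To give a feel for the macro-based renaming, here is a minimal sketch. 
The fragment text, the array layout and the host array names are all 
invented for illustration and are not actual CCGS output; only the names 
VARIABLES and RATES appear in the generated code discussed above.

```c
/* Storage owned by the host program (hypothetical names). */
double host_vars[2];     /* [0] = rate constant k, [1] = quantity x */
double host_derivs[1];   /* [0] = dx/dt */

/* Map the names used inside the fragment onto the host's arrays. */
#define VARIABLES host_vars
#define RATES     host_derivs

void compute_rates(void)
{
    /* Begin fragment: text as a generator might return it (made up). */
    RATES[0] = -VARIABLES[0] * VARIABLES[1];
    /* End fragment. */
}

/* Stop the renaming from leaking into the rest of the host code. */
#undef VARIABLES
#undef RATES
```

The host program never needs to edit the fragment text itself; it only 
chooses what the macro names expand to before pasting the fragment in.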
>    Can they easily be modified to change the output format?
It is fairly simple to write a different program which uses the CCGS API 
to generate code in your desired format.
>  Are there any special points to note?
There have been several successful uses of CCGS:
1) The CellML Integration Service (CIS), which also comes with the 
CellML DOM API, uses the GNU Scientific Library and the CCGS to run 
simulations. It is used by PCEnv to allow people to run simulations.
2) David Nickerson has used the CCGS (API) to generate code suitable for 
use with SUNDIALS.
3) CCGS comes with a test command-line program, called CellML2C, which 
calls the CCGS API to generate C code.
4) CCGS is used by the Physiome Model Repository Plone product to add a 
'Procedural Code' tab to models in the CellML model repository. There is 
also a separate plone product, called CCGSPlone, which allows users to 
upload their own private CellML models over HTTP, and get procedural 
code back.

It is worth noting that although CCGS supports definite integrals and 
Newton-Raphson solves of equations which CCGS can't use directly, no one 
has yet used this functionality. This is planned for the near future, so 
may be out before your paper gets published. (mozCellML used to support 
these two features using its own code generator.)
>
> Finally I'll look at optimisation, and my own tools for this.
CCGS performs a minor optimisation, because equations are separated into 
those which need to be run once, and those which must be run after every 
time step, to avoid unnecessary recomputation. It relies on the C 
compiler to perform constant folding optimisations. It cannot optimise 
too aggressively, because it is designed to allow initial values and 
parameters (other than those set through cn elements) to be changed 
without having to recompile.
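
As a rough sketch of that split (the function names, array layout and the 
forward Euler loop are assumptions made for illustration, not what CCGS 
or the CIS actually emit or use):

```c
/* Hypothetical layout: VARIABLES[0] = k (parameter),
   VARIABLES[1] = tau (derived from k), VARIABLES[2] = x (state). */

/* Equations that only depend on parameters: run once per simulation. */
void compute_once(double *VARIABLES)
{
    VARIABLES[1] = 1.0 / VARIABLES[0];
}

/* Equations that depend on the state: run at every time step. */
void compute_each_step(const double *VARIABLES, double *RATES)
{
    RATES[0] = -VARIABLES[2] / VARIABLES[1];
}

/* Integrate dx/dt = -x/tau with forward Euler. k and x0 are ordinary
   run-time arguments, so changing them needs no recompilation. */
double integrate_euler(double k, double x0, double dt, int steps)
{
    double VARIABLES[3] = { k, 0.0, x0 };
    double RATES[1];

    compute_once(VARIABLES);
    for (int i = 0; i < steps; i++) {
        compute_each_step(VARIABLES, RATES);
        VARIABLES[2] += dt * RATES[0];
    }
    return VARIABLES[2];
}
```

Because k is read from the array at run time, the compiler cannot fold 
1.0 / k into a constant, but it is still only evaluated once rather than 
at every step.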

Best regards,
Andrew

PS: Do you mind if I also send this to the CellML Discussion list, as my 
answers are likely to be useful to other people as well?



_______________________________________________
cellml-discussion mailing list
[email protected]
http://www.cellml.org/mailman/listinfo/cellml-discussion
