Hi Randy,

Just to keep the discussion in one place, I include my earlier reply...

Randall Owen wrote:
> One topic that seemed to come up a lot during the recent CellML Workshop
> in Auckland was model validation, curation, and composition. This caused me
> to think about a couple of questions about which I would really appreciate
> the CellML community's thoughts:
>
> 1. What does validation mean to the CellML community?

From what I have seen, there are several types (?) of validation:

(1) Straight XML validation of a CellML document against the CellML XML DTD or XML schema. This is probably the best first check, but one which has never been done for all the models in the model repository.

(2) Checking that a CellML document meets all the requirements of the specification, including the checks that can't be enforced using standard XML validation. I'm not too sure what can and can't be done nowadays, but I think this is kind of where the RELAX NG schema etc. that Jonathan was working on fits in.

(3) Units (dimensional) correctness. The CellML specification states that compliant applications must handle unit conversions at the CellML connection level. There is no requirement for checking within the mathematics, but obviously a valid model should be thoroughly unit balanced.

So I think that's about what you can do in terms of validating a given CellML document. Then you start to look at validating the mathematical model encoding...

(4) Given a set of parameter values, boundary conditions, and integration parameters, do simulations using the model accurately reproduce some reference data set? How close do you need to be to the reference data to say that the model is valid? Hopefully the results wouldn't vary too much using any suitable integration method, but different solvers, different implementations of solvers, and different platforms will almost certainly result in slight numerical differences...

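To make the "how close is close enough" question in (4) a bit more concrete, here is a minimal Python sketch of a tolerance-based comparison between a simulated trace and a reference trace. It is only an illustration: the function name `reproduces_reference`, the sample values, and the tolerances are all made up for this example and are not part of CellML or of any existing tool. It also assumes both traces are sampled at the same points.

```python
import math


def reproduces_reference(simulated, reference, rel_tol=1e-3, abs_tol=1e-9):
    """Return True if every simulated sample is within tolerance of the reference.

    Hypothetical helper: both traces are assumed to be sampled at the same
    points, and the default tolerances are arbitrary illustrative choices.
    """
    if len(simulated) != len(reference):
        raise ValueError("traces must have the same number of samples")
    return all(
        math.isclose(s, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for s, r in zip(simulated, reference)
    )


# Made-up sample values standing in for a reference data set and a
# simulation result that differs slightly, as different solvers would.
reference_trace = [-85.0, -20.0, 30.0, -85.0]
simulated_trace = [-85.0002, -20.001, 30.0005, -84.9999]

loose_ok = reproduces_reference(simulated_trace, reference_trace, rel_tol=1e-3)
strict_ok = reproduces_reference(simulated_trace, reference_trace, rel_tol=1e-6)
```

With these made-up numbers the loose tolerance accepts the simulation and the strict one rejects it, which is really the point: whether a model counts as "valid" under (4) depends on a tolerance that someone has to choose and justify.
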
And that's pretty much enough to say that a CellML encoding of a model is valid, I think, in terms of encoding a published model and showing that it reproduces the same range of behavior as the publication of that model. There is still the issue of valid unpublished models...

Sitting on top of this are then all the domain-specific curation ideas that we discussed, which is really a broader issue than just CellML. So far we have really only been considering CellML models based on published models, so having the publication go through the review process is kinda doing this sort of thing.

> 2. What does decomposition and composition mean to the CellML community?
> 3. What does re-usability mean to the CellML community?

I think the big issue here is that the CellML community hasn't really considered these ideas in any detail. We worked hard to get the mechanisms in place in the XML language and API implementation to allow these features, but so far no one is really using them, and those that are using them are not really using them fully.

To me, decomposition means taking a CellML 1.0 model and separating out the components into a CellML 1.1 model hierarchy. It could be taken a step further by reducing the components down to containing just one equation each and then encapsulating these "sub-components" appropriately to provide the same interface as the original single component.

And then composition is taking all these small sub-models and assembling them into larger, more complex models, which may then be used further to build larger models, and so on...

As for re-usability, that, again in my view, implies that you have some CellML model that can be re-used :-) Following the ideas of my "best practices", this would be a model which contains just math, with no parameter values or initial conditions embedded in it, such that I can take that math and use it in the way I want without restriction.

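As a rough analogy for the "just math, no embedded parameter values" idea, here is a small Python sketch. It is not CellML and not the CellML API: the function names (`ohmic_current`, `compose`), the component names, and all the numbers are hypothetical, chosen only to show math being kept separate from parameters and small pieces being assembled into a larger model.

```python
def ohmic_current(v, conductance, reversal):
    """Math only: I = g * (V - E). No parameter values are embedded here."""
    return conductance * (v - reversal)


def compose(components):
    """Assemble named sub-models into a larger model that sums their outputs.

    `components` maps a hypothetical component name to a function of the
    form fn(v, **params); the parameter values are supplied by the caller.
    """
    def total_current(v, parameters):
        return sum(fn(v, **parameters[name]) for name, fn in components.items())
    return total_current


# The same piece of math is re-used twice under different names.
membrane = compose({"leak": ohmic_current, "potassium": ohmic_current})

# Parameter values live outside the math (illustrative numbers only).
parameters = {
    "leak": {"conductance": 0.5, "reversal": -60.0},
    "potassium": {"conductance": 2.0, "reversal": -90.0},
}

total = membrane(-80.0, parameters)
```

Because the values are passed in rather than baked in, the same `membrane` model can be evaluated with a completely different parameter set without touching the math, which is the kind of re-use I have in mind.
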
Practically, there is still use in being able to take models which do have parameters and initial conditions embedded in them and plug them into another model.

> Each of the above have very specific meanings within the software engineering
> and computing communities.

The other interesting aspect, which I have briefly talked to Steve McKeever about, is the principle of substitutability, something I'm still keen to get back to looking at. My interest comes mainly from cardiac electromechanical models, where you want to be able to plug different electrophysiology or mechanics models together at both the cellular and tissue spatial scales, and from the question of how you can define standard interfaces such that different models can easily be substituted... one day I'll get back to that...

David.
