On 6/22/07, Andrew Miller <[EMAIL PROTECTED]> wrote:
> Matt wrote:
> >> - Version/Variant
> >> It already clogged up the system. There is no proper revision control
> >> mechanism, what we have now is an ad-hoc emulated system.
> >>
> >
> > I don't think it has clogged the system I just think it has been
> > improperly used both by authors and by the user interface. This is no
> > fault of the authors, there is simply a specification for versioning
> > that is missing. The hope is that subversion applies well to this.
> >
> I think that the versioning system itself is the root of the problem,
> because it is simultaneously too complicated and too limited.
>
> In particular:
> Branching is inherently a hierarchical process with arbitrary depth, in
> the sense that branches can be made from branches to an arbitrary depth.
> However, the variant / version system does not really provide the proper
> tools to deal with this, because it is limited to two levels (variant
> and version) before its utility in tracking what is a derivative of what
> is exhausted.
>
> It is also inadequate because a new model might combine parts of other
> models, especially if it is a 1.1 model, and these parts need to be
> tracked individually.
>
> I think that the solution is to simplify down to a single global version
> number that is common across the repository or the model (like in
> Subversion), and then let either the CellML metadata, or perhaps the
> Subversion copy history, describe the way a model has been derived.
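
To make the proposal quoted above a bit more concrete, here is a minimal
sketch of a single repository-wide revision counter with derivation kept
as metadata rather than as variant/version levels. It is only an
illustration under stated assumptions: the names Repository, Revision,
commit, create_model and ancestry are hypothetical, nothing here is an
existing CellML repository or Subversion API, and models are plain
strings for brevity.

```python
from dataclasses import dataclass
import uuid


@dataclass(frozen=True)
class Revision:
    """One committed state of one model (hypothetical structure)."""
    number: int               # repository-wide revision number
    model_id: str             # opaque identifier, e.g. a UUID string
    content: str              # stand-in for the model document
    derived_from: tuple = ()  # (model_id, number) keys this revision builds on

    @property
    def key(self):
        return (self.model_id, self.number)


class Repository:
    """Hypothetical in-memory repository with one global revision counter."""

    def __init__(self):
        self._counter = 0
        self._by_key = {}

    def commit(self, model_id, content, derived_from=()):
        # Every commit, to any model, bumps the same counter.
        self._counter += 1
        rev = Revision(self._counter, model_id, content, tuple(derived_from))
        self._by_key[rev.key] = rev
        return rev

    def create_model(self, content, derived_from=()):
        # A new model gets a fresh identifier; its origin lives in metadata.
        return self.commit(str(uuid.uuid4()), content, derived_from)

    def ancestry(self, model_id, number):
        """Follow derived_from links to any depth and return ancestor keys."""
        seen = []
        stack = [(model_id, number)]
        while stack:
            key = stack.pop()
            if key in seen:
                continue
            seen.append(key)
            stack.extend(self._by_key[key].derived_from)
        return seen[1:]  # drop the starting revision itself


repo = Repository()
a1 = repo.create_model("model A")                              # revision 1
b1 = repo.create_model("model B")                              # revision 2
a2 = repo.commit(a1.model_id, "model A, fixed", [a1.key])      # revision 3
c1 = repo.create_model("A combined with B", [a2.key, b1.key])  # revision 4
print(c1.number, len(repo.ancestry(*c1.key)))
```

Because derivation is a list of links rather than a fixed pair of levels,
a model that combines parts of several others (the 1.1 case) and a branch
of a branch are both handled by the same mechanism.
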

Sure, so disregarding variants for now, there is nothing stopping this
being implemented with the current versioning/naming convention. There's
just no specification for proper use. However I think changesets (as
well as global versions) apply well to the notion of a workspace, but
I'm not certain about the common practice of trunk/branch roots as
applied to cellml - perhaps the best practice would be that every
workspace would be the trunk root.

>
> I see the following workflow as being both simpler and more general...
>
> John Doe creates a new model directory which has its primary URL at:
> http://www.cellml.org/models/id/0ff280ef-dce6-4a42-a275-c9a7d9699096/
>
> John now owns this model and is the only one who can change it. John
> also gets to decide the visibility of different revisions of the model.

Is this changing the model, or the model + metadata?

>
> John makes several revisions to the model (each of which bumps the
> global revision number). There is a URL by which each historic version
> can be referred to.
>
> John then publishes the model in a journal, referring to it by the
> primary URL (or perhaps a short-form if we want to offer authors the
> option of assigning one). After the paper is accepted by a peer-reviewed
> journal, John updates the metadata on the model. When he commits these
> changes, the repository sees this and creates a new alias, e.g. at:
> http://www.cellml.org/models/citation/doe_2007_1/
>
> John makes some further changes to his model post-publication and
> commits them. However, by some mechanism (perhaps by the change
> metadata?) the repository knows that this is a change which occurred
> post-publication by John.
>
> Mary notices that there was a discrepancy between the model and John's
> published paper (assuming that he didn't reference the CellML model in
> the paper).
> She creates a new primary URL containing a copy of John's
> published model, at:
> http://www.cellml.org/models/id/281ab697-4607-4fcf-a433-f3ec382fb445/
> She gets John to check this. When John agrees, she updates the metadata
> on her model to indicate that her version is a more correct version of
> John's paper. The repository then updates so that
> http://www.cellml.org/models/citation/doe_2007_1/ is a reference to
> John's fixed version.
>
> John merges in Mary's changes to
> http://www.cellml.org/models/id/0ff280ef-dce6-4a42-a275-c9a7d9699096/
> and continues working on more changes. He starts collaborating with
> Mary, so he grants her write access to
> http://www.cellml.org/models/id/0ff280ef-dce6-4a42-a275-c9a7d9699096/.
>
> Ming wants to create a derivative of John's paper, so he creates a copy
> of the revision referenced from
> http://www.cellml.org/models/citation/doe_2007_1/ at
> http://www.cellml.org/models/id/7a8996e1-8d05-4a29-a7d8-622d047804fc/
> and starts working on it (marking up the history in the model metadata).
>
> As you can see, instead of having a confusing mix of variants and
> versions (with versions of variants of versions of variants), having a
> single revision forces us to look at the metadata instead, which then is
> sufficiently general not to have the problems we have seen.

Yep, I reckon variants didn't work out at all and the metadata is a
better place for this information.

>
> >> - It's CellML Code, right?
> >> Why not put code in a real code management system, like Subversion?
> >>
> >
> > Subversion works well for filesystems of code and text data and to
> > some extent binary data that we don't really need to query the
> > contents of. If this applies well for CellML modelling, then
> > subversion is probably a good match.
> > Subversion will bring its own
> > complexities when we are dealing with applying security to file
> > objects,
> It depends whether or not we actually allow direct access to Subversion
> by untrusted users.
> A simple approach would be to make everyone go through the front-end
> (which might even implement enough methods to let Subversion check out
> from there anyway).

Yup, that is one way.

>
> > and security/publishing in general will get even more complex
> > if we are proxying remote repositories - which we talked about a few
> > weeks ago.
> >
> > Generally, I think the concept of cellml modelling being laid out in a
> > filesystem and subversion versioning concepts applied to it is good,
> > but untested. For instance, take a reasonably complex model of Andre's
> > and work out how it will look on the filesystem and what subversion
> > versioning would result in.
> >
> I think Andre already has a layout for his model (with relative URLs).
> Letting the author decide what it looks like is probably a good first step.
> > While in this thread, I don't believe metadata should be treated any
> > differently to model data. Adding special rules for versioning of some
> > data and not others is going to complicate the versioning process and
> > I can't see any compelling reason to do this.
> I agree (for metadata about the model at least. Permissions etc... are a
> special case of course).
> > Remember that the
> > subversion system is versioning file objects which will contain both
> > metadata and cellml model data. What is important is how and where
> > metadata is stored. Perhaps metadata should be separated into its own
> > document sitting next to the model in the filesystem.
> >
> Model is a confusing word because CellML 1.1 models can combine several
> models to make one mathematical model. There is a case for metadata /
> manifest about the mathematical model as well as metadata about each of
> the CellML models that make up the mathematical model.
> > My inclination is that an implementation using subversion plus some
> > subversion hooks will be ok, but we haven't worked out details or done
> > any proof of concept for this - which should be agnostic to cellml
> >
> This would have the benefit of supporting non-CellML models, although it
> means that we have to change the CellML models if we are going to
> include RDF/XML serialisations inside them.
>
> Perhaps a generic framework with some XML with embedded RDF specific
> parts slotted into it would be better.
>
> > and focussed on how to apply zope+cmf security and workflows to data
> > objects stored in subversion repositories.
> >
> If we are going to be doing a major re-write, now is the time to
> consider if we should be using Zope, or if we want to proxy this part of
> the site to some other technology (I think that the decision the first
> time was not discussed at CellML meetings at all, and has had a lot of
> unfortunate consequences, so I don't think it is completely out of the
> question to reconsider technologies. The fact that we are already using
> it probably carries some weight in the decision, but other factors might
> be enough to tip the balance in another direction).

Yep. It doesn't have to be Zope at all. It provides a reasonable
foundation though, like others in this space. Others on the block could
be Pylons, Ruby on Rails, Zope 3, maybe just apache+cgi (but that would
be a pretty big rewrite).

>
>
> >> - Zope has revision control
> >> Until someone packs the database.
> >>
> >
> > Perhaps you should look at http://plone.org/products/plone/roadmap/8
> > (which is now completed and merged into Plone 3). There are some other
> > add on products - some listed in
> > http://plone.org/products/by-category/versioning-staging
> >
> >
> >
> >> - Zope/Plone is also quite slow.
> >>
> >
> > Really? How so?
> >
> I think an interpreted language, even a byte-compiled one, will always
> be slow, and all the layers of abstraction from Zope and Plone probably
> make this worse. However, I'm not sure that it is the bottleneck for the
> majority of users given the recent thread about network speeds.

Yeah. I haven't yet seen a bottleneck identified that is attributable to
Zope/Plone.

>
>
> >> - Code we have now cannot get away from original design flaws. Might as
> >> well start from scratch.
> >>
> >
> > Refactoring may achieve the outcome better.
> >
> I agree that this will be better in general (throwing away everything is
> probably a bit drastic, I am sure that there are some parts of the code
> that are still usable). Of course, if we move off Python this might be
> the only option, so we should keep an open mind but be wary of the costs
> of doing so.

Any opinions on other environments you would consider? They should
probably go into the mix now.

>
> Best regards,
> Andrew
>
> _______________________________________________
> cellml-discussion mailing list
> [email protected]
> http://www.cellml.org/mailman/listinfo/cellml-discussion
