On Mon, 30 Jul 2001, Thomas Down wrote:

>
> I've just been taking a look at the bioperl-db schema, and
> it's certainly worth a good look by anyone who's interested
> in this project.
>
> I would say that it's quite strongly tied to BioPerl though,
> or at least the BioPerl way of looking at things.  We should
> look quite carefully at what the requirements are for persistence
> in BioJava.  For instance, a Java-centric schema could get
> away with tricks like serializing any datatypes it didn't
> explicitly understand (I'm thinking particularly of
> Annotation-bundle data here).  That sort of thing could probably
> be piggy-backed onto the BPDB schema as an `optional extra'
> without too much trouble.

There are similar things on the Bioperl side as well ;)

I am all for a series of DDL files of increasing complexity, so that
the schema can grow around a stable core, and I am not against "shove
it in" methods of raw language serialisation and/or hacky tag-value
sets (XML-style) for the more "just store this object" type problems.

Of course, Biojava and Bioperl won't interoperate on these data types,
but each project will be able to represent all its bells and whistles
while sharing a common core.

>
> A rather bigger problem is hierarchical features, which I'd
> say were quite important if we're aiming for `persistent
> BioJava' rather than a more general database system.  This
> definitely does mean a new schema.  And quite possibly
> stored procedures on the server (or something similar) to
> keep the performance good -- at least given my past experiences
> with hierarchical data in SQL.
>

Bioperl has a half-used hierarchical scheme for features which is now
in limbo due to our split locations. In other words, if Biojava wanted
to add hierarchy to the schema, I would be happy to see that happen
and could provide mappings to Bioperl.
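As a rough illustration of the hierarchy question, here is a minimal
Java sketch of one way nested features might sit in a flat table: each
row carries the id of its parent, and the tree is rebuilt in memory
from a single pass over the rows instead of one query per level. The
class, field and method names (FeatureRow, parentId, childrenByParent
and so on) are hypothetical and are not taken from the bioperl-db
schema, BioPerl or BioJava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FeatureHierarchySketch {

    /** One row of a hypothetical flat feature table; parentId is null for a top-level feature. */
    public static final class FeatureRow {
        public final int id;
        public final Integer parentId;
        public final String type;
        public final int start;
        public final int end;

        public FeatureRow(int id, Integer parentId, String type, int start, int end) {
            this.id = id;
            this.parentId = parentId;
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Groups child rows under their parent id in a single pass over the rows. */
    public static Map<Integer, List<FeatureRow>> childrenByParent(List<FeatureRow> rows) {
        Map<Integer, List<FeatureRow>> index = new LinkedHashMap<>();
        for (FeatureRow row : rows) {
            if (row.parentId != null) {
                index.computeIfAbsent(row.parentId, k -> new ArrayList<>()).add(row);
            }
        }
        return index;
    }

    /** Renders a row and everything beneath it as indented text, two spaces per level. */
    public static String render(FeatureRow row, Map<Integer, List<FeatureRow>> index, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(row.type).append(' ').append(row.start).append("..").append(row.end).append('\n');
        List<FeatureRow> children = index.getOrDefault(row.id, Collections.<FeatureRow>emptyList());
        for (FeatureRow child : children) {
            sb.append(render(child, index, depth + 1));
        }
        return sb.toString();
    }

    /** Made-up example data: a gene containing one transcript with two exons. */
    public static List<FeatureRow> sampleRows() {
        return Arrays.asList(
                new FeatureRow(1, null, "gene", 100, 900),
                new FeatureRow(2, 1, "transcript", 100, 900),
                new FeatureRow(3, 2, "exon", 100, 300),
                new FeatureRow(4, 2, "exon", 600, 900));
    }

    public static void main(String[] args) {
        List<FeatureRow> rows = sampleRows();
        Map<Integer, List<FeatureRow>> index = childrenByParent(rows);
        System.out.print(render(rows.get(0), index, 0));
    }
}
```

Fetching all rows for a sequence in one go and assembling the tree on
the client is only one option. It trades memory for fewer round trips,
and it is not a claim about how either project does or should store
features.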
(Cue rambling discussion about whether complex hierarchies of features
are a good thing or not, or whether they should be represented as
separate objects. Take, for example, "intron features", which one can
derive from exon features and therefore doesn't want to duplicate
inside the data storage but does want to expose programmatically.
Aaaaaah. The sweet smell of a complex design decision for us to chew
over.)

> Anyway, sorry for rambling on.  I think the point I'd like to
> make is that there are two slightly different problems here:
>
>   - A general, lightweight, database mechanism which can
>     be shared between different projects.  BPDB looks like
>     a reasonable schema for this sort of thing.
>

Sure.

>   - A system tuned for a particular object model, trying to
>     get as close to that model as possible.  This should give
>     `persistent objects' which behave extremely closely to,
>     for example, the normal in-memory BioJava Sequences.

I am going to stick my neck out and say that, with a minimum amount of
give and take, Bioperl and Biojava can map both their object models to
the same relational data model, and furthermore that this is a "good
thing" to keep the two projects from drifting.

I'm happy to be flexible here. After all, there is more than one way
to do it! Let's see how far we can get before we have to get the
boxing gloves out....

>
> It's worth being clear about which of these is being addressed
> before making too many commitments here.
>

Shall we see how far a common schema can take us? I won't force it if
it won't go, but it is worth making the effort to stay on the same
rough data model. I think it would benefit both communities.

ewan

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<[EMAIL PROTECTED]>.
-----------------------------------------------------------------