Hi David. I've cced this to the development list, so everyone can read about what you've been up to.
On Sat Jan 11 12:00:37 2014, David Roldán Martínez wrote:
> I've been working on this, taking a look at Stockholm and Embl
> processes but I cannot figure out which to do with the information,
> once I have loaded SNP file and GenBank information into my own
> classes hierarchy.

Hmm. Just a comment here: you really should avoid setting up a class hierarchy if you can - the parsing overhead from creating lots of objects is quite considerable. Jalview has a quite extensive annotation datamodel, which will work for non-hierarchical sequence features, but not for more complex compound/hierarchical features (http://issues.jalview.org/browse/JAL-1191). For the full gory details of this type of annotation, you need to read the documentation here:
http://www.insdc.org/files/feature_table.html#2

However, don't start hacking on this until we talk. There are some very good examples of how to implement complex/compound feature datamodels, and I'd prefer it if we first analyse those and work out which one fits Jalview's needs best.

> _SNP loading
> I've been able to set up Castor Maven plugin so that I can generate a
> Java library only customizing a pom.xml and to include a XSD (or set
> of XSD). In this way, I think we'll be able to widen Jalview data load
> from multiple sources quite easily. I can work on that (just tell me
> the XSD) but I really need to fully understand Jalview datamodel. An
> E-R diagram or similar will be useful in this sense. ;-)

Eek! We already have the castor source generation machinery bundled with Jalview. By using a maven plugin, you risk breaking compatibility with the bundled version of castor, which is NOT good. If you must use castor XSD->Java, then take a look at the 'castorbinding' task in build.xml - this already includes a set of XSDs that create Java bindings for the Jalview project and colourscheme files, which are critical for the Jalview desktop.

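To make the object-creation point concrete, here is a rough sketch of reading features from an XML file incrementally rather than mapping the whole document onto a class hierarchy. It is in Python purely for brevity (Java's SAX parsers support the same event-driven style), and the element and attribute names (`sequence`, `feature`, `type`) are made up for illustration; they are not taken from any Jalview schema or from your SNP files.

```python
import io
import xml.etree.ElementTree as ET


def count_features(xml_source, tag="feature"):
    """Tally features by type while reading the document incrementally.

    `tag` and the "type" attribute are hypothetical names used only
    for this illustration.
    """
    counts = {}
    # iterparse yields each element as soon as its closing tag is read,
    # so we never need a full object model of the document.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == tag:
            ftype = elem.get("type", "unknown")
            counts[ftype] = counts.get(ftype, 0) + 1
            # Release the element's attributes, text and children; the
            # emptied element itself stays attached to its parent.
            elem.clear()
    return counts


# Hypothetical sample input, invented for this example.
SAMPLE = b"""<sequence id="s1">
  <feature type="SNP" start="10" end="10"/>
  <feature type="SNP" start="42" end="42"/>
  <feature type="CDS" start="1" end="90"/>
</sequence>"""

print(count_features(io.BytesIO(SAMPLE)))
```

The only state kept here is a small dictionary of counts, however many features the file contains. A real importer would hand each feature to the target datamodel at the same point instead of counting it.
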
You should also bear in mind that currently, file formats dependent on classes autogenerated with castor will not be available in the applet, since XML parsing libraries are considered too heavyweight to ship to the browser. This is the most significant reason for not using XSD->Java object mapping, but there are other reasons for avoiding it: e.g. when working with large XML files, stream XML processing avoids the memory and object creation overhead incurred by creating an object representation of elements in the document.

Re understanding Jalview's datamodel: I know an ER diagram would help, but it will only get you so far, since you also need to think about how the data that you are trying to import into Jalview is structured (remember, the XML format may not necessarily correspond to the way that the data might be most usefully handled in Jalview).

> _GenBank
> _
> I have parsed the file to get sequences and features. In this version
> of the patch (not the one attached at JIRA) I think I can translate
> sequences from file to Jalview sequences (please, check) but I don't
> know what happens with file headers and features. How can I inject
> this into Jalview datamodel? Which is the correspondence between them?

We are going to talk through this on our next Google hangout.

> _Integration framework
>
> _
> I've been thinking about how to integrate Jalview with other tools and
> systems. At e-learning domain there are several interesting
> initiatives whose approximations are worth to be examined.
> Take a look at this two: JISC E-learning Framework
> (http://www.jisc.ac.uk/whatwedo/programmes/elearningframework.aspx)
> and OKI (http://en.wikipedia.org/wiki/Open_Knowledge_Initiative). Both
> of them are based on the concept of service and service interfaces but
> don't force to use any particular implementation. This offer better
> interoperability between platforms and this is a good change to make
> tools adoption to grow.
> I'm going to work on this idea with a
> colleague, trying to put this ideas in a paper to see if it's accepted
> at RCIS (http://www.rcis-conf.com/rcis2014/). If you are interested at
> participating with us, let me know and I'll him.
> If you think this is a good idea, probably we can discuss this in
> detail and even open the discussion to more people.

You are quite right in recognising that Jalview would benefit from being part of an information integration framework. In fact, Jalview already includes a couple. VAMSAS is a prototype data and application integration framework for bioinformatics data that I developed in collaboration with some other groups. DAS is a much more widespread data integration framework based on XML/REST services that has been around since 2001 (http://www.biodas.org/wiki/Main_Page). It was developed for sequences and sequence features on genomes, but has been adapted to work with other types of data.

As you might imagine, I'm interested in integration frameworks, and would be interested in discussing ideas with your colleague, though I should say now that I already have enough deadlines for this year!

> _i18n management
> _
> I was wondering if it is possible to create to separate components at
> JIRA, one for bugs/FR/etc. related with i18n and other one for
> translations. In this way, if the issue is, for example, a mistake in
> property bundle or a new language contribution, we'll marked them as
> Translation related. On the contrary, if the issue is something that
> doesn't work when you switch the language from English to French,
> we'll specify it as Internationalization related. What do you think
> about that?

Done. The Translations component is here:
http://issues.jalview.org/browse/JAL/component/10780

Jim.
