As I understand it, the query sequence is always first in the alignment (allhits, query.JAL and the template maps). Is that correct?
If so, we could add logic to Jalview that defaults to adding annotation to the first sequence in the alignment if 'QUERY' is given as the name but no sequence in the alignment has that name. Alternatively, we have precedent for mapping annotation to sequences by position in the alignment. Much more fragile, but equally valid in this case.

J.

Sent from my Cyanogen phone

On Apr 4, 2017 1:08 PM, "Charles Ofoegbu (Staff)" <[email protected]> wrote:

Hi Lawrence,

Thank you for the reply. I think I am more concerned with having a consistent name for the query sequence across the essential artifacts processed by Jalview (allhits.fasta, pair-wise mapping fasta files and Jalview annotation file). I don't mind if it is named "QUERY" or anything else - as long as it is consistent across the board.

Currently, if a user submits a job to Phyre2 without providing a job description, the query sequence is named 'undefined' in the generated result artifacts (input.fasta, allhits.fasta, pair-wise mapping fasta files, etc.). On the other hand, if a job description is provided, the query sequence is named using the provided description in the generated artifacts. Nonetheless, it is always named “QUERY” in the generated Jalview annotation file.

Do you foresee any challenge in naming the query sequence consistently across the artifacts?

Regards,
Charles Ofoegbu

Tochukwu Charles
Jalview Visual Analytics Developer/Scientist
The Barton Group
Division of Computational Biology
School of Life Sciences
University of Dundee, Dundee, Scotland, UK.
Skype: cofoegbu
www.jalview.org<http://www.jalview.org/>
www.compbio.dundee.ac.uk<http://www.compbio.dundee.ac.uk/>

On 4 Apr 2017, at 09:35, Lawrence Kelley <[email protected]> wrote:

Hi Charles et al.,

This looks fine apart from one issue:

"Replace the name “QUERY” in STRUCTMODEL annotations with the actual name of the Query Sequence"

Input sequences come from users. So firstly, I don't force them to name their sequence.
Secondly, even if I did, it would be a nightmare. So the sequences essentially don't have true names, hence the use of QUERY. Does this pose a problem? I can always generate an MD5 semi-unique key etc. Thoughts?

Lawrence

On Thu, Mar 30, 2017 at 11:47 AM, Charles Ofoegbu (Staff) <[email protected]> wrote:

Hi Lawrence,

Hope this email finds you well. We have made some progress with the Jalview-Phyre2 integration. As you are aware, the following use cases were identified in the Jalview-Phyre2 collaboration document:

1. Phyre2 users wanting to use Jalview
2. Jalview users wanting to use Phyre2

I have pushed an initial prototype for the first use case to Jalview's Phyre2 spike branch. A brief summary of the features enabled is outlined further below. There are some other features that would be best demonstrated. Accordingly, I think it is now time for us to have a look at it together via Skype when Jim is back (Jim is away this week and should be back on Monday). This will also enable us to obtain very useful feedback from you.

The current state of the implementation provides a good foundation for the second use case, which is yet to be implemented. I hope to commence implementation of Phyre2 job submission and management from Jalview very soon. Once these are completed, they should plug easily into the current implementation in such a way as to realise the second use case.

Furthermore, some changes are required on the Phyre2 server end, and I have summarised them under the Change Request section below.
Features Enabled in the Current Prototype

* Support for loading Phyre2 result alignment with custom Jalview annotation file generated by Phyre2
* Detailed interactive sequence/structure display
* Linking of the template sequences to their 3D structure models on importing the annotation file
* Structure <--> Sequence mapping using the original fasta pair-wise alignment file generated by Phyre2
* Superposing, viewing and comparing two or more modelled structures
* Transfer of features and annotations from the modelled 3D structures onto their template sequence when viewed

Change Request

Changes to Phyre2 server

1. The outset alignment file to be opened by the generated launchapp.jnlp Jalview file should be changed from “query.jal” to “allhits.fasta”
2. For the sake of consistency, “query.jal.ann” should be renamed to “allhits.jal.ann”, since the “allhits.fasta” alignment file becomes the outset input alignment.
3. Drop the custom mapping files specifically generated for Jalview (i.e. “xxx.fasta.jal”)

Changes to the generated Jalview annotation file

1. Modify the file to reference the original Phyre2 fasta mapping files rather than the custom ones mentioned in point 3 above
2. Replace the name “QUERY” in STRUCTMODEL annotations with the actual name of the Query Sequence
3. Enhance the file to incorporate the HEADER_STRUCT_MODEL annotation. See notes below:

HEADER_STRUCT_MODEL annotation:

* The HEADER_STRUCT_MODEL annotation has been introduced to enable dynamic definition of the data columns for the STRUCTMODEL annotation.
* The HEADER_STRUCT_MODEL should be tab-delimited, just like STRUCTMODEL
* The HEADER_STRUCT_MODEL should be declared once, and MUST be declared before the first instance of STRUCTMODEL
* Each column of the HEADER_STRUCT_MODEL defines the corresponding column in the STRUCTMODEL annotations
* The first four columns of HEADER_STRUCT_MODEL (QUERY_SEQ, TEMPLATE_SEQ, MODEL_FILE, MAPPING_FILE) are compulsory and MUST be provided in the given order.
* Data in STRUCTMODEL can be formatted as HTML (as long as it contains no tab character)
* Current implementation in Jalview enables backward compatibility - older “.jal.ann” annotation files generated by Phyre2 server can still be processed.
* Any number of additional metadata columns can subsequently be included in any order. For instance, the metadata from the “crudlist” should be added to the annotation file as follows:

HEADER_STRUCT_MODEL	QUERY_SEQ	TEMPLATE_SEQ	MODEL_FILE	MAPPING_FILE	Confidence	% I.D	Aligned Range	Other Information
STRUCTMODEL	FER_CAPAN	c4n58A_	c4n58A_.1.pdb	c4n58A_.1.fasta	1	54	48-143	<b>PDB Header: </b>Hyrolase<br><b>Chain: </b>A
STRUCTMODEL	FER_CAPAN	d1a70a_	d1a70a_.2.pdb	d1a70a_.2.fasta	1	71	48-144	<b>Fold: </b>Beta-Grasp (ubiquitin-like)

Please let me know if you find anything unclear regarding the requested changes. Apologies for the lengthy mail!

Best regards,
Charles Ofoegbu

Tochukwu Charles
Jalview Visual Analytics Developer/Scientist
The Barton Group
Division of Computational Biology
School of Life Sciences
University of Dundee, Dundee, Scotland, UK.
Skype: cofoegbu
www.jalview.org<http://www.jalview.org/>
www.compbio.dundee.ac.uk<http://www.compbio.dundee.ac.uk/>

The University of Dundee is a registered Scottish Charity, No: SC015096

--
Dr. Lawrence Kelley
Structural Bioinformatics Group
Dept. Life Sciences
Imperial College London
[email protected]
www.sbg.bio.ic.ac.uk/people/kelley<http://www.sbg.bio.ic.ac.uk/people/kelley/>
Phyre2 Lead Developer
www.imperial.ac.uk/phyre2<http://www.imperial.ac.uk/phyre2>
Twitter for server news
twitter.com/phyre2server<http://twitter.com/phyre2server>
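
For anyone following the thread, here is a rough sketch (in Python, not Jalview's actual Java code) of how the HEADER_STRUCT_MODEL rules listed in the change request and the 'QUERY' fallback suggested in the top reply might fit together. The function names parse_struct_models and resolve_query_sequence are hypothetical. The column boundaries in the sample rows are inferred, since the tabs were lost in the archived mail. Treating header-less older files as having only the four compulsory columns is an assumption as well.

```python
# Sketch only: hypothetical helpers, not part of Jalview or Phyre2.
REQUIRED = ["QUERY_SEQ", "TEMPLATE_SEQ", "MODEL_FILE", "MAPPING_FILE"]


def parse_struct_models(lines):
    """Return one dict per STRUCTMODEL row, keyed by the header's column names."""
    header = None
    rows = []
    for raw in lines:
        fields = raw.rstrip("\r\n").split("\t")
        keyword, values = fields[0], fields[1:]
        if keyword == "HEADER_STRUCT_MODEL":
            if header is not None:
                raise ValueError("HEADER_STRUCT_MODEL declared more than once")
            if rows:
                raise ValueError("HEADER_STRUCT_MODEL must precede the first STRUCTMODEL")
            if values[:4] != REQUIRED:
                raise ValueError("first four columns must be " + ", ".join(REQUIRED))
            header = values
        elif keyword == "STRUCTMODEL":
            # Assumption: older files without a header carry only the four
            # compulsory columns; any extra values are dropped by zip().
            columns = header if header is not None else REQUIRED
            if len(values) < len(REQUIRED):
                raise ValueError("STRUCTMODEL row is missing compulsory columns")
            rows.append(dict(zip(columns, values)))
        # Any other annotation line is ignored by this sketch.
    return rows


def resolve_query_sequence(name, sequence_names):
    """Map an annotation's query name onto a sequence name in the alignment.

    Falls back to the first sequence when the name is the literal 'QUERY'
    and no sequence in the alignment carries that name.
    """
    if name in sequence_names:
        return name
    if name == "QUERY" and sequence_names:
        return sequence_names[0]
    return None


example = [
    "HEADER_STRUCT_MODEL\tQUERY_SEQ\tTEMPLATE_SEQ\tMODEL_FILE\tMAPPING_FILE"
    "\tConfidence\t% I.D\tAligned Range\tOther Information",
    "STRUCTMODEL\tFER_CAPAN\tc4n58A_\tc4n58A_.1.pdb\tc4n58A_.1.fasta"
    "\t1\t54\t48-143\t<b>PDB Header: </b>Hyrolase<br><b>Chain: </b>A",
    "STRUCTMODEL\tFER_CAPAN\td1a70a_\td1a70a_.2.pdb\td1a70a_.2.fasta"
    "\t1\t71\t48-144\t<b>Fold: </b>Beta-Grasp (ubiquitin-like)",
]
models = parse_struct_models(example)
```

With an alignment whose first sequence is named 'undefined' (the no-job-description case described above), resolve_query_sequence("QUERY", ["undefined", "c4n58A_"]) would pick 'undefined'. This relies on the query always being first in the alignment, which is exactly the point still to be confirmed.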
_______________________________________________
Jalview-dev mailing list
[email protected]
http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-dev
