Hi Natasha.
On Wed Dec 5 13:54:56 2012, Nataliya Sherstneva wrote:
> I've done output for a case like this:
>
>
> #=GS DR Uniprot; O31698
>
ok. that should be fine.
> Could you please clarify about Pfam AC numbers. So if I recognized a
> file keeps alignment from Pfam, where protein IDs from the file should
> go, to DBRefEntry or anywhere else?
All accessions should go into a DBRefEntry - Jalview does some
normalisation, to remove redundant accessions, but it needs looking at
to make sure (basically, everything works on the version number in the
DBRefEntry).
>
> In some files in Stockholm Format there is a line like this
> #=GF AC PF...
> so, I know that this is the Pfam database. It's clear for me.
>
> Also I've noticed, that when we generate file in Pfam in Stockholm
> format, this file doesn't have a line like
> #=GF AC PF...
The reason for this is that the =GF tag is for *alignment wide*
annotations. In the Pfam parser, you'll see that these annotations are
added to the AlignmentI object via the 'setAlignmentProperty(..,..)'
method. For the 'print()' method, you should first generate =GF type
annotation from the key value pairs obtained from
AlignmentI.getProperty().
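
As a rough sketch of that print() step (plain Java, not the actual Jalview code): assuming the alignment-wide properties are available as a String-to-String map, each pair becomes one '#=GF' line. The class and method names here (GfLineSketch, toGfLines) and the 'PFxxxxx' placeholder value are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Jalview: turns alignment-wide
// key/value properties into Stockholm "#=GF" annotation lines.
final class GfLineSketch {

    static List<String> toGfLines(Map<String, String> props) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            String key = e.getKey();
            String value = e.getValue();
            // Skip entries that cannot be written as "#=GF <tag> <text>".
            if (key == null || key.isEmpty() || value == null) {
                continue;
            }
            lines.add("#=GF " + key + " " + value);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumption: properties arrive in the order they should be written.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("AC", "PFxxxxx"); // placeholder accession, not a real one
        props.put("ID", "Example");
        for (String line : toGfLines(props)) {
            System.out.println(line);
        }
    }
}
```

The map is iterated in insertion order here, so a LinkedHashMap keeps the =GF lines in the order they were read.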
>
> so in this case, should I analyze AC from this:
>
> #=GS A7IWM5_PBCVN/3-170 AC A7IWM5.1
>
> by using regex like [A-Z][0-9][A-Z0-9]{4} .?
Yes. Except you'll also want to capture the '1' after the '.' as the
version number (there's a field in DBRefEntry for this). However,
that's only the case for Pfam. In the case of Rfam it will very
definitely depend on where the accession has come from (sometimes,
database accessions include '.1' or '.2' to indicate the first or
second processed product, or to refer to a subpart of the parent
accession).
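
For the Pfam case, something along these lines might do (a sketch only, in plain Java with no Jalview classes): it reuses the accession pattern from your mail and adds an optional '.N' group for the version. The class name GsAccessionSketch and the String[] return shape are just for illustration; in Jalview the accession and version would go into the DBRefEntry instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "#=GS <seq id> AC <accession>[.<version>]" lines.
final class GsAccessionSketch {

    // Accession pattern from the thread, followed by an optional ".digits" version.
    private static final Pattern GS_AC = Pattern.compile(
            "^#=GS\\s+(\\S+)\\s+AC\\s+([A-Z][0-9][A-Z0-9]{4})(?:\\.(\\d+))?\\s*$");

    /**
     * Returns {sequenceId, accession, version}, or null if the line is not
     * a GS AC line of this shape. The version is "" when no ".N" is present.
     */
    static String[] parse(String line) {
        Matcher m = GS_AC.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String version = (m.group(3) == null) ? "" : m.group(3);
        return new String[] { m.group(1), m.group(2), version };
    }

    public static void main(String[] args) {
        String[] r = parse("#=GS A7IWM5_PBCVN/3-170 AC A7IWM5.1");
        System.out.println("id=" + r[0] + " accession=" + r[1] + " version=" + r[2]);
    }
}
```

This treats the '.N' suffix as a version, which per the above only holds for Pfam; for Rfam the suffix would need to be interpreted according to the source database.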
With regard to testing the generation/import of these GS AC records -
I'm not sure how useful it is to regenerate these bare GS AC records.
They are only really valid in the context of the parent alignment's own
accession code (e.g. PF or RF), so Jalview should only generate them if
they came from an alignment with a PF/RF source.
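
One possible way to express that rule, purely as a sketch: assume the writer can get at the alignment's own AC value (the =GF AC property) as a string, and only emit GS AC lines when it starts with PF or RF. The name GsAcPolicySketch and the prefix check are assumptions for illustration, not existing Jalview behaviour.

```java
// Hypothetical policy check: write per-sequence "#=GS ... AC" records only
// when the parent alignment carries a Pfam (PF) or Rfam (RF) accession.
final class GsAcPolicySketch {

    static boolean shouldWriteGsAc(String alignmentAccession) {
        if (alignmentAccession == null) {
            return false; // no alignment-wide accession at all
        }
        String ac = alignmentAccession.trim();
        return ac.startsWith("PF") || ac.startsWith("RF");
    }

    public static void main(String[] args) {
        System.out.println(shouldWriteGsAc("PFxxxxx")); // placeholder accession
        System.out.println(shouldWriteGsAc(null));
    }
}
```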
something to ponder over the weekend...
Jim.
_______________________________________________
Jalview-dev mailing list
http://www.compbio.dundee.ac.uk/mailman/listinfo/jalview-dev