If you are interested in doing custom comparisons or looking for differences by functional area,
it turns out that the PROSITE rules are downloadable. They sent me
a link today:

ftp://ftp.expasy.org/databases/prosite/prosite.dat

I had to ignore their matrix data, but their pattern library was easily convertible into Perl
(as far as I have looked; the usual caveats about bugs apply, and the canned
C++ regex code I lifted from Microsoft may not be bug-free either), and
it gave me quick graphical and textual comparison results on 1000+ rules.
The point here is that you can make your own rules as you read the literature (that is my plan, anyway)
and implement ad hoc splicing or translation schemes (pretend you want to model flaky ribosomes).
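
A minimal sketch of that kind of conversion, in Python rather than the Perl and C++ mentioned above. It is not the code behind the tools in this post. The function names are made up for illustration, and the file layout it assumes (two-letter line codes, "ID   NAME; PATTERN." and "PA   ..." lines, entries closed by "//") is my reading of the prosite.dat format, so check it against the PROSITE manual. Matrix entries are skipped, and the rare case of ">" inside square brackets is not handled.

```python
import re

# One pattern element: x, a residue, [alternatives] or {exclusions},
# optionally followed by a repeat count such as (3) or (2,4).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}.' into a regex string."""
    out = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError("unsupported pattern element: %r" % element)
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                   # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # any residue except these
        repeat = ""
        if lo is not None:
            repeat = "{%s}" % lo if hi is None else "{%s,%s}" % (lo, hi)
        out.append(prefix + core + repeat + suffix)
    return "".join(out)


def iter_prosite_patterns(lines):
    """Yield (name, pattern) for PATTERN entries; MATRIX entries are ignored."""
    name, kind, parts = None, None, []
    for line in lines:
        code, body = line[:2], line[5:].strip()
        if code == "ID":
            name, _, kind = body.rstrip(".").partition("; ")
        elif code == "PA":
            parts.append(body)           # long patterns span several PA lines
        elif code == "//":
            if name and kind == "PATTERN" and parts:
                yield name, "".join(parts)
            name, kind, parts = None, None, []


if __name__ == "__main__":
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
```

With a local copy of the file, something like `for name, pa in iter_prosite_patterns(open("prosite.dat")): ...` would feed each pattern through `prosite_to_regex`. Your own literature-derived rules can go through the same function as long as they stick to this syntax.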
Anyway, I get stuff like this:

Translated rule matches generate rule hit files:

$ $progpath/rules_annotater -clean -which 1 -fastas o2_fasta -xrules $progpath/prosite_rules > pro1

$ $progpath/mm_align_tool -fastas o2_fasta -rules pro0 -rules pro1 -stats
For Rules set 0:>ref|NW_876253.1|Cfa11_WGA39_2:47189155-47195387 Canis familiaris chromosome 11 genomic contig, whole genome shotgun sequence
97         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER
68         >rule|3|PEPDTIDE Prosite PKC_PHOSPHO_SITE
64         >rule|6|PEPDTIDE Prosite MYRISTYL
47         >rule|4|PEPDTIDE Prosite CK2_PHOSPHO_SITE
46         >rule|11|PEPDTIDE Prosite PRENYLATION
30         >rule|1|PEPDTIDE Prosite ASN_GLYCOSYLATION
10         >rule|2|PEPDTIDE Prosite CAMP_PHOSPHO_SITE
10         >rule|5|PEPDTIDE Prosite TYR_PHOSPHO_SITE
6          >rule|7|PEPDTIDE Prosite AMIDATION
3          >rule|87|PEPDTIDE Prosite LEUCINE_ZIPPER
2          >rule|12|PEPDTIDE Prosite ER_TARGET
1          >rule|1087|PEPDTIDE Prosite THIONIN
1          >rule|973|PEPDTIDE Prosite TUBULIN_B_AUTOREG
For Rules set 1:>gb|AACN010493556.1|:1-1146 Canis familiaris ctg19866850213054,
whole genome shotgun sequence
23         >rule|13|PEPDTIDE Prosite MICROBODIES_CTER
9          >rule|1|PEPDTIDE Prosite ASN_GLYCOSYLATION
8          >rule|11|PEPDTIDE Prosite PRENYLATION
8          >rule|3|PEPDTIDE Prosite PKC_PHOSPHO_SITE
8          >rule|6|PEPDTIDE Prosite MYRISTYL
7          >rule|4|PEPDTIDE Prosite CK2_PHOSPHO_SITE
2          >rule|5|PEPDTIDE Prosite TYR_PHOSPHO_SITE
1          >rule|12|PEPDTIDE Prosite ER_TARGET
1          >rule|7|PEPDTIDE Prosite AMIDATION
1          >rule|87|PEPDTIDE Prosite LEUCINE_ZIPPER
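
For readers who want to reproduce a tally like the one above without these tools, here is a rough Python sketch. It is an assumption about how such counts could be produced, not a description of what mm_align_tool does. The function name, the rule names and the peptide string are all invented for the example. It counts overlapping matches of each regex in one peptide string and lists the rules with at least one hit, highest count first.

```python
import re


def count_rule_hits(peptide, rules):
    """Return [(rule_name, hit_count), ...] sorted by descending count.

    `rules` maps a rule name to a regex string. Overlapping matches are
    counted by wrapping each regex in a zero-width lookahead.
    """
    counts = {}
    for name, regex in rules.items():
        hits = sum(1 for _ in re.finditer("(?=(?:%s))" % regex, peptide))
        if hits:
            counts[name] = hits
    # Highest count first; ties broken alphabetically by rule name.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical rule set and peptide, for illustration only.
    rules = {
        "GLYCO_LIKE": "N[^P][ST][^P]",
        "PKC_LIKE": "[ST].[RK]",
        "NO_HIT": "WWWW",
    }
    for name, hits in count_rule_hits("NGSANATQSARTKK", rules):
        print("%-11d%s" % (hits, name))
```

Running the same function over the translations of two sequences gives two lists that can be compared rule by rule, which is the kind of side-by-side view shown in the stats output.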


This turned out to be easy to align, as the sequences are largely identical (the lone "G" is
the mismatch in this excerpt), but you get the idea:

$ $progpath/mm_align_tool -fastas o2_fasta -rules pro0 -rules pro1 -use_rule 13 -align -output text
[...]
Start at 696 and 2373:
         GGCCATTTTGCAACTCATGCATGAGCTACCTTTAGTTCCCCTTCTACATCTGAGAACTGT

         CCCATATAGAATATTTTATAAAACAAGATGGCATTGTGCTAAGTAAAATGCAGAACAAAA
                                    G
         TCAGTATCCCATTAGACATGTCATATTCAGAGTTTATTTTTATCCTTGCACTGAAAGAAT

         GATTGTAAATCAATGGTTTCTTTTTGTTTCTTGACTGTGGCAGTGTTCTGGCTCCAAATG

         ATGGAGATTCCAAATAAGCATTACAGCTTGGCAGGAAATGCCAGTTCAGATATTTGTGAG

         ATCCTAAAGAATAGATCTGGACACATAT


_______________________________________________
General Forum at Bioinformatics.Org - BiO_Bulletin_Board@bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
