Re: [EMBOSS] Help to build a motif for fzzpro

2012-10-19 Thread Dr. Josef Maier - IStLS



Hello Aengus,


You could use preg with the following pattern, written as a regular expression:
([KRH][^DE][^DE][^DE])|([^DE][KRH][^DE][^DE])|([^DE][^DE][KRH][^DE])|([^DE][^DE][^DE][KRH])
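As a rough illustration of how this expression behaves, here is a minimal Python sketch. It uses Python's re module as a stand-in for preg; the helper name find_windows and the test sequence are made up for the example. The pattern is wrapped in a lookahead so that overlapping 4-residue windows are all reported, which a plain left-to-right re scan would skip.

```python
import re

# The four alternatives from the message above: one basic residue (K, R or H)
# at a fixed position, no acidic residue (D or E) at the other three.
MOTIF = (
    r"[KRH][^DE][^DE][^DE]|"
    r"[^DE][KRH][^DE][^DE]|"
    r"[^DE][^DE][KRH][^DE]|"
    r"[^DE][^DE][^DE][KRH]"
)

# Zero-width lookahead with a capture group, so overlapping windows are kept.
WINDOW = re.compile(rf"(?=({MOTIF}))")


def find_windows(seq):
    """Return (1-based start, window) for every matching 4-residue window."""
    return [(m.start() + 1, m.group(1)) for m in WINDOW.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the thread.
    for start, window in find_windows("MDEAKLGSDE"):
        print(start, window)
```

Note that [^DE] accepts any character other than D and E, so ambiguity codes such as X would also pass; tighten the class if that matters for your data.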


Alternatively you could search with four different PROSITE-style patterns using fuzzpro and 
combine the result tables:
[KRH]{DE}(3)
{DE}(1)[KRH]{DE}(2)
{DE}(2)[KRH]{DE}(1)
{DE}(3)[KRH]
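To make "combine the result tables" concrete, the following Python sketch translates a small subset of PROSITE syntax into regular expressions and merges the hits of four patterns into one sorted list. It does not call fuzzpro; prosite_to_regex and combined_hits are hypothetical helper names, and the fourth pattern is written as {DE}(3)[KRH] so that the basic residue sits at positions 1 to 4 in turn.

```python
import re


def prosite_to_regex(pattern):
    """Translate a small subset of PROSITE syntax into a Python regex.

    Handles [ABC], {ABC}, x, (n) and (n,m) repeats, '-' separators and a
    trailing '.'. Anchors (< and >) and other features are not covered.
    """
    out = []
    for char in pattern.rstrip("."):
        if char == "{":
            out.append("[^")   # {DE} -> [^DE]
        elif char == "}":
            out.append("]")
        elif char == "(":
            out.append("{")    # (3) -> {3}
        elif char == ")":
            out.append("}")
        elif char == "x":
            out.append(".")    # x -> any residue
        elif char == "-":
            continue           # separators carry no meaning in a regex
        else:
            out.append(char)
    return "".join(out)


# Basic residue at position 1, 2, 3 and 4 respectively.
PATTERNS = [
    "[KRH]{DE}(3)",
    "{DE}(1)[KRH]{DE}(2)",
    "{DE}(2)[KRH]{DE}(1)",
    "{DE}(3)[KRH]",
]


def combined_hits(seq, patterns=PATTERNS):
    """Union of the hits of all patterns as sorted (1-based start, match)."""
    hits = set()
    for pattern in patterns:
        # Lookahead so that overlapping matches are reported as well.
        regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
        hits.update((m.start() + 1, m.group(1)) for m in regex.finditer(seq))
    return sorted(hits)


if __name__ == "__main__":
    print(combined_hits("MDEAKLGSDE"))
```

Collecting the hits in a set removes the duplicates that arise when a window contains more than one basic residue and therefore matches several of the four patterns.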


For searching with combinations of PROSITE patterns, amino acid compositions and, 
optionally, AAINDEX profiles, we built a free web application for the University of Oslo, 
the SAPA tool:
http://sapa-tool.uio.no/sapa/index.php


For example, searching for a 4-letter subsequence with the combined PROSITE-style pattern 
[KRH].{DE}(4), where the dot operator means logical AND, lists all subsequences in that 
application that match both patterns.
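The logical AND can be pictured with a short Python sketch. This is only an illustration of the idea, not how SAPA is implemented (its internals are not described in this thread); the function name windows_matching_all and the test sequence are assumptions made for the example.

```python
import re


def windows_matching_all(seq, regexes, width=4):
    """Return (1-based start, window) for windows that match every regex."""
    compiled = [re.compile(regex) for regex in regexes]
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        # Logical AND: the window is kept only if all patterns match it.
        if all(regex.search(window) for regex in compiled):
            hits.append((i + 1, window))
    return hits


if __name__ == "__main__":
    # [KRH]   : at least one basic residue somewhere in the window
    # {DE}(4) : no acidic residue at any of the four positions
    both = ["[KRH]", "^[^DE]{4}$"]
    print(windows_matching_all("MDEAKLGSDE", both))
```

Expressed this way, the requirement "no acidic residue and at least one basic residue" needs two short patterns instead of four position-specific ones.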


Perhaps the ability to combine several PROSITE-style patterns with logical AND within a 
single fuzzpro search would be a useful extension to fuzzpro. The PROSITE pattern database 
often gives more than one pattern for a domain or functional site. Of course preg will do 
the job, but the PROSITE patterns then have to be rewritten as regular expressions.


Best regards,


Josef


Josef Maier, IStLS Information Services to Life Science



 Date: Thu, 18 Oct 2012 15:52:44 +0100
 From: Aengus Stewart aengus.stew...@cancer.org.uk
 Subject: [EMBOSS] Help to build a motif for fzzpro
 To: emboss@lists.open-bio.org
 Message-ID: 508017bc.80...@cancer.org.uk
 Content-Type: text/plain; charset=ISO-8859-1; format=flowed
 
 Hi all,
 
 I have been asked to build a motif, and the first part of the motif
 is:
 
 The first 4 AAs must not be acidic and 1 of them should be basic.
 
 I am not sure how to do this. So far I have
 
 {DE}(4)
 
 I am thinking I run with this and then post-filter the fuzzpro file?
 
 
 Any ideas gratefully accepted.
 
 
 Cheers
 Aengus
 
 



___
EMBOSS mailing list
EMBOSS@lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss


Re: [EMBOSS] Help to build a motif for fzzpro

2012-10-19 Thread Guy Bottu

Dear Aengus,

I have an idea of how to do it. You should of course make the 
motif/pattern as complete as possible, because with just {DE}(4) you will 
find far too much unless you search only a few sequences. Run fuzzpro 
with the parameter -rformat=listfile; the output is an EMBOSS list file. 
You can then run fuzzpro again with list::xxx as the sequence input (xxx 
being the name of your file) and @yyy as the pattern input, where yyy 
contains:


>basic_1
[HKR].
>basic_2
x-[HKR].
>basic_3
x-x-[HKR].
>basic_4
x-x-x-[HKR].

Because the input of the second run contains only the sequence fragments 
that matched the first pattern, a basic amino acid found there must lie in 
one of the first 4 positions.
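The two passes can be sketched as command lines. The Python snippet below only assembles and prints them; it does not run EMBOSS. The file names (db.fasta, pass1.list, basic.pat, pass2.fuzzpro) and the helper name two_pass_commands are placeholders chosen for the example, and the qualifiers used are -sequence, -pattern, -rformat and -outfile.

```python
import shlex


def two_pass_commands(database,
                      first_pattern="{DE}(4)",
                      list_file="pass1.list",
                      pattern_file="basic.pat",
                      final_report="pass2.fuzzpro"):
    """Build the argument lists for the two fuzzpro runs described above."""
    # Pass 1: find every fragment without acidic residues and write the
    # hits as an EMBOSS list file.
    pass1 = [
        "fuzzpro", "-sequence", database,
        "-pattern", first_pattern,
        "-rformat", "listfile",
        "-outfile", list_file,
    ]
    # Pass 2: search only those fragments, taking the patterns from a file.
    pass2 = [
        "fuzzpro", "-sequence", "list::" + list_file,
        "-pattern", "@" + pattern_file,
        "-outfile", final_report,
    ]
    return pass1, pass2


if __name__ == "__main__":
    for command in two_pass_commands("db.fasta"):
        # shlex.join quotes arguments such as {DE}(4) for a POSIX shell.
        print(shlex.join(command))
```

Printing the commands rather than executing them keeps the sketch usable without an EMBOSS installation; pass each list to subprocess.run to execute it for real.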


Regards,
Guy Bottu,
ex-collaborator of the Belgian EMBnet Node


Re: [EMBOSS] Help to build a motif for fzzpro

2012-10-19 Thread Peter Rice

On 19/10/2012 10:33, Dr. Josef Maier - IStLS wrote:

 Hello Aengus,

 You could use preg with the following pattern, written as a regular
 expression:
 ([KRH][^DE][^DE][^DE])|([^DE][KRH][^DE][^DE])|([^DE][^DE][KRH][^DE])|([^DE][^DE][^DE][KRH])

 Alternatively you could search with four different PROSITE-style
 patterns using fuzzpro and combine the result tables:
 [KRH]{DE}(3)
 {DE}(1)[KRH]{DE}(2)
 {DE}(2)[KRH]{DE}(1)
 {DE}(3)[KRH]


You can also put the four patterns in a file and use the syntax
-pattern @patternfile


% cat patternfile
>first
[KRH]{DE}(3)
>second
{DE}(1)[KRH]{DE}(2)
>third
{DE}(2)[KRH]{DE}(1)
>fourth
{DE}(3)[KRH]


 For searching with combinations of PROSITE patterns, amino acid
 compositions and, optionally, AAINDEX profiles, we built a free web
 application for the University of Oslo, the SAPA tool:
 http://sapa-tool.uio.no/sapa/index.php


Interesting. I'll take a look.


 For example, searching for a 4-letter subsequence with the combined
 PROSITE-style pattern [KRH].{DE}(4), where the dot operator means logical
 AND, lists all subsequences in that application that match both patterns.

 Perhaps the ability to combine several PROSITE-style patterns with
 logical AND within a single fuzzpro search would be a useful extension to
 fuzzpro. The PROSITE pattern database often gives more than one pattern
 for a domain or functional site. Of course preg will do the job, but the
 PROSITE patterns then have to be rewritten as regular expressions.


We also have a long-standing offer to revive scrutineer, written by 
Peter Sibbald at EMBL some years ago, but it would need translation from 
Pascal (not too hard to do). It loaded SwissProt into memory and had 
interesting ways to search for motif patterns.


regards,

Peter Rice
EMBOSS Team
