[Tutor] how to do systematic searching in dictionary and printing it

Srinivas Iyyer Thu, 20 Oct 2005 09:34:14 -0700

Dear group,

I have two text files that look like this:


File a1.txt:
>a1
TTAATTGGAACA
>a2
AGGACAAGGATA
>a3
TTAAGGAACAAA



File b1.txt:
>b1
TTAATTGGAACA
>b2
AGGTCAAGGATA
>b3
AAGGCCAATTAA


I want to check whether the two files share any sequences
(the ATGC lines). Here a1 and b1 have identical sequences,
and I want to select them and print their headers (the lines
starting with the > symbol), separated by a tab:

a1 '\t' b1



Here:
A line of the form >XXXXX is called the header, and the
line that follows it is the sequence. In bioinformatics,
this is called the FASTA format. What I am doing here is
matching the sequences (these are always 25-mers in this
instance) and, if they match, asking Python to write
header + '\t' + header


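# a is the list of lines from a1.txt: header, sequence, header, sequence, ...
ak = a[1::2]                  # the sequences
av = a[::2]                   # the headers
seq_dict = dict(zip(ak,av))   # maps sequence -> header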

**************************************
>>>seq_dict
{'TTAAGGAACAAA': '>a3', 'AGGACAAGGATA': '>a2',
'TTAATTGGAACA': '>a1'}
**************************************
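
For anyone reproducing this, here is a minimal sketch of how a list like `a` might be built before slicing it. It is an illustration only: the helper name `read_fasta_lines` is hypothetical, the file contents are inlined as a string so the snippet runs on its own, and it assumes every sequence sits on exactly one line. It is written for Python 3, whereas the session in this post uses Python 2.

```python
# Hypothetical helper (not from the original post): turn FASTA-style text
# into a flat list of non-empty, whitespace-stripped lines.
def read_fasta_lines(text):
    return [line.strip() for line in text.splitlines() if line.strip()]

# Contents of a1.txt, inlined here so the example is self-contained.
a1_text = """>a1
TTAATTGGAACA
>a2
AGGACAAGGATA
>a3
TTAAGGAACAAA
"""

a = read_fasta_lines(a1_text)

headers = a[::2]      # even positions: the '>' header lines
sequences = a[1::2]   # odd positions: the sequence lines

# Map each sequence to its header. If a sequence occurred twice in one
# file, the later header would silently replace the earlier one.
seq_dict = dict(zip(sequences, headers))
print(seq_dict)
```

With a real file, the string could be replaced by the result of reading a1.txt. The slicing only works while headers and sequences strictly alternate.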



bv = b[1::2]   # the sequences from b1.txt (b holds its lines, like a)

***************************************
>>>bv
['TTAATTGGAACA', 'AGGTCAAGGATA', 'AAGGCCAATTAA']


>>>for i in bv:
        if seq_dict.has_key(i):
                print seq_dict[i]

                
>a1

***************************************

Here a1 is the only common element.

However, I am having difficulty also printing that b1 is
the header whose sequence is identical to a1's.


How do I take b and do this search? It was easy for me
to get the sequence part by doing

b[1::2]. However, I want to print that the b1 header has
the same sequence as a1:

a1 +'\t'+b1

Is there any way I can do this? This is probably very simple,
but due to a mental block I am unable to work it out.
Can anyone please help me out?
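
To make the desired result concrete, here is one possible sketch, not necessarily the best approach. It assumes `a` and `b` are the flat header/sequence lists described above, and that the leading `>` should be dropped from the printed headers (the example output `a1 '\t' b1` suggests this, but it is an assumption). It is written for Python 3. The idea is to walk through b's headers and sequences together, so the b header is still at hand when a sequence is found in the dictionary.

```python
# The two files from this post, as flat lists of lines.
a = ['>a1', 'TTAATTGGAACA', '>a2', 'AGGACAAGGATA', '>a3', 'TTAAGGAACAAA']
b = ['>b1', 'TTAATTGGAACA', '>b2', 'AGGTCAAGGATA', '>b3', 'AAGGCCAATTAA']

# Sequence -> header lookup for file a, as in the post.
seq_dict = dict(zip(a[1::2], a[::2]))

matches = []
# Pair each b header with its own sequence instead of keeping only b[1::2].
for b_header, b_seq in zip(b[::2], b[1::2]):
    if b_seq in seq_dict:
        a_header = seq_dict[b_seq]
        # Assumption: strip the leading '>' before printing.
        matches.append(a_header.lstrip('>') + '\t' + b_header.lstrip('>'))

for line in matches:
    print(line)
```

For the two example files this finds a single pair, a1 and b1. If one sequence could appear under several headers in the same file, the dictionary would need to map each sequence to a list of headers instead.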

Thanks



        
                
