On Thu, 29 Jun 2006, Apparao Anakapalli wrote:

> pattern = 'ATTTA'
>
> I want to find the pattern in the sequence and count.
>
> For instance in 'a' there are two 'ATTTA's.

Use re.findall:

>>> import re
>>> pat = "ATTTA"
>>> rexp = re.compile(pat)
>>> a = "TCCCTGCGGCGCATGAGTGACTGGCGTATTTAGCCCGTCACATTTA"
>>> print(len(re.findall(rexp, a)))
2
>>> b = "CCTGCGGCGCATGAGTGACTGGCGTATTTAGCCCGTCACAATTTAA"
>>> print(len(re.findall(rexp, b)))
2

Be aware, though, that findall finds only non-overlapping occurrences: after
each match it resumes searching at the end of that match. If overlapping
occurrences are important to you, it will undercount:

>>> c = "ATTTATTTA"
>>> print(len(re.findall(rexp, c)))
1

The following function will find all occurrences, even if they overlap:

def findall_overlap(regex, seq):
    # regex must be a compiled pattern, e.g. the result of re.compile()
    resultlist = []
    pos = 0
    while True:
        result = regex.search(seq, pos)
        if result is None:
            break
        resultlist.append(seq[result.start():result.end()])
        # resume one character past the start of this match,
        # not at its end, so overlapping matches are still found
        pos = result.start() + 1
    return resultlist

For example:

>>> print(len(findall_overlap(rexp, c)))
2
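
As an alternative to the loop above (this is not part of the original reply, just a sketch), a zero-width lookahead lets the re module report overlapping matches by itself. The function name overlapping_starts is made up for illustration, and the sketch assumes the pattern is a plain literal string such as "ATTTA", which is why it is passed through re.escape.

```python
import re

def overlapping_starts(pattern, seq):
    """Return the start index of every occurrence of a literal pattern in seq,
    including occurrences that overlap one another."""
    # Assumption: pattern is literal text, so escape any regex metacharacters.
    # (?=...) is a lookahead: it matches at a position without consuming
    # characters, so the scan moves on one position at a time.
    lookahead = re.compile("(?=" + re.escape(pattern) + ")")
    return [m.start() for m in lookahead.finditer(seq)]
```

For the sequence "ATTTATTTA", overlapping_starts("ATTTA", "ATTTATTTA") returns [0, 4], so taking len() of the result gives the overlapping count of 2. Returning positions rather than the matched text is a design choice of this sketch: with a lookahead the match itself is empty, and the positions are often what you want for sequence work anyway.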