My situation:
I have a list of number pairs that I need to find in another list, and I want to write the matches to a new file:
List 1: range_cors
range_cors[1:5]
['161:378', '334:3', '334:4', '65:436']
List 2: seq
seq[0:2]
['>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;', 'CACCCAGCTGGTCCTGTGGATGGGA']
A slow method:
sequences = []
for elem1 in range_cors:
    for index,elem2 in enumerate(seq):
        if elem1 in elem2:
            sequences.append(elem2)
            sequences.append(seq[index+1])
This process is very slow and it is taking a lot of time. I am not happy.
It looks like you really only want to search every other element of seq. You could speed your loop up by using an explicit iterator:
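for elem1 in range_cors:
    i = iter(seq)
    try:
        while True:
            # Take two items at a time: the header line and its sequence.
            # (next(i) is spelled i.next() in older Python 2 releases.)
            tag, data = next(i), next(i)
            if elem1 in tag:
                sequences.append(tag)
                sequences.append(data)
    except StopIteration:
        # The iterator is exhausted; go on to the next elem1.
        pass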
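As a rough sketch of the same every-other-element idea, here is a self-contained version that pairs the lines up with slices instead of an explicit iterator. It assumes seq strictly alternates header line, sequence line. The name match_pairs and the second sample record are made up for illustration; only the first record comes from the post.

```python
def match_pairs(range_cors, seq):
    """Collect each header line containing a coordinate, plus its sequence line."""
    # Headers sit at even indexes, sequence lines at odd indexes.
    pairs = list(zip(seq[0::2], seq[1::2]))
    sequences = []
    for elem1 in range_cors:
        for tag, data in pairs:
            if elem1 in tag:
                sequences.append(tag)
                sequences.append(data)
    return sequences


# First record is from the original post; the second is made up for illustration.
sample_seq = [
    '>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;',
    'CACCCAGCTGGTCCTGTGGATGGGA',
    '>probe:HG-U133A_2:1007_s_at:161:378; Interrogation_Position=3443; Antisense;',
    'GGGATTGCTGATCGATGCATGCATG',
]
found = match_pairs(['161:378', '334:3'], sample_seq)
```

This still does one pass over seq per item in range_cors, so it only halves the work of the original loop. It does not remove the outer loop.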
You don't say how long the sequences are. If range_cors is short enough you can use a single regex to do the search. (I don't actually know how short range_cors has to be or how this will break down if it is too long; this will probably work with 100 items in range_cors; it may only be limited by available memory; it may become slow to compile the regex when range_cors gets too big...) This will eliminate your outer loop entirely and I expect a substantial speedup. The code would look like this:
>>> import re
>>> range_cors = ['161:378', '334:3', '334:4', '65:436']

Make a pattern by escaping special characters in the search strings and joining them with '|':
>>> pat = '|'.join(map(re.escape, range_cors))
>>> pat
'161\\:378|334\\:3|334\\:4|65\\:436'
>>> pat = re.compile(pat)
Now you can use pat.search() to find matches:
>>> pat.search('123:456')
>>> pat.search('aaa161:378')
<_sre.SRE_Match object at 0x008DC8E0>

The complete search loop would look like this:
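i = iter(seq)
try:
    while True:
        # Take two items at a time: the header line and its sequence.
        # (next(i) is spelled i.next() in older Python 2 releases.)
        tag, data = next(i), next(i)
        if pat.search(tag):
            sequences.append(tag)
            sequences.append(data)
except StopIteration:
    # The iterator is exhausted; every pair has been checked.
    pass

Kent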
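For anyone who wants to try the regex approach end to end, here is a minimal self-contained sketch. It again assumes seq strictly alternates header line, sequence line. The name find_matches and the second sample record are hypothetical; only the first record is taken from the post.

```python
import re


def find_matches(range_cors, seq):
    """Return header/sequence pairs whose header contains any coordinate."""
    if not range_cors:
        # An empty alternation would compile to '' and match every header.
        return []
    # One pattern covering all coordinates, with special characters escaped.
    pat = re.compile('|'.join(map(re.escape, range_cors)))
    sequences = []
    it = iter(seq)
    # zip(it, it) draws two items per step from the same iterator,
    # so each step yields one (header, sequence) pair.
    for tag, data in zip(it, it):
        if pat.search(tag):
            sequences.append(tag)
            sequences.append(data)
    return sequences


# First record is from the original post; the second is made up for illustration.
records = [
    '>probe:HG-U133A_2:1007_s_at:416:177; Interrogation_Position=3330; Antisense;',
    'CACCCAGCTGGTCCTGTGGATGGGA',
    '>probe:HG-U133A_2:1007_s_at:161:378; Interrogation_Position=3443; Antisense;',
    'GGGATTGCTGATCGATGCATGCATG',
]
hits = find_matches(['161:378', '65:436'], records)
```

Two differences from the nested-loop version are worth knowing. The results come out in seq order rather than range_cors order, and a header that matches several coordinates is added only once. As with the plain `in` test, '334:3' will also match a header containing '334:35', since both are substring searches.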
A faster method (probably):
for i in range(len(range_cors)):
    for index,m in enumerate(seq):
        pat = re.compile(i)
        if re.search(pat,seq[m]):
            p.append(seq[m])
            p.append(seq[index+1])
I am getting errors because I am trying to use a list element as the pattern in re.compile().
Questions:
1. Is it possible to do this? If so, how can I do it?
Can anyone help correct my piece of code and point out where I went wrong?
Thank you in advance.
-K
_______________________________________________
Tutor maillist - [email protected]
http://mail.python.org/mailman/listinfo/tutor
