> I wrote my previous message before reading this. Thank you for the test you
> ran -- it answers the question of performance. You show that re.finditer is
> 30x faster, so that certainly recommends that over a simple loop, which
> introduces looping overhead.
>> def using_simple_loop(key, text):
>>     matches = []
>>     for i in range(len(text)):
>>         if text[i:].startswith(key):
>>             matches.append((i, i + len(key)))
>>     return matches
>>
>> using_simple_loop: [0.13952950000020792, 0.13063130000000456,
>> 0.12803450000001249, 0.13186180000002423, 0.13084610000032626]
>> using_re_finditer: [0.003861400000005233, 0.004061900000124297,
>> 0.003478999999970256, 0.003413100000216218, 0.0037320000001273]
With a slight tweak to the simple loop, using str.find() to jump straight to
each match instead of testing every index, it takes about 25% less time than
the RE version (roughly 1.3x faster), though.
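
def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    # str.find() returns -1 once there are no more matches at or after start.
    while (foundSpot := text.find(key, start)) > -1:
        # Resume searching just past the end of this match.
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches
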
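One thing worth checking before comparing timings: because start jumps past each match, the .find() version returns non-overlapping matches, whereas the original slicing loop also reports overlapping ones. Here is a rough sketch of that difference. The key "aa" and text "aaaa" are made-up sample data, and using_re_finditer is only my guess at what the RE version looks like, since its code isn't quoted here.

```python
import re


def using_simple_loop(key, text):
    # Original approach: test every index, so overlapping matches are found.
    matches = []
    for i in range(len(text)):
        if text[i:].startswith(key):
            matches.append((i, i + len(key)))
    return matches


def using_re_finditer(key, text):
    # Assumed shape of the RE version; it is not shown in this thread.
    return [m.span() for m in re.finditer(re.escape(key), text)]


def using_simple_loop2(key, text):
    # The .find() version: resumes after each match, so no overlaps.
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


overlapping = using_simple_loop("aa", "aaaa")
non_overlapping = using_simple_loop2("aa", "aaaa")
regex_result = using_re_finditer("aa", "aaaa")

print(overlapping)      # [(0, 2), (1, 3), (2, 4)]
print(non_overlapping)  # [(0, 2), (2, 4)]
print(regex_result)     # [(0, 2), (2, 4)]
```

So the .find() loop and the (assumed) finditer version agree with each other, but not with the slicing loop when matches can overlap. Note also that an empty key would make the .find() loop spin forever, because start never advances; a guard such as "if not key: return []" at the top is one way to avoid that.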
using_simple_loop: [0.1732664997689426, 0.1601669997908175,
0.15792609984055161, 0.1573973000049591, 0.15759290009737015]
using_re_finditer: [0.003412699792534113, 0.0032823001965880394,
0.0033694999292492867, 0.003354900050908327, 0.0033336998894810677]
using_simple_loop2: [0.00256159994751215, 0.0025471001863479614,
0.0025424999184906483, 0.0025831996463239193, 0.0025555999018251896]
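For anyone who wants to reproduce numbers like these, below is a minimal sketch of a timing harness. It is not the harness used for the figures above: the sample key and text, the repeat/number values, and the body of using_re_finditer are all assumptions on my part.

```python
import re
import timeit


def using_re_finditer(key, text):
    # Assumed shape of the RE version; it is not shown in this thread.
    return [m.span() for m in re.finditer(re.escape(key), text)]


def using_simple_loop2(key, text):
    matches = []
    keyLen = len(key)
    start = 0
    while (foundSpot := text.find(key, start)) > -1:
        start = foundSpot + keyLen
        matches.append((foundSpot, start))
    return matches


# Hypothetical test data: 200 copies of the key, each preceded by filler.
key = "needle"
text = ("hay" * 50 + key) * 200

# Check that both versions agree before timing them.
expected = using_simple_loop2(key, text)
assert using_re_finditer(key, text) == expected

for func in (using_re_finditer, using_simple_loop2):
    # Five runs of ten calls each; each entry is the total time for one run.
    times = timeit.repeat(lambda: func(key, text), repeat=5, number=10)
    print(f"{func.__name__}: {times}")
```

The absolute times will depend on the machine, the Python version, and the data, so only the relative ordering is worth reading much into.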