[issue41972] bytes.find consistently hangs in a particular scenario

Tim Peters Sat, 17 Oct 2020 13:52:00 -0700


Tim Peters <[email protected]> added the comment:


When I ran stringbench yesterday (or the day before - don't remember), almost 
all the benefit seemed to come from the "late match, 100 characters" tests.  
Seems similar for your run.  Here are your results for just that batch, 
interleaving the two runs to make it easy to see:  first line from the "before" 
run, second line from the "after" (PR) run, then a blank line.  Lather, rinse, 
repeat.

These are dramatic speedups for the "search forward" cases.  But there _also_ 
seem to be real (but much smaller) benefits for the "search backward" cases, 
which I don't recall seeing when I tried it.  Do you have a guess as to why?

========== late match, 100 characters
bytes   unicode
(in ms) (in ms) ratio%=bytes/unicode*100
2.73    3.88    70.4    s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
0.17    0.15    116.3   s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)

2.01    3.54    56.8    s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
0.89    0.87    101.8   s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)

1.66    2.36    70.2    s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
0.15    0.13    111.5   s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)

2.74    3.89    70.5    s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
0.17    0.15    112.4   s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)

3.93    4.00    98.4    s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
0.30    0.27    108.0   s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)

3.99    4.59    86.8    s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
3.13    2.51    124.5   s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)

1.64    2.23    73.3    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
1.54    1.82    84.5    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)

3.97    4.59    86.4    s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
3.18    2.53    125.8   s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)

4.69    4.67    100.3   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
3.37    2.66    126.9   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)

4.09    2.82    145.0   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
3.39    2.62    129.5   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)

3.50    3.51    99.7    s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
0.30    0.28    106.0   s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
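
To put numbers on "dramatic" versus "much smaller", here is a small sketch that divides the "before" time by the "after" time for one forward and one backward row of the table above. The timings are copied from the table; the dictionary layout and the helper name `speedup` are just illustrative choices, not anything stringbench provides.

```python
# (before_ms, after_ms) pairs copied from the "late match, 100 characters"
# table above: first number of each pair is the "before" run, second is
# the "after" (PR) run.
TIMINGS = {
    "find  (forward),  bytes":   (2.73, 0.17),
    "find  (forward),  unicode": (3.88, 0.15),
    "rfind (backward), bytes":   (3.99, 3.13),
    "rfind (backward), unicode": (4.59, 2.51),
}


def speedup(before_ms, after_ms):
    """Return how many times faster the "after" run is than the "before" run."""
    return before_ms / after_ms


if __name__ == "__main__":
    for label, (before_ms, after_ms) in TIMINGS.items():
        print(f"{label:28s} {speedup(before_ms, after_ms):5.1f}x")
```

On these rows the forward searches come out an order of magnitude faster, while the backward searches improve by well under 2x.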

Just for contrast, the PR doesn't make much difference for the "late match, 
two characters" tests:

========== late match, two characters
bytes   unicode
(in ms) (in ms) ratio%=bytes/unicode*100
0.44    0.58    76.2    ("AB"*300+"C").find("BC") (*1000)
0.57    0.48    120.2   ("AB"*300+"C").find("BC") (*1000)

0.59    0.73    80.5    ("AB"*300+"CA").find("CA") (*1000)
0.56    0.72    77.5    ("AB"*300+"CA").find("CA") (*1000)

0.55    0.49    112.5   "BC" in ("AB"*300+"C") (*1000)
0.66    0.37    177.7   "BC" in ("AB"*300+"C") (*1000)

0.45    0.58    76.5    ("AB"*300+"C").index("BC") (*1000)
0.57    0.49    116.5   ("AB"*300+"C").index("BC") (*1000)

0.61    0.62    98.6    ("AB"*300+"C").partition("BC") (*1000)
0.72    0.52    137.2   ("AB"*300+"C").partition("BC") (*1000)

0.62    0.64    96.4    ("C"+"AB"*300).rfind("CA") (*1000)
0.49    0.49    101.6   ("C"+"AB"*300).rfind("CA") (*1000)

0.57    0.65    87.5    ("BC"+"AB"*300).rfind("BC") (*1000)
0.51    0.57    89.3    ("BC"+"AB"*300).rfind("BC") (*1000)

0.62    0.64    96.5    ("C"+"AB"*300).rindex("CA") (*1000)
0.50    0.49    101.2   ("C"+"AB"*300).rindex("CA") (*1000)

0.68    0.69    99.0    ("C"+"AB"*300).rpartition("CA") (*1000)
0.61    0.54    113.5   ("C"+"AB"*300).rpartition("CA") (*1000)

0.82    0.60    137.8   ("C"+"AB"*300).rsplit("CA", 1) (*1000)
0.63    0.57    112.0   ("C"+"AB"*300).rsplit("CA", 1) (*1000)

0.63    0.61    103.0   ("AB"*300+"C").split("BC", 1) (*1000)
0.74    0.54    138.2   ("AB"*300+"C").split("BC", 1) (*1000)
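
For anyone who wants to poke at the forward/backward difference without running all of stringbench, here is a minimal sketch using the standard `timeit` module. It is not stringbench itself: the helper names `make_cases` and `time_ms` are made up for illustration, and it builds the haystack once up front so that only the search call is timed, which is an assumption about what is worth measuring rather than a copy of stringbench's setup. Absolute numbers will differ from the tables above.

```python
import timeit


def make_cases(kind=str):
    """Build the 100-character "late match" haystack/needle pairs.

    The patterns follow the benchmark descriptions quoted above; `kind`
    selects str or bytes operands.
    """
    conv = (lambda t: t.encode("ascii")) if kind is bytes else (lambda t: t)
    s = "ABC" * 33
    return {
        # Forward search: the only match sits at the very end.
        "find": (conv((s + "D") * 500 + s + "E"), conv(s + "E")),
        # Backward search: the only match sits at the very start.
        "rfind": (conv("E" + s + ("D" + s) * 500), conv("E" + s)),
    }


def time_ms(method, hay, needle, number=100, repeat=5):
    """Best-of-`repeat` time, in ms, for `number` calls of hay.<method>(needle)."""
    func = getattr(hay, method)
    best = min(timeit.repeat(lambda: func(needle), number=number, repeat=repeat))
    return best * 1000.0


if __name__ == "__main__":
    for kind in (bytes, str):
        for method, (hay, needle) in make_cases(kind).items():
            ms = time_ms(method, hay, needle)
            print(f"{kind.__name__:5s} {method:5s} {ms:8.2f} ms (*100)")
```

Running this under an interpreter built without the PR and again under one built with it gives "before" and "after" numbers that can be interleaved the same way as in the tables above.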

----------

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue41972>
_______________________________________