Tim Peters <t...@python.org> added the comment:

When I ran stringbench yesterday (or the day before - don't remember), almost all the benefit seemed to come from the "late match, 100 characters" tests. Seems similar for your run. Here are your results for just that batch, interleaving the two runs to make it easy to see: first line from the "before" run, second line from the "after" (PR) run, then a blank line. Lather, rinse, repeat.

These are dramatic speedups for the "search forward" cases. But there _also_ seem to be real (but much smaller) benefits for the "search backward" cases, which I don't recall seeing when I tried it. Do you have a guess as to why?

========== late match, 100 characters
bytes   unicode
(in ms) (in ms) ratio%=bytes/unicode*100
2.73    3.88    70.4    s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)
0.17    0.15    116.3   s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E") (*100)

2.01    3.54    56.8    s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)
0.89    0.87    101.8   s="ABC"*33; ((s+"D")*500+"E"+s).find("E"+s) (*100)

1.66    2.36    70.2    s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)
0.15    0.13    111.5   s="ABC"*33; (s+"E") in ((s+"D")*300+s+"E") (*100)

2.74    3.89    70.5    s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)
0.17    0.15    112.4   s="ABC"*33; ((s+"D")*500+s+"E").index(s+"E") (*100)

3.93    4.00    98.4    s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)
0.30    0.27    108.0   s="ABC"*33; ((s+"D")*500+s+"E").partition(s+"E") (*100)

3.99    4.59    86.8    s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)
3.13    2.51    124.5   s="ABC"*33; ("E"+s+("D"+s)*500).rfind("E"+s) (*100)

1.64    2.23    73.3    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)
1.54    1.82    84.5    s="ABC"*33; (s+"E"+("D"+s)*500).rfind(s+"E") (*100)

3.97    4.59    86.4    s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)
3.18    2.53    125.8   s="ABC"*33; ("E"+s+("D"+s)*500).rindex("E"+s) (*100)

4.69    4.67    100.3   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)
3.37    2.66    126.9   s="ABC"*33; ("E"+s+("D"+s)*500).rpartition("E"+s) (*100)

4.09    2.82    145.0   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)
3.39    2.62    129.5   s="ABC"*33; ("E"+s+("D"+s)*500).rsplit("E"+s, 1) (*100)

3.50    3.51    99.7    s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)
0.30    0.28    106.0   s="ABC"*33; ((s+"D")*500+s+"E").split(s+"E", 1) (*100)

Just for contrast, doesn't make much difference for the "late match, two characters" tests:

========== late match, two characters
0.44    0.58    76.2    ("AB"*300+"C").find("BC") (*1000)
0.57    0.48    120.2   ("AB"*300+"C").find("BC") (*1000)

0.59    0.73    80.5    ("AB"*300+"CA").find("CA") (*1000)
0.56    0.72    77.5    ("AB"*300+"CA").find("CA") (*1000)

0.55    0.49    112.5   "BC" in ("AB"*300+"C") (*1000)
0.66    0.37    177.7   "BC" in ("AB"*300+"C") (*1000)

0.45    0.58    76.5    ("AB"*300+"C").index("BC") (*1000)
0.57    0.49    116.5   ("AB"*300+"C").index("BC") (*1000)

0.61    0.62    98.6    ("AB"*300+"C").partition("BC") (*1000)
0.72    0.52    137.2   ("AB"*300+"C").partition("BC") (*1000)

0.62    0.64    96.4    ("C"+"AB"*300).rfind("CA") (*1000)
0.49    0.49    101.6   ("C"+"AB"*300).rfind("CA") (*1000)

0.57    0.65    87.5    ("BC"+"AB"*300).rfind("BC") (*1000)
0.51    0.57    89.3    ("BC"+"AB"*300).rfind("BC") (*1000)

0.62    0.64    96.5    ("C"+"AB"*300).rindex("CA") (*1000)
0.50    0.49    101.2   ("C"+"AB"*300).rindex("CA") (*1000)

0.68    0.69    99.0    ("C"+"AB"*300).rpartition("CA") (*1000)
0.61    0.54    113.5   ("C"+"AB"*300).rpartition("CA") (*1000)

0.82    0.60    137.8   ("C"+"AB"*300).rsplit("CA", 1) (*1000)
0.63    0.57    112.0   ("C"+"AB"*300).rsplit("CA", 1) (*1000)

0.63    0.61    103.0   ("AB"*300+"C").split("BC", 1) (*1000)
0.74    0.54    138.2   ("AB"*300+"C").split("BC", 1) (*1000)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41972>
_______________________________________