I wrote:

> 0001 puts the main implementation of pg_utf_mblen() into an inline
> function and uses this in pg_mblen(). This is somewhat faster in the
> strpos tests, so that gives some measure of the speedup expected for
> other callers. Text search seems to call this a lot, so this might
> have noticeable benefit.
>
> 0002 refactors text_position_get_match_pos() to use
> pg_mbstrlen_with_len(). This itself is significantly faster when
> combined with 0001, likely because the latter can inline the call to
> pg_mblen(). The intention is to speed up more than just text_position.
>
> 0003 explicitly specializes for the inline version of pg_utf_mblen()
> into pg_mbstrlen_with_len(), but turns out to be almost as slow as
> master for ascii. It doesn't help if I undo the previous change in
> pg_mblen(), and I haven't investigated why yet.
>
> 0002 looks good now, but the experience with 0003 makes me hesitant to
> propose this seriously until I can figure out what's going on there.
>
> The test is as earlier, a worst-case substring search, times in milliseconds.
>
>  patch  | no match | ascii | multibyte
> --------+----------+-------+-----------
>  PG11   |     1220 |  1220 |      1150
>  master |      385 |  2420 |      1980
>  0001   |      390 |  2180 |      1670
>  0002   |      389 |  1330 |      1100
>  0003   |      391 |  2100 |      1360
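
For anyone skimming the thread, the general shape of 0001 and 0002 can be sketched roughly as below. This is a standalone illustration and not the patches themselves: the names utf8_mblen_sketch() and mbstrlen_with_len_sketch() are made up, the code handles UTF-8 only, and it assumes valid, NUL-terminated input.

```c
/*
 * Hypothetical stand-in for an inlinable UTF-8 length function:
 * returns the byte length of the character starting at *s, judged
 * from the lead byte alone.
 */
static inline int
utf8_mblen_sketch(const unsigned char *s)
{
	if ((*s & 0x80) == 0)
		return 1;				/* ASCII */
	else if ((*s & 0xe0) == 0xc0)
		return 2;
	else if ((*s & 0xf0) == 0xe0)
		return 3;
	else if ((*s & 0xf8) == 0xf0)
		return 4;
	return 1;					/* invalid lead byte: step over one byte */
}

/*
 * Hypothetical stand-in for a "length with limit" loop: counts the
 * characters found in the first 'limit' bytes of mbstr. Because the
 * helper above is static inline, the compiler is free to fold it into
 * this loop instead of emitting a call per character.
 *
 * Assumes valid input; a truncated multibyte sequence at the end of
 * the string could step past the terminator.
 */
static int
mbstrlen_with_len_sketch(const char *mbstr, int limit)
{
	int			len = 0;

	while (limit > 0 && *mbstr)
	{
		int			l = utf8_mblen_sketch((const unsigned char *) mbstr);

		limit -= l;
		mbstr += l;
		len++;
	}
	return len;
}
```

A caller that needs the character position of a byte offset can then pass the start of the string and the offset as the limit, which is roughly what a substring search has to do to report a match position.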

I tried this test on a newer CPU, and 0003 had no regression. Both
systems used gcc 11.2. Rather than try to figure out why an experiment
had unexpected behavior, I plan to test 0001 and 0002 on a couple
different compilers/architectures and call it a day. It's also worth
noting that 0002 by itself seemed to be decently faster on the newer
machine, but not as fast as 0001 and 0002 together.

Looking at the assembly, pg_mblen is inlined into
pg_mbstrlen_[with_len] and pg_mbcliplen, so the specialization for
utf-8 in 0001 would be inlined in the other 3 as well. That's only a
few bytes, so I think it's fine.

--
John Naylor
EDB: http://www.enterprisedb.com