On Fri, 2024-11-01 at 14:08 +0100, Andreas Karlsson wrote:
> > Agreed -- a lot of work has gone into optimizing the regex code, and we
> > don't want a perf regression there. But I'm also not sure exactly which
> > kinds of tests I should be running for that.
>
> I think we should at least try to find the worst case to see how big the
> performance hit for that is. And then after that try to figure out a
> more typical case benchmark.

What I had in mind was:

  * a large table with a single ~100KiB text field
  * a scan with a case insensitive regex that uses some character
    classes

Does that sound like a worst case?

> The painful part was mostly just a reference to that without a catalog
> table where new providers can be added we would need to add collations
> for our new custom provider on some already existing provider and then
> do for example some pattern matching on the name of the new collation.
> Really ugly but works.

To add a catalog table for the locale providers, the main challenge is
around the database default collation and, relatedly, initdb. Do you
have some ideas around that?

Regards,
    Jeff Davis