On Sat, Sep 19, 2020 at 04:09:27PM -0700, Mark Dilger wrote:
> I am marking this ready for committer. I didn't object to the
> whitespace weirdness in your patch (about which `git apply`
> grumbles) since you seem to have done that intentionally. I have no
> further comments on the performance issue, since I don't have any
> other platforms at hand to test it on. Whichever committer picks
> this up can decide if the issue matters to them enough to punt it
> back for further performance testing.

About 0001, the new set of multipliers looks fine to me. It does grow
the table in kwlist_d.h by one item, from 901 to 902, because 901 is
divisible by 17, but I don't think that is much of a bother. As
mentioned, this impacts none of the other tables, which are much
smaller, and the size goes back to normal once a new keyword is
added. Being able to generate perfect hash functions for much larger
sets is a nice property to have. While on it, I also looked at the
assembly code generated by gcc -O2 for keywords.c & co and I have not
spotted any huge difference. So I'd like to apply this first if there
are no objections.

I have tested 0002 and 0003, which had better be merged together in
the end, and I can see performance improvements with MSVC and gcc
similar to what is being reported upthread, with gains of 20~30% for
a simple data sample using IS NFC/NFKC. That's cool.

Including unicode_normprops_table.h in the set of files ignored by
pgindent is also fine in the end, even with the changes that bring
the generated structures more in line with what pgindent produces.
One tiny comment: I would have added an extra comment in the
generated unicode header to document the set of structures generated
for the perfect hash, but that's easy enough to add.
--
Michael