Re: Change GUC hashtable to use simplehash?

2024-04-06 Thread John Naylor
On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > > But given that we know the data length and we have it in a register > already, it's easy enough to just mask out data past the end with a > shift. See patch 1. Performance benefit is about 1.5x Measured on a > small test harness that just

Re: Change GUC hashtable to use simplehash?

2024-04-01 Thread John Naylor
On Tue, Mar 5, 2024 at 5:30 PM John Naylor wrote: > > On Tue, Jan 30, 2024 at 5:04 PM John Naylor wrote: > > > > On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > > > But given that we know the data length and we have it in a register > > > already, it's easy enough to just mask out data past

Re: Change GUC hashtable to use simplehash?

2024-03-30 Thread John Naylor
On Thu, Mar 28, 2024 at 12:37 PM Jeff Davis wrote: > > > v21-0003 adds a new file hashfn_unstable.c for convenience functions > > and converts all the duplicate frontend uses of hash_string_pointer. > > Why not make hash_string() inline, too? I'm fine with it either way, > I'm just curious why

Re: Change GUC hashtable to use simplehash?

2024-03-27 Thread Jeff Davis
On Wed, 2024-03-27 at 13:44 +0700, John Naylor wrote: > Thanks for the pointers! In v20-0001, I've drafted checking thes > spelling first, since pg_attribute_no_sanitize_alignment has a > similar > version check. Then it checks for no_sanitize_address using > __has_attribute, which goes back to

Re: Change GUC hashtable to use simplehash?

2024-03-27 Thread John Naylor
On Wed, Mar 20, 2024 at 11:01 PM Jeff Davis wrote: > > > I found that adding __attribute__((no_sanitize_address)) to > > fasthash_accum_cstring_aligned() passes CI. While this kind of > > exception is warned against (for good reason), I think it's fine here > > given that glibc and NetBSD, and

Re: Change GUC hashtable to use simplehash?

2024-03-20 Thread Jeff Davis
On Wed, 2024-03-20 at 14:26 +0700, John Naylor wrote: > This was the culprit. The search path cache didn't trigger this when > it went in, but it seems for frontend a read past the end of malloc > fails -fsantize=address. By the same token, I'm guessing the only > reason this didn't fail for

Re: Change GUC hashtable to use simplehash?

2024-03-20 Thread John Naylor
On Tue, Mar 5, 2024 at 5:30 PM John Naylor wrote: > > On Tue, Jan 30, 2024 at 5:04 PM John Naylor wrote: > > > > On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > > > But given that we know the data length and we have it in a register > > > already, it's easy enough to just mask out data past

Re: Change GUC hashtable to use simplehash?

2024-03-05 Thread John Naylor
On Tue, Jan 30, 2024 at 5:04 PM John Naylor wrote: > > On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > > But given that we know the data length and we have it in a register > > already, it's easy enough to just mask out data past the end with a > > shift. See patch 1. Performance benefit is

Re: Change GUC hashtable to use simplehash?

2024-02-07 Thread John Naylor
On Wed, Feb 7, 2024 at 10:41 PM Peter Eisentraut wrote: > > /tmp/cirrus-ci-build/src/include/common/hashfn_unstable.h: In function > ‘int fasthash_accum_cstring_unaligned(fasthash_state*, const char*)’: > /tmp/cirrus-ci-build/src/include/common/hashfn_unstable.h:201:20: > warning: comparison of

Re: Change GUC hashtable to use simplehash?

2024-02-07 Thread Peter Eisentraut
On 22.01.24 03:03, John Naylor wrote: I wrote: fasthash_init(, sizeof(Datum), kind); fasthash_accum(, (char *) , sizeof(Datum)); return fasthash_final32(, 0); It occurred to me that it's strange to have two places that length can be passed. That was a side effect of the original,

Re: Change GUC hashtable to use simplehash?

2024-02-06 Thread John Naylor
I wrote: > > It occurred to me that it's strange to have two places that length can > be passed. That was a side effect of the original, which used length > to both know how many bytes to read, and to modify the internal seed. > With the incremental API, it doesn't make sense to pass the length

Re: Change GUC hashtable to use simplehash?

2024-02-02 Thread John Naylor
On Tue, Jan 30, 2024 at 7:51 PM Ants Aasma wrote: > > It didn't calculate the same result because the if (mask) condition > was incorrect. Changed it to if (chunk & 0xFF) and removed the right > shift from the mask. Yes, you're quite right. > It seems to be half a nanosecond faster, but as I >

Re: Change GUC hashtable to use simplehash?

2024-01-30 Thread Ants Aasma
On Tue, 30 Jan 2024 at 12:04, John Naylor wrote: > > On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > > But given that we know the data length and we have it in a register > > already, it's easy enough to just mask out data past the end with a > > shift. See patch 1. Performance benefit is

Re: Change GUC hashtable to use simplehash?

2024-01-30 Thread John Naylor
On Tue, Jan 30, 2024 at 4:13 AM Ants Aasma wrote: > But given that we know the data length and we have it in a register > already, it's easy enough to just mask out data past the end with a > shift. See patch 1. Performance benefit is about 1.5x Measured on a > small test harness that just hashes

Re: Change GUC hashtable to use simplehash?

2024-01-29 Thread Ants Aasma
On Sun, 21 Jan 2024 at 03:06, Jeff Davis wrote: > Yes, thank you. I don't think we need to change the algorithm. Jumping in here at a random point just to share my findings from poking around this on and off. I am concentrating here on cstring hashing as that is the most complicated one. One

Re: Change GUC hashtable to use simplehash?

2024-01-21 Thread Jeff Davis
On Mon, 2024-01-22 at 09:03 +0700, John Naylor wrote: > v15-0004 is a stab at that. As an idea, it also renames zero_bytes_le > to zero_byte_low to reflect the effect better. There might be some > other comment edits needed to explain usage, so I plan to hold on to > this for later. Let me know

Re: Change GUC hashtable to use simplehash?

2024-01-21 Thread John Naylor
I wrote: > fasthash_init(, sizeof(Datum), kind); > fasthash_accum(, (char *) , sizeof(Datum)); > return fasthash_final32(, 0); It occurred to me that it's strange to have two places that length can be passed. That was a side effect of the original, which used length to both know how many

Re: Change GUC hashtable to use simplehash?

2024-01-20 Thread Jeff Davis
On Sat, 2024-01-20 at 13:48 +0700, John Naylor wrote: > The above identity is not true for this haszero64 macro. I see. > I hope this makes it more clear. Maybe the comment could use some > work. Yes, thank you. I don't think we need to change the algorithm. After having stepped away from this

Re: Change GUC hashtable to use simplehash?

2024-01-19 Thread John Naylor
On Sat, Jan 20, 2024 at 7:13 AM Jeff Davis wrote: > > On Fri, 2024-01-19 at 13:38 -0800, Jeff Davis wrote: > > One post-commit question on 0aba255440: why do > > haszero64(pg_bswap64(chunk)) rather than just haszero64(chunk)? How > > does byteswapping affect whether a zero byte exists or not? > >

Re: Change GUC hashtable to use simplehash?

2024-01-19 Thread John Naylor
On Fri, Jan 19, 2024 at 11:54 PM Heikki Linnakangas wrote: > Thanks! I started to look at how to use this, and I have some questions. > I'd like to replace this murmurhash ussage in resowner.c with this: > > > /* > >* Most resource kinds store a pointer in 'value', and pointers are

Re: Change GUC hashtable to use simplehash?

2024-01-19 Thread Jeff Davis
On Fri, 2024-01-19 at 13:38 -0800, Jeff Davis wrote: > One post-commit question on 0aba255440: why do > haszero64(pg_bswap64(chunk)) rather than just haszero64(chunk)? How > does byteswapping affect whether a zero byte exists or not? I missed that it was used later when finding the rightmost one

Re: Change GUC hashtable to use simplehash?

2024-01-19 Thread Jeff Davis
On Fri, 2024-01-19 at 14:27 +0700, John Naylor wrote: > Pushed that way, thanks! Thank you. One post-commit question on 0aba255440: why do haszero64(pg_bswap64(chunk)) rather than just haszero64(chunk)? How does byteswapping affect whether a zero byte exists or not? Regards, Jeff Davis

Re: Change GUC hashtable to use simplehash?

2024-01-19 Thread Heikki Linnakangas
On 19/01/2024 09:27, John Naylor wrote: Pushed that way, thanks! After fixing another typo in big endian builds, an s390x member reported green, so I think that aspect is working now. I'll come back to follow-up topics shortly. Thanks! I started to look at how to use this, and I have some

Re: Change GUC hashtable to use simplehash?

2024-01-18 Thread John Naylor
On Wed, Jan 17, 2024 at 9:54 PM Heikki Linnakangas wrote: > Maybe something like: > > " > Building blocks for creating fast inlineable hash functions. The > functions in this file are not guaranteed to be stable between versions, > and may differ by hardware platform. Hence they must not be used

Re: Change GUC hashtable to use simplehash?

2024-01-17 Thread Heikki Linnakangas
On 17/01/2024 09:15, John Naylor wrote: /* * hashfn_unstable.h * * Building blocks for creating fast inlineable hash functions. The * unstable designation is in contrast to hashfn.h, which cannot break * compatibility because hashes can be written to disk and so must produce * the same

Re: Change GUC hashtable to use simplehash?

2024-01-16 Thread John Naylor
I spent some time rewriting the comments and a couple other cosmetic changes, and squashed into two patches: the second one has the optimized string hashing. They each have still just one demo use case. It looks pretty close to commitable, but I'll leave this up for a few days in case anyone wants

Re: Change GUC hashtable to use simplehash?

2024-01-08 Thread John Naylor
On Mon, Jan 8, 2024 at 2:24 PM Junwang Zhao wrote: > > + * Portions Copyright (c) 2018-2023, PostgreSQL Global Development Group > > A kind reminder, it's already 2024 :) > > I'm also curious why the 2018, is there any convention for that? The convention I followed was "blind copy-paste", but

Re: Change GUC hashtable to use simplehash?

2024-01-07 Thread Junwang Zhao
Hi John, On Mon, Jan 8, 2024 at 10:44 AM John Naylor wrote: > > On Sat, Jan 6, 2024 at 9:01 AM jian he wrote: > > > > latency average = 147.782 ms > > select * from bench_cstring_hash_unaligned(10); > > latency average = 101.179 ms > > select * from bench_cstring_hash_aligned(10); > >

Re: Change GUC hashtable to use simplehash?

2024-01-07 Thread John Naylor
On Sat, Jan 6, 2024 at 9:01 AM jian he wrote: > > latency average = 147.782 ms > select * from bench_cstring_hash_unaligned(10); > latency average = 101.179 ms > select * from bench_cstring_hash_aligned(10); > latency average = 101.219 ms Thanks for testing again! This looks closer to my

Re: Change GUC hashtable to use simplehash?

2024-01-05 Thread jian he
On Sat, Jan 6, 2024 at 9:04 AM John Naylor wrote: > > On Fri, Jan 5, 2024 at 6:58 PM jian he wrote: > > -Dcassert=true \ > > > -Dbuildtype=debug \ > > These probably don't matter much for this test, but these should be > off for any performance testing. > > >

Re: Change GUC hashtable to use simplehash?

2024-01-05 Thread John Naylor
On Fri, Jan 5, 2024 at 6:58 PM jian he wrote: > -Dcassert=true \ > -Dbuildtype=debug \ These probably don't matter much for this test, but these should be off for any performance testing. > -DWRITE_READ_PARSE_PLAN_TREES > -DCOPY_PARSE_PLAN_TREES

Re: Change GUC hashtable to use simplehash?

2024-01-05 Thread jian he
On Fri, Jan 5, 2024 at 6:54 PM John Naylor wrote: > > On Thu, Jan 4, 2024 at 10:01 AM jian he wrote: > > > > I still cannot git apply your patch cleanly. in > > I don't know why you're using that -- the git apply man page even says > > "Use git-am(1) to create commits from patches generated by >

Re: Change GUC hashtable to use simplehash?

2024-01-05 Thread John Naylor
On Thu, Jan 4, 2024 at 10:01 AM jian he wrote: > > I still cannot git apply your patch cleanly. in I don't know why you're using that -- the git apply man page even says "Use git-am(1) to create commits from patches generated by git-format-patch(1) and/or received by email." Or, if that fails,

Re: Change GUC hashtable to use simplehash?

2024-01-03 Thread jian he
On Wed, Jan 3, 2024 at 10:12 PM John Naylor wrote: > > On Tue, Jan 2, 2024 at 6:56 AM jian he wrote: > > > > My local computer is slow. but here is the test results: > > > > select * from bench_cstring_hash_aligned(10);7318.893 ms > > select * from

Re: Change GUC hashtable to use simplehash?

2024-01-03 Thread John Naylor
On Tue, Jan 2, 2024 at 6:56 AM jian he wrote: > > My local computer is slow. but here is the test results: > > select * from bench_cstring_hash_aligned(10);7318.893 ms > select * from bench_cstring_hash_unaligned(10);10383.173 ms > select * from bench_pgstat_hash(10);

Re: Change GUC hashtable to use simplehash?

2024-01-01 Thread jian he
On Tue, Dec 26, 2023 at 4:01 PM John Naylor wrote: > > 0001-0003 are same as earlier > 0004 takes Jeff's idea and adds in an optimization from NetBSD's > strlen (I said OpenBSD earlier, but it goes back further). I added > stub code to simulate big-endian when requested at compile time, but a >

Re: Change GUC hashtable to use simplehash?

2023-12-26 Thread John Naylor
On Wed, Dec 20, 2023 at 1:48 PM John Naylor wrote: > > On Wed, Dec 20, 2023 at 3:23 AM Jeff Davis wrote: > > > > The reason I looked here is that the inner while statement (to find the > > chunk size) looked out of place and possibly slow, and there's a > > bitwise trick we can use instead. > >

Re: Change GUC hashtable to use simplehash?

2023-12-19 Thread John Naylor
On Wed, Dec 20, 2023 at 3:23 AM Jeff Davis wrote: > > On Tue, 2023-12-19 at 16:23 +0700, John Naylor wrote: > > That wasn't the next place I thought to look (that would be the > > strcmp > > call), but something like this could be worthwhile. > > The reason I looked here is that the inner while

Re: Change GUC hashtable to use simplehash?

2023-12-19 Thread Jeff Davis
On Tue, 2023-12-19 at 16:23 +0700, John Naylor wrote: > That wasn't the next place I thought to look (that would be the > strcmp > call), but something like this could be worthwhile. The reason I looked here is that the inner while statement (to find the chunk size) looked out of place and

Re: Change GUC hashtable to use simplehash?

2023-12-19 Thread John Naylor
On Tue, Dec 19, 2023 at 2:32 PM Jeff Davis wrote: > > On Mon, 2023-12-18 at 13:39 +0700, John Naylor wrote: > > For now just two: > > v10-0002 is Jeff's change to the search path cache, but with the > > chunked interface that I found to be faster. > > Did you consider specializing for the case of

Re: Change GUC hashtable to use simplehash?

2023-12-18 Thread Jeff Davis
On Mon, 2023-12-18 at 13:39 +0700, John Naylor wrote: > For now just two: > v10-0002 is Jeff's change to the search path cache, but with the > chunked interface that I found to be faster. Did you consider specializing for the case of an aligned pointer? If it's a string (c string or byte string)

Re: Change GUC hashtable to use simplehash?

2023-12-17 Thread John Naylor
I wrote: > Updated next steps: > * Add some desperately needed explanatory comments. There is a draft of this in v10-0001. I also removed the validation scaffolding and ran pgindent. This could use some review and/or bikeshedding, in particular on the name hashfn_unstable.h. I also considered

Re: Change GUC hashtable to use simplehash?

2023-12-14 Thread John Naylor
I wrote: > > * v8 with chunked interface: > latency average = 555.688 ms > > This starts to improve things for me. > > * v8 with chunked, and return lower 32 bits of full 64-bit hash: > latency average = 556.324 ms > > This is within the noise level. There doesn't seem to be much downside > of

Re: Change GUC hashtable to use simplehash?

2023-12-11 Thread John Naylor
On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis wrote: > > On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote: > > > I tested using the new hash function APIs for my search path cache, > > > and > > > there's a significant speedup for cases not benefiting from > > > a86c61c9ee. > > > It's enough

Re: Change GUC hashtable to use simplehash?

2023-12-10 Thread John Naylor
I wrote: > On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis wrote: > > > > On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote: > > > > I tested using the new hash function APIs for my search path cache, > > > > and > > > > there's a significant speedup for cases not benefiting from > > > >

Re: Change GUC hashtable to use simplehash?

2023-12-09 Thread John Naylor
On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis wrote: > > On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote: > > > I tested using the new hash function APIs for my search path cache, > > > and > > > there's a significant speedup for cases not benefiting from > > > a86c61c9ee. > > > It's enough

Re: Change GUC hashtable to use simplehash?

2023-12-09 Thread Jeff Davis
On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote: > > I tested using the new hash function APIs for my search path cache, > > and > > there's a significant speedup for cases not benefiting from > > a86c61c9ee. > > It's enough that we almost don't need a86c61c9ee. So a definite +1 > > to > >

Re: Change GUC hashtable to use simplehash?

2023-12-09 Thread John Naylor
On Sat, Dec 9, 2023 at 3:32 AM Jeff Davis wrote: > > On Wed, 2023-11-29 at 20:31 +0700, John Naylor wrote: > > Attached is a rough start with Andres's earlier ideas, to get > > something concrete out there. > > The implementation of string hash in 0004 forgot to increment 'buf'. Yeah, that was

Re: Change GUC hashtable to use simplehash?

2023-12-08 Thread Jeff Davis
I committed 867dd2dc87, which means my use case for a fast GUC hash table (quickly setting proconfigs) is now solved. Andres mentioned that it could still be useful to reduce overhead in a few other places: https://postgr.es/m/20231117220830.t6sb7di6h6am4...@awork3.anarazel.de How should we

Re: Change GUC hashtable to use simplehash?

2023-12-08 Thread Jeff Davis
On Wed, 2023-11-29 at 20:31 +0700, John Naylor wrote: > Attached is a rough start with Andres's earlier ideas, to get > something concrete out there. The implementation of string hash in 0004 forgot to increment 'buf'. I tested using the new hash function APIs for my search path cache, and

Re: Change GUC hashtable to use simplehash?

2023-12-06 Thread John Naylor
On Wed, Dec 6, 2023 at 11:48 PM Jeff Davis wrote: > > On Wed, 2023-12-06 at 07:39 +0700, John Naylor wrote: > > "git grep cstring_hash" found nothing, so not sure what you're > > asking. > > Sorry, I meant string_hash(). Your v5-0002 changes the way hashing > works for cstrings, and that means

Re: Change GUC hashtable to use simplehash?

2023-12-06 Thread Jeff Davis
On Wed, 2023-12-06 at 07:39 +0700, John Naylor wrote: > "git grep cstring_hash" found nothing, so not sure what you're > asking. Sorry, I meant string_hash(). Your v5-0002 changes the way hashing works for cstrings, and that means it's no longer equivalent to hash_bytes with strlen. That's

Re: Change GUC hashtable to use simplehash?

2023-12-05 Thread John Naylor
On Tue, Dec 5, 2023 at 1:57 AM Jeff Davis wrote: > > On Mon, 2023-12-04 at 12:12 +0700, John Naylor wrote: > There's already a patch to use simplehash, and the API is a bit > cleaner, and there's a minor performance improvement. It seems fairly > non-controversial -- should I just proceed with

Re: Change GUC hashtable to use simplehash?

2023-12-04 Thread Jeff Davis
On Mon, 2023-12-04 at 12:12 +0700, John Naylor wrote: > That's a good thing to clear up. This thread has taken simplehash as > a > starting point from the very beginning. It initially showed no > improvement, and then we identified problems with the hashing and > equality computations. The latter

Re: Change GUC hashtable to use simplehash?

2023-12-03 Thread John Naylor
On Mon, Dec 4, 2023 at 4:16 AM Jeff Davis wrote: > I'm trying to follow the distinctions you're making between dynahash > and simplehash -- are you saying it's easier to do incremental hashing > with dynahash, and if so, why? That's a good thing to clear up. This thread has taken simplehash as a

Re: Change GUC hashtable to use simplehash?

2023-12-03 Thread Jeff Davis
On Wed, 2023-11-29 at 20:31 +0700, John Naylor wrote: > v5-0001 puts fash-hash as-is into a new header, named in a way to > convey in-memory use e.g. hash tables. > > v5-0002 does the minimal to allow dynash to use this for string_hash, > inlined but still calling strlen. > > v5-0003 shows one

Re: Change GUC hashtable to use simplehash?

2023-12-02 Thread John Naylor
On Wed, Nov 29, 2023 at 8:31 PM John Naylor wrote: > > Attached is a rough start with Andres's earlier ideas, to get > something concrete out there. While looking at the assembly out of curiosity, I found a couple bugs in the split API that I've fixed locally. I think the path forward is: -

Re: Change GUC hashtable to use simplehash?

2023-11-29 Thread John Naylor
On Wed, Nov 29, 2023 at 9:59 PM Heikki Linnakangas wrote: > > I didn't understand what you meant by the above. Did you wack around > fast-hash, or who did? I turned it into an init/accum/final style (shouldn't affect the result), and took out the input length from the calculation (will affect

Re: Change GUC hashtable to use simplehash?

2023-11-29 Thread Heikki Linnakangas
On 29/11/2023 15:31, John Naylor wrote: However, I did find a couple hash functions that are much simpler to adapt to a bytewise interface, pass SMHasher, and are decently fast on short inputs: - fast-hash, MIT licensed, and apparently has some use in software [1] - MX3, CC0 license (looking

Re: Change GUC hashtable to use simplehash?

2023-11-29 Thread John Naylor
Attached is a rough start with Andres's earlier ideas, to get something concrete out there. I took a look around at other implementations a bit. Many modern hash functions use MUM-style hashing, which typically uses 128-bit arithmetic. Even if they already have an incremental interface and have a

Re: Change GUC hashtable to use simplehash?

2023-11-23 Thread John Naylor
On Thu, Nov 23, 2023 at 5:34 AM Andres Freund wrote: > It's worth noting that the limited range of the input values means that > there's a lot of bias toward some bits being set ('a' to 'z' all start with > 0b011). We can take advantage of the limited range with a single additional instruction:

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread Andres Freund
Hi, On 2023-11-22 16:27:56 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2023-11-22 15:56:21 -0500, Tom Lane wrote: > >> GUC names are just about always short, though, so I'm not sure you've > >> made your point? > > > With short I meant <= 6 characters (32 / 5 = 6.x). After that you're

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread Tom Lane
Andres Freund writes: > On 2023-11-22 15:56:21 -0500, Tom Lane wrote: >> GUC names are just about always short, though, so I'm not sure you've >> made your point? > With short I meant <= 6 characters (32 / 5 = 6.x). After that you're > overwriting bits that you previously set, without dispersing

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread Andres Freund
Hi, On 2023-11-22 15:56:21 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2023-11-21 16:42:55 +0700, John Naylor wrote: > >> The strlen call required for hashbytes() is not free. The lack of > >> mixing in the (probably inlined after 0001) previous hash function can > >> remedied

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread Tom Lane
Andres Freund writes: > On 2023-11-21 16:42:55 +0700, John Naylor wrote: >> The strlen call required for hashbytes() is not free. The lack of >> mixing in the (probably inlined after 0001) previous hash function can >> remedied directly, as in the attached: > I doubt this is a good hashfunction.

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread Andres Freund
Hi, On 2023-11-21 16:42:55 +0700, John Naylor wrote: > I get a noticeable regression in 0002, though, and I think I see why: > > guc_name_hash(const char *name) > { > - uint32 result = 0; > + const unsigned char *bytes = (const unsigned char *)name; > + int blen =

Re: Change GUC hashtable to use simplehash?

2023-11-22 Thread John Naylor
I wrote: > Thinking some more, I'm not quite comfortable with the number of > places in these patches that have to know about the pre-downcased > strings, or whether we need that in the first place. If lower case is > common enough to optimize for, it seems the equality function can just > check

Re: Change GUC hashtable to use simplehash?

2023-11-21 Thread John Naylor
On Wed, Nov 22, 2023 at 12:00 AM Jeff Davis wrote: > > On Tue, 2023-11-21 at 16:42 +0700, John Naylor wrote: > > The strlen call required for hashbytes() is not free. > > Should we have a hash_string() that's like hash_bytes() but checks for > the NUL terminator itself? > > That wouldn't be

Re: Change GUC hashtable to use simplehash?

2023-11-21 Thread Jeff Davis
On Tue, 2023-11-21 at 16:42 +0700, John Naylor wrote: > The strlen call required for hashbytes() is not free. Should we have a hash_string() that's like hash_bytes() but checks for the NUL terminator itself? That wouldn't be inlinable, but it would save on the strlen() call. It might benefit

Re: Change GUC hashtable to use simplehash?

2023-11-21 Thread John Naylor
On Mon, Nov 20, 2023 at 5:54 AM Jeff Davis wrote: > > Attached are a bunch of tiny patches and some perf numbers based on > simple test described here: > > https://www.postgresql.org/message-id/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com I tried taking I/O out, like this,

Re: Change GUC hashtable to use simplehash?

2023-11-19 Thread Jeff Davis
Hi, On Fri, 2023-11-17 at 16:10 -0800, Andres Freund wrote: > > The requested name is already case-folded in most contexts. We can > > do a > > lookup first, and if that fails, case-fold and try again. I'll hack > > up > > a patch -- I believe that would be measurable for the proconfigs. > >

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Andres Freund
Hi, On 2023-11-17 16:01:31 -0800, Jeff Davis wrote: > On Fri, 2023-11-17 at 15:27 -0800, Andres Freund wrote: > > At > > first I thought that that's largely because you aren't using > > SH_STORE_HASH. > > I might want to use that in the search_path cache, then. The lookup > wasn't showing up much

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
On Fri, 2023-11-17 at 15:27 -0800, Andres Freund wrote: > At > first I thought that that's largely because you aren't using > SH_STORE_HASH. I might want to use that in the search_path cache, then. The lookup wasn't showing up much in the profile the last I checked, but I'll take a second look.

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Andres Freund
Hi, On 2023-11-17 14:08:56 -0800, Jeff Davis wrote: > On Fri, 2023-11-17 at 17:04 -0500, Tom Lane wrote: > > I can't imagine wanting to convert *every* hashtable in the system > > to simplehash; the added code bloat would be unreasonable.  So yeah, > > I think we'll have two mechanisms

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
On Fri, 2023-11-17 at 14:08 -0800, Andres Freund wrote: > I think this would be a completely fair thing to port over - whether > it's > worth it I don't quite know, but I'd not be against it on principle > or such. Right now I don't think it offers much. I'll see if I can solve the case-folding

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Andres Freund
Hi, On 2023-11-17 17:04:04 -0500, Tom Lane wrote: > Jeff Davis writes: > > On Fri, 2023-11-17 at 13:22 -0800, Gurjeet Singh wrote: > >> But your argument of a nicer API might make a case for the patch. > > > Yeah, that's what I was thinking. simplehash is newer and has a nicer > > API, so if we

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
On Fri, 2023-11-17 at 17:04 -0500, Tom Lane wrote: > I can't imagine wanting to convert *every* hashtable in the system > to simplehash; the added code bloat would be unreasonable.  So yeah, > I think we'll have two mechanisms indefinitely.  That's not to say > that we might not rewrite hsearch. 

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Andres Freund
Hi, On 2023-11-17 13:44:21 -0800, Jeff Davis wrote: > On Fri, 2023-11-17 at 13:22 -0800, Gurjeet Singh wrote: > > This is not a comment on the patch itself, but since GUC operations > > are not typically considered performance or space sensitive, I don't think that's quite right - we have a lot

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Tom Lane
Jeff Davis writes: > On Fri, 2023-11-17 at 13:22 -0800, Gurjeet Singh wrote: >> But your argument of a nicer API might make a case for the patch. > Yeah, that's what I was thinking. simplehash is newer and has a nicer > API, so if we like it and want to move more code over, this is one > step.

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
On Fri, 2023-11-17 at 13:22 -0800, Gurjeet Singh wrote: > This is not a comment on the patch itself, but since GUC operations > are not typically considered performance or space sensitive, A "SET search_path" clause on a CREATE FUNCTION is a case for better performance in guc.c, because it

Re: Change GUC hashtable to use simplehash?

2023-11-17 Thread Gurjeet Singh
On Fri, Nov 17, 2023 at 11:02 AM Jeff Davis wrote: > > I had briefly experimented changing the hash table in guc.c to use > simplehash. It didn't offer any measurable speedup, but the API is > slightly nicer. > > I thought I'd post the patch in case others thought this was a good > direction or

Change GUC hashtable to use simplehash?

2023-11-17 Thread Jeff Davis
I had briefly experimented changing the hash table in guc.c to use simplehash. It didn't offer any measurable speedup, but the API is slightly nicer. I thought I'd post the patch in case others thought this was a good direction or nice cleanup. -- Jeff Davis PostgreSQL Contributor Team - AWS